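python - How do I create a DataFrame of time intervals from the values in a time series?
Problem description
I have a CSV file containing a long list of blood glucose (BG) values with their timestamps. I am trying to build a DataFrame (called df2 at the end) of all intervals where BG < 3.5. I can create the initial df from these values, which gives: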
Timestamp Glucose
0 2020-02-24 17:45:23 4.7
1 2020-02-24 17:50:23 4.9
2 2020-02-24 17:55:22 4.9
3 2020-02-24 18:00:22 4.8
4 2020-02-24 18:05:21 4.7
... ... ...
2348 2020-03-03 19:25:38 4.8
2349 2020-03-03 19:30:38 4.7
2350 2020-03-03 19:35:38 4.7
2351 2020-03-03 19:40:38 4.5
2352 2020-03-03 19:45:38 4.2
2353 rows × 2 columns
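I then use the code below to try to generate the intervals. However, it only gives me intervals that last 5 minutes (the length of one reading). I think this is because I use index+1 to close off current_interval, whereas what I need is a loop that looks ahead from index+1 to index+len(time_series), but I can't work out how to do that. Any help is much appreciated. The code is below: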
from collections import namedtuple

import pandas as pd

THRESHOLD = 3.5

IntervalRow = namedtuple(
    'IntervalRow',
    ['start_time', 'start_bg', 'end_time', 'end_bg', 'lowest_bg']
)

def is_hypo(value):
    return value < THRESHOLD

def calculate_hypo_intervals(time_series):
    intervals = []
    current_interval = None
    for index in range(len(time_series)):
        if is_hypo(time_series['Glucose'][index]):
            if not current_interval:
                current_interval = IntervalRow(
                    start_time=time_series['Timestamp'][index],
                    start_bg=time_series['Glucose'][index],
                    end_time=None,
                    end_bg=None,
                    lowest_bg=time_series['Glucose'][index],
                )
            if index+1 < len(time_series) and current_interval.lowest_bg > time_series['Glucose'][index+1]:
                current_interval = IntervalRow(
                    start_time=current_interval.start_time,
                    start_bg=current_interval.start_bg,
                    end_time=None,
                    end_bg=None,
                    lowest_bg=time_series['Glucose'][index+1],
                )
            if index+1 < len(time_series) and not is_hypo(time_series['Glucose'][index+1]):
                intervals.append(
                    IntervalRow(
                        start_time=current_interval.start_time,
                        start_bg=current_interval.start_bg,
                        end_time=time_series['Timestamp'][index+1],
                        end_bg=time_series['Glucose'][index+1],
                        lowest_bg=current_interval.lowest_bg,
                    )
                )
            # I appreciate this bit is probably not very code savvy and is only there for the final data point.
            # Suggestions to merge it with the if block above are welcome. The reason I separated it was that if I
            # left it as before, where it read "if index == len(time_series) and not is_hypo", then either all
            # intervals have to end with a value that is still hypo or you get an IndexError.
            if index+1 == len(time_series):
                intervals.append(
                    IntervalRow(
                        start_time=current_interval.start_time,
                        start_bg=current_interval.start_bg,
                        end_time=time_series['Timestamp'][index],
                        end_bg=time_series['Glucose'][index],
                        lowest_bg=current_interval.lowest_bg
                    )
                )
            current_interval = None
    df2 = pd.DataFrame(intervals, columns=['Start Time', 'Start BG', 'End Time', 'End BG', 'Lowest BG'])
    return df2
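This gives me the output below, but it does not include, for example, a very low episode (BG < 3.5) that occurred earlier than the first interval. As you can see, every interval is only 5 minutes long (it ends at the next reading). Thanks!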
Start Time Start BG End Time End BG Lowest BG
0 2020-02-25 10:10:23 3.1 2020-02-25 10:15:24 3.6 3.1
1 2020-02-25 11:05:23 3.4 2020-02-25 11:10:23 3.7 3.4
2 2020-02-25 14:35:25 3.1 2020-02-25 14:40:25 3.5 3.1
3 2020-02-25 18:25:26 3.3 2020-02-25 18:30:26 3.9 3.3
4 2020-02-27 09:45:20 3.4 2020-02-27 09:50:20 3.6 3.4
5 2020-02-27 12:50:19 3.4 2020-02-27 12:55:19 3.6 3.4
6 2020-02-27 17:35:20 3.4 2020-02-27 17:40:19 3.6 3.4
7 2020-02-28 10:05:22 3.4 2020-02-28 10:10:22 3.5 3.4
8 2020-02-28 18:35:23 3.4 2020-02-28 18:40:24 3.6 3.4
9 2020-02-29 11:15:26 3.4 2020-02-29 11:20:26 3.5 3.4
10 2020-02-29 16:15:27 3.4 2020-02-29 16:20:27 3.5 3.4
11 2020-02-29 21:10:28 3.4 2020-02-29 21:15:27 3.5 3.4
12 2020-03-01 13:55:31 3.4 2020-03-01 14:00:30 3.6 3.4
13 2020-03-01 17:45:29 3.4 2020-03-01 17:50:31 3.5 3.4
14 2020-03-02 12:45:34 3.3 2020-03-02 12:50:34 3.6 3.3
15 2020-03-02 16:30:34 3.4 2020-03-02 16:35:34 3.5 3.4
16 2020-03-03 17:50:38 3.4 2020-03-03 17:55:38 3.5 3.4
Solution
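No answer was recorded for this question, so what follows is only a minimal sketch of one possible fix, not the original poster's or any accepted solution. It assumes the intended behaviour is: an interval opens at the first reading below the threshold, stays open across every consecutive low reading, and closes at the first reading back at or above the threshold (or at the last reading if the data ends while still low). The key change is that the open interval is kept between loop iterations instead of being discarded after each low reading, so no look-ahead with index+1 is needed. The `demo` frame and its values are made up for illustration.

```python
import pandas as pd

THRESHOLD = 3.5  # same cutoff as in the question


def calculate_hypo_intervals(time_series, threshold=THRESHOLD):
    """Return one row per run of consecutive readings below `threshold`."""
    columns = ['Start Time', 'Start BG', 'End Time', 'End BG', 'Lowest BG']
    rows = []
    current = None              # the episode that is currently open, if any
    last_time = last_bg = None  # most recent reading seen so far

    for time, bg in zip(time_series['Timestamp'], time_series['Glucose']):
        if bg < threshold:
            if current is None:
                # First low reading: open a new episode.
                current = {'Start Time': time, 'Start BG': bg, 'Lowest BG': bg}
            else:
                # Still low: keep the episode open and track the minimum.
                current['Lowest BG'] = min(current['Lowest BG'], bg)
        elif current is not None:
            # First reading back at or above the threshold closes the episode.
            current['End Time'], current['End BG'] = time, bg
            rows.append(current)
            current = None
        last_time, last_bg = time, bg

    if current is not None:
        # The data ended while still low: close with the final reading.
        current['End Time'], current['End BG'] = last_time, last_bg
        rows.append(current)

    return pd.DataFrame(rows, columns=columns)


# Made-up sample: one three-reading episode, then one that runs to the end.
demo = pd.DataFrame({
    'Timestamp': pd.date_range('2020-02-25 10:00', periods=8, freq='5min'),
    'Glucose': [4.0, 3.4, 3.1, 3.3, 3.6, 4.0, 3.2, 3.0],
})
result = calculate_hypo_intervals(demo)
print(result)
```

Iterating over the two columns with zip avoids positional lookups such as time_series['Glucose'][index+1], so the end-of-data case no longer needs a separate bounds check inside the loop. Whether an interval should end at the recovery reading or at the last low reading is a design choice; the sketch follows the question's output, where End BG is the first value at or above 3.5.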