python - Pandas Dataframe:在 12 小时内删除带有 DateTime 的行
问题描述
我有这个数据框:
对于每个 IMEI,我想检查是否DATETIME OF LVD
在 12 小时内发生任何后续。如果他们这样做,他们需要被删除。
例如,在此 df 行中,需要删除 1、6、13、14、15 行。
IMEI DATETIME OF LVD
0 864811031001402 2018-10-04 23:50:00
1 864811031001402 2018-10-05 04:35:00
2 864811031001402 2018-10-15 03:40:00
3 864811031001402 2018-10-21 04:25:00
4 866710038341548 2018-10-27 05:53:00
5 864811031092336 2018-10-17 18:10:00
6 864811031092336 2018-10-17 18:41:00
7 864811031092336 2018-10-21 04:50:00
8 864811031092336 2018-10-23 03:21:00
9 864811031092336 2018-10-24 03:00:00
10 864811031009041 2018-10-13 21:52:00
11 864811031009041 2018-10-27 11:13:00
12 864811031015584 2018-10-27 00:48:00
13 864811031015584 2018-10-28 05:25:00
14 864811031015584 2018-10-28 05:26:00
15 864811031015584 2018-10-28 05:27:00
我可以获得每条记录的增量时间差(如下),但是如何为每个 IMEI 组做呢?
df['Delta'] = pd.to_datetime(df['DATETIME OF LVD']).diff()
IMEI DATETIME OF LVD Delta
0 864811031001402 2018-10-04 23:50:00 NaT
1 864811031001402 2018-10-05 04:35:00 0 days 04:45:00
2 864811031001402 2018-10-15 03:40:00 9 days 23:05:00
3 864811031001402 2018-10-21 04:25:00 6 days 00:45:00
4 866710038341548 2018-10-27 05:53:00 6 days 01:28:00
5 864811031092336 2018-10-17 18:10:00 -10 days +12:17:00
6 864811031092336 2018-10-17 18:41:00 0 days 00:31:00
7 864811031092336 2018-10-21 04:50:00 3 days 10:09:00
8 864811031092336 2018-10-23 03:21:00 1 days 22:31:00
9 864811031092336 2018-10-24 03:00:00 0 days 23:39:00
10 864811031009041 2018-10-13 21:52:00 -11 days +18:52:00
11 864811031009041 2018-10-27 11:13:00 13 days 13:21:00
12 864811031015584 2018-10-27 00:48:00 -1 days +13:35:00
13 864811031015584 2018-10-28 05:25:00 1 days 04:37:00
14 864811031015584 2018-10-28 05:26:00 0 days 00:01:00
15 864811031015584 2018-10-28 05:27:00 0 days 00:01:00
解决方案
通过 2 个布尔掩码使用DataFrameGroupBy.diff
和过滤- 与按位boolean indexing
检查缺失行的 Timedelta 链式比较:|
OR
df['DATETIME OF LVD'] = pd.to_datetime(df['DATETIME OF LVD'])
s = df.groupby('IMEI')['DATETIME OF LVD'].diff()
df = df[(s > pd.Timedelta('12 hour')) | s.isna()]
print (df)
IMEI DATETIME OF LVD
0 864811031001402 2018-10-04 23:50:00
2 864811031001402 2018-10-15 03:40:00
3 864811031001402 2018-10-21 04:25:00
4 866710038341548 2018-10-27 05:53:00
5 864811031092336 2018-10-17 18:10:00
7 864811031092336 2018-10-21 04:50:00
8 864811031092336 2018-10-23 03:21:00
9 864811031092336 2018-10-24 03:00:00
10 864811031009041 2018-10-13 21:52:00
11 864811031009041 2018-10-27 11:13:00
12 864811031015584 2018-10-27 00:48:00
13 864811031015584 2018-10-28 05:25:00
详情:
print (s)
0 NaT
1 0 days 04:45:00
2 9 days 23:05:00
3 6 days 00:45:00
4 NaT
5 NaT
6 0 days 00:31:00
7 3 days 10:09:00
8 1 days 22:31:00
9 0 days 23:39:00
10 NaT
11 13 days 13:21:00
12 NaT
13 1 days 04:37:00
14 0 days 00:01:00
15 0 days 00:01:00
Name: DATETIME OF LVD, dtype: timedelta64[ns]
推荐阅读
- ml.net - 边界框问题 - 使用 ML.Net 进行预测
- csv - 用于将 txt 转换为 CSV 的批处理文件删除空格和逗号转换为列
- html - 如何在悬停时实现这种“下划线”效果,用完了选项
- amazon-web-services - AWS Redshift Data API 为 executeStatement 使用了错误的端口
- ios - 播放歌曲时播放/暂停按钮不更新
- javascript - 连接到我的商店时收到错误消息!*第一次尝试实现 Redux*
- vue.js - 让文本框在更改时变为红色
- javascript - 将全局附加到 Vue 实例
- node.js - 在 Firebase Admin 中验证无密码电子邮件注册
- intellij-idea - 如何在 IntelliJ 中过滤 UML 图以仅显示特定类及其传递依赖项?