python - Calculate time difference by column in Pandas
Problem description
I have a column df['Status'] that contains a few object values:
In: df.Status.unique()
Out: array([nan, 'Open', 'Plmt', 'SHRT', 'Check'], dtype=object)
The column:
In: df['Status']
Out: time Status
2016-01-15 08:55:00 Open
2016-01-15 09:00:00 Plmt
2016-01-15 09:05:00 Plmt
2016-01-15 09:10:00 Plmt
2016-01-15 09:15:00 Plmt
2016-01-15 09:20:00 Plmt
2016-01-15 09:25:00 Plmt
2016-01-15 09:30:00 Plmt
2016-01-15 09:35:00 Plmt
2016-01-15 09:40:00 SHRT
where time is set as the index:
df.index = df['time']
df.index = pd.to_datetime(df.index)
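I want to skip the values I don't need ('Plmt', 'Check', NaN) and create a new column df['Diff'] that holds the difference, in minutes, between each 'Open' and the following 'SHRT'.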
I tried this:
df['Status'][df['Status'] == 'SHRT'] - df['Status'][df['Status'] == 'Open']
But the output contains only NaN values:
time
2016-01-15 08:55:00 NaN
2016-01-15 09:40:00 NaN
2016-01-18 08:30:00 NaN
2016-01-19 14:30:00 NaN
2016-01-19 14:35:00 NaN
2016-01-20 11:10:00 NaN
2016-01-20 11:45:00 NaN
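A likely explanation, not stated in the original post: the attempt subtracts the Status strings rather than the timestamps, and pandas aligns two Series on their index labels before any arithmetic. Because the 'Open' rows and the 'SHRT' rows never share a timestamp label, every aligned pair has a missing side. The sketch below reproduces the alignment effect with the timestamps themselves; the two-row frame and the variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame; column name mirrors the question, data is illustrative.
df = pd.DataFrame(
    {"Status": ["Open", "SHRT"]},
    index=pd.to_datetime(["2016-01-15 08:55:00", "2016-01-15 09:40:00"]),
)
times = df.index.to_series()

opened = times[df["Status"] == "Open"]   # labelled 08:55
closed = times[df["Status"] == "SHRT"]   # labelled 09:40

# The two Series share no index label, so alignment leaves nothing to subtract.
aligned = closed - opened
print(aligned)

# Dropping the labels first makes the subtraction positional instead.
by_position = closed.to_numpy() - opened.to_numpy()
minutes = by_position / np.timedelta64(1, "m")
print(minutes)
```

Positional subtraction only works when every 'Open' has exactly one matching 'SHRT' in the same order, which is why the answer below pairs the rows explicitly.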
The expected output should look like this:
time Status Diff
2016-01-15 08:55:00 Open NaN
2016-01-15 09:40:00 SHRT 00:45:00
2016-02-15 10:00:00 Open NaN
2016-02-15 14:15:00 SHRT 04:15:00
How can I get the time difference? Can anyone help?
Solution
Use:
#sample data changed to cover more cases
print (df)
time Status
0 2016-01-15 08:55:00 Open
1 2016-01-15 09:00:00 Plmt
2 2016-01-15 09:05:00 SHRT
3 2016-01-15 09:10:00 Plmt
4 2016-01-15 09:15:00 Open
5 2016-01-15 09:20:00 Plmt
6 2016-01-15 09:25:00 SHRT
7 2016-01-15 09:30:00 SHRT
8 2016-01-15 09:35:00 Plmt
9 2016-01-15 09:40:00 SHRT
#filter only Open and SHRT
df1 = df[df['Status'].isin(['Open','SHRT'])].copy()
#convert column to datetimes
df1['time'] = pd.to_datetime(df1['time'])
print (df1)
time Status
0 2016-01-15 08:55:00 Open
2 2016-01-15 09:05:00 SHRT
4 2016-01-15 09:15:00 Open
6 2016-01-15 09:25:00 SHRT
7 2016-01-15 09:30:00 SHRT
9 2016-01-15 09:40:00 SHRT
#filter only rows with Open and next row SHRT
m1 = (df1['Status'] == 'Open') & (df1['Status'].shift(-1) == 'SHRT')
m2 = (df1['Status'].shift() == 'Open') & (df1['Status'] == 'SHRT')
df2 = df1[m1 | m2].copy()
#create difference column and set NaT by condition
df2['Diff'] = df2['time'].diff().mask(df2['Status'] == 'Open')
print (df2)
time Status Diff
0 2016-01-15 08:55:00 Open NaT
2 2016-01-15 09:05:00 SHRT 00:10:00
4 2016-01-15 09:15:00 Open NaT
6 2016-01-15 09:25:00 SHRT 00:10:00
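The steps above can be collected into one runnable sketch. It rebuilds the sample data from the answer and adds a numeric column, since the question asks for the difference in minutes. The function name open_to_shrt_diff and the column name Diff_min are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({
    "time": [
        "2016-01-15 08:55:00", "2016-01-15 09:00:00", "2016-01-15 09:05:00",
        "2016-01-15 09:10:00", "2016-01-15 09:15:00", "2016-01-15 09:20:00",
        "2016-01-15 09:25:00", "2016-01-15 09:30:00", "2016-01-15 09:35:00",
        "2016-01-15 09:40:00",
    ],
    "Status": ["Open", "Plmt", "SHRT", "Plmt", "Open",
               "Plmt", "SHRT", "SHRT", "Plmt", "SHRT"],
})


def open_to_shrt_diff(frame):
    """Keep each 'Open' row and the 'SHRT' row right after it, with their gap."""
    # Drop every status except the two of interest.
    kept = frame[frame["Status"].isin(["Open", "SHRT"])].copy()
    kept["time"] = pd.to_datetime(kept["time"])

    is_open = kept["Status"] == "Open"
    is_shrt = kept["Status"] == "SHRT"
    # An 'Open' counts only if the next kept row is 'SHRT', and vice versa.
    starts = is_open & is_shrt.shift(-1, fill_value=False)
    ends = is_shrt & is_open.shift(fill_value=False)
    pairs = kept[starts | ends].copy()

    # diff() compares each row with the previous one; blank it on 'Open' rows.
    pairs["Diff"] = pairs["time"].diff().mask(pairs["Status"] == "Open")
    # Hypothetical extra column: the same gap as a float number of minutes.
    pairs["Diff_min"] = pairs["Diff"].dt.total_seconds() / 60
    return pairs


result = open_to_shrt_diff(df)
print(result)
```

Rows 7 and 9 are dropped because those 'SHRT' entries are not immediately preceded by an 'Open' once the other statuses have been filtered out.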