Python: How do I subtract timestamps within a column and create a new TimeElapsed column?

Question

I have a dataframe with a few columns that looks like this:

ContextID Time_ms
    1   09:12:48.502
    1   09:12:48.603
    1   09:12:48.934
    2   09:15:36.434
    2   09:15:36.654
    3   09:17:55.940
    3   09:17:56.160
    3   09:17:57.267

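What I want to do is create a new column called TimeElapsed (preferably containing values in milliseconds), computed for each ContextID. It must contain values like these: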

ContextID   Time_ms Time_Elapsed
1   09:12:48.502    0
1   09:12:48.603    09:12:48.603 - 09:12:48.502
1   09:12:48.934    09:12:48.934 - 09:12:48.502
2   09:15:36.434    0
2   09:15:36.654    09:15:36.654 - 09:15:36.434
3   09:17:55.940    0
3   09:17:56.160    09:17:56.160 - 09:17:55.940
3   09:17:57.267    09:17:57.267 - 09:17:55.940

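The first Time_ms value of each ContextID must be 0 secs. After that, the first Time_ms value of the group must be subtracted from the second Time_ms value, and so on, with the differences filling the Time_Elapsed column.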

I would like to know how to do this in Python using pandas.

Thanks

Tags: python, python-3.x, pandas

Solution


Subtract the result of groupby + transform

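# Uncomment if Time_ms still holds strings rather than timedeltas:
#df['Time_ms'] = pd.to_timedelta(df.Time_ms)
# Subtract each group's first Time_ms from every row of that group.
df['Time_Elapsed'] = df.Time_ms - df.groupby('ContextID').Time_ms.transform('first')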

   ContextID         Time_ms    Time_Elapsed
0          1 09:12:48.502000        00:00:00
1          1 09:12:48.603000 00:00:00.101000
2          1 09:12:48.934000 00:00:00.432000
3          2 09:15:36.434000        00:00:00
4          2 09:15:36.654000 00:00:00.220000
5          3 09:17:55.940000        00:00:00
6          3 09:17:56.160000 00:00:00.220000
7          3 09:17:57.267000 00:00:01.327000
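
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question, assumes Time_ms starts out as strings, and adds an integer milliseconds column, since the question prefers milliseconds. The column name Elapsed_ms is only an illustrative choice and does not appear in the question.

```python
import pandas as pd

# Sample data copied from the question; Time_ms is assumed to start as strings.
df = pd.DataFrame({
    "ContextID": [1, 1, 1, 2, 2, 3, 3, 3],
    "Time_ms": [
        "09:12:48.502", "09:12:48.603", "09:12:48.934",
        "09:15:36.434", "09:15:36.654",
        "09:17:55.940", "09:17:56.160", "09:17:57.267",
    ],
})

# Parse the strings into timedeltas so they can be subtracted.
df["Time_ms"] = pd.to_timedelta(df["Time_ms"])

# Elapsed time since the first row of each ContextID group.
first = df.groupby("ContextID")["Time_ms"].transform("first")
df["Time_Elapsed"] = df["Time_ms"] - first

# Hypothetical extra column: elapsed time as whole milliseconds.
df["Elapsed_ms"] = df["Time_Elapsed"] // pd.Timedelta(milliseconds=1)

print(df)
```

Floor-dividing by a one-millisecond Timedelta yields plain integers, which are usually easier to aggregate or export than timedelta values.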

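transform broadcasts the groupby result back to the shape of the original DataFrame. In this case we need the first value of each group, so we can perform a single subtraction: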

df.groupby('ContextID').Time_ms.transform('first')

#0   09:12:48.502000
#1   09:12:48.502000
#2   09:12:48.502000
#3   09:15:36.434000
#4   09:15:36.434000
#5   09:17:55.940000
#6   09:17:55.940000
#7   09:17:55.940000
#Name: Time_ms, dtype: timedelta64[ns]
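
To make the broadcasting point concrete, the sketch below contrasts a plain per-group first() with transform('first'), using the first five rows of the question's data. The variable names per_group and per_row are illustrative only.

```python
import pandas as pd

# First five rows of the question's data, already parsed as timedeltas.
df = pd.DataFrame({
    "ContextID": [1, 1, 1, 2, 2],
    "Time_ms": pd.to_timedelta([
        "09:12:48.502", "09:12:48.603", "09:12:48.934",
        "09:15:36.434", "09:15:36.654",
    ]),
})

grouped = df.groupby("ContextID")["Time_ms"]

# One value per group, indexed by ContextID.
per_group = grouped.first()

# One value per original row, sharing df's index.
per_row = grouped.transform("first")

print(len(per_group), len(per_row))
```

Because per_row shares df's index, df.Time_ms - per_row lines up row by row, whereas per_group is indexed by ContextID and would not align with df's row index directly.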
