diff() function output - time series format

Problem description

I can't format the time units in the output of the diff() function.

This is the code:

import pandas as pd
from numpy import random

df = pd.DataFrame(data = random.randn(5,4), index = ['A','B','C','D','E'],
columns = ['W','X','Y','Z'])

df['W'] = ['10/01/2018 12:00:00','10/03/2018 13:00:00',
           '10/03/2018 12:30:00','10/04/2018 12:05:00',
           '10/08/2018 12:00:15']

df['W']=pd.to_datetime(df['W'])
df['delta']=df['W'].diff()
df

This is what I get (see the 'delta' column):

    W           X           Y           Z           delta
A   2018-10-01  0.218683    1.704266    1.035627    NaT
B   2018-10-03  -1.362903   1.251404    -0.296558   2 days 01:00:00
C   2018-10-03  1.288930    -1.692359   1.185029    -1 days +23:30:00
D   2018-10-04  1.355021    1.144945    -1.294918   0 days 23:35:00
E   2018-10-08  -0.572535   0.236500    -0.435992   3 days 23:55:15

This is what I would like to get in the 'delta' column:

    W           X           Y           Z           delta
A   2018-10-01  0.218683    1.704266    1.035627    NaT
B   2018-10-03  -1.362903   1.251404    -0.296558   2.04
C   2018-10-03  1.288930    -1.692359   1.185029    -0.02
D   2018-10-04  1.355021    1.144945    -1.294918   0.98
E   2018-10-08  -0.572535   0.236500    -0.435992   3.99

Any ideas?

Thanks for your help!

Tags: python-3.x, pandas

Solution


Convert the timedeltas to days: get the total seconds with Series.dt.total_seconds, divide by 86400 (that is, 60 * 60 * 24), and finally round:

df['delta']=df['W'].diff().dt.total_seconds().div(86400).round(2)
print(df)
                    W         X         Y         Z  delta
A 2018-10-01 12:00:00  0.821455  1.481278  1.331864    NaN
B 2018-10-03 13:00:00  0.685609  0.573761  0.287728   2.04
C 2018-10-03 12:30:00  0.953490 -1.689625 -0.344943  -0.02
D 2018-10-04 12:05:00 -0.514984  0.244509 -0.189313   0.98
E 2018-10-08 12:00:15  0.464802  0.845930 -0.503542   4.00
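
As a cross-check, here is a minimal self-contained sketch that applies the same conversion to just the timestamps from the question and compares it with dividing by a one-day pd.Timedelta. The variable names (w, diff, via_seconds, via_timedelta) and the explicit format string are assumptions made for this illustration and are not part of the original answer.

```python
import pandas as pd

# Timestamps from the question; the month/day/year format string is an
# assumption based on how the question's output parses them.
w = pd.to_datetime(
    pd.Series(
        ['10/01/2018 12:00:00', '10/03/2018 13:00:00',
         '10/03/2018 12:30:00', '10/04/2018 12:05:00',
         '10/08/2018 12:00:15'],
        index=list('ABCDE'),
    ),
    format='%m/%d/%Y %H:%M:%S',
)

# Row-to-row differences as timedeltas; the first row has no predecessor (NaT).
diff = w.diff()

# Approach from the answer: total seconds divided by the seconds in one day.
via_seconds = diff.dt.total_seconds().div(86400).round(2)

# Alternative: dividing timedeltas by a one-day Timedelta also yields floats.
via_timedelta = (diff / pd.Timedelta(days=1)).round(2)

print(via_seconds)
print(via_timedelta)
```

Both variants should produce the same float column, with NaN in the first row. Note that the last row comes out as 4.0 rather than the 3.99 shown in the question, because 3 days 23:55:15 is about 3.9967 days and rounds up at two decimals.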
