首页 > 解决方案 > 计算每分钟的列值总和

问题描述

我有一个包含 4 列的数据框,例如:

Date        Time        Val1     Val2
17-01-2019  09:15:03     -       -118 
17-01-2019  09:15:03     2       -   
17-01-2019  09:15:05     75      -   
17-01-2019  09:15:06     -       -59 
17-01-2019  09:15:06     -       -41 
17-01-2019  09:15:08     64      -   
17-01-2019  09:15:08     -       -39 
17-01-2019  09:15:10     17      -   
17-01-2019  09:15:10     15      -   
17-01-2019  09:15:56     25      -   
17-01-2019  09:15:56     -       -70 
17-01-2019  09:15:57     -       -9 
17-01-2019  09:15:59     5       -   
17-01-2019  09:15:59     19      -   
17-01-2019  09:16:01     26      -   
17-01-2019  09:16:01     -       -7 
17-01-2019  09:16:02     23      -   
17-01-2019  09:16:03     -       -5 
17-01-2019  09:16:05     -       -73 
17-01-2019  09:16:52     -       -15 
17-01-2019  09:16:53     -       -   
17-01-2019  09:16:53     -       -11 
17-01-2019  09:16:55     -       -20 
17-01-2019  09:16:57     -       -21 
17-01-2019  09:16:57     4       -   
17-01-2019  09:16:58     17      -   
17-01-2019  09:17:00     -       -7 
17-01-2019  09:17:00     -       -2 
17-01-2019  09:17:02     -       -41 
17-01-2019  09:17:02     21      -   
17-01-2019  09:17:04     10      -   
17-01-2019  09:17:54     -       -14 
17-01-2019  09:17:54     -       -17 
17-01-2019  09:17:56     -       -8 
17-01-2019  09:17:57     -       -16 
17-01-2019  09:17:57     -       -3 
17-01-2019  09:17:59     -       -31 
17-01-2019  09:17:59     5       -   
18-01-2019  09:15:01     15      45
18-01-2019  09:15:08     24      56
18-01-2019  09:16:23     12      24  
18-01-2019  09:16:01     19      41

我需要每分钟计算 Val1 和 Val2 列的总和。我的意思是对09:1509:1609:17等的所有数据(Val1 和 Val2)进行分组和添加

我希望我的结果为:

Date        Time        Val1     Val2
17-01-2019  09:15:00     222    -336
17-01-2019  09:16:00     70     -152
17-01-2019  09:17:00     36     -139
18-01-2019  09:15:00     39      101
18-01-2019  09:16:00     31      65

标签: pythonpython-3.xpandasdatetime

解决方案


DatetimeIndex使用DataFrame.popand创建to_datetime,然后将所有列转换为数字,并使用to_numeric调用:DataFrame.resamplesum

df.index = pd.to_datetime(df.pop('Date') + ' ' + df.pop('Time'))
df = df.apply(pd.to_numeric, errors='coerce').resample('1Min').sum()

print (df)
                      Val1   Val2
2019-01-17 09:15:00  222.0 -336.0
2019-01-17 09:16:00   70.0 -152.0
2019-01-17 09:17:00   36.0 -139.0

如有必要,DatetimeIndex可使用DataFrame.insertDatetimeIndex.strftime下列:

df.insert(0, 'Date', df.index.strftime('%d-%m-%Y'))
df.insert(1, 'Time', df.index.strftime('%H:%M:%S'))
#alternative
#df.insert(0, 'Date', df.index.date)
#df.insert(1, 'Time', df.index.time)
df  = df.reset_index(drop=True)
print (df)
         Date      Time   Val1   Val2
0  17-01-2019  09:15:00  222.0 -336.0
1  17-01-2019  09:16:00   70.0 -152.0
2  17-01-2019  09:17:00   36.0 -139.0

推荐阅读