python - Pandas 与每日间隔的值的最大差异
问题描述
df
Timestamp Spot DK1 Spot DK2 Ubalance DK1 Ubalance DK2
0 2020-01-01T00:00:00+01:00 33.42 33.42 34.00 34.00
1 2020-01-01T01:00:00+01:00 31.77 31.77 34.00 34.00
2 2020-01-01T02:00:00+01:00 31.57 31.57 34.00 34.00
3 2020-01-01T03:00:00+01:00 31.28 31.28 34.00 34.00
4 2020-01-01T04:00:00+01:00 30.85 30.85 26.00 26.00
5 2020-01-01T05:00:00+01:00 30.14 30.14 25.00 25.00
6 2020-01-01T06:00:00+01:00 30.17 30.17 24.00 24.00
7 2020-01-01T07:00:00+01:00 30.00 30.00 24.00 24.00
8 2020-01-01T08:00:00+01:00 30.63 30.63 24.00 24.00
9 2020-01-01T09:00:00+01:00 30.59 30.59 25.00 25.00
10 2020-01-01T10:00:00+01:00 30.27 30.27 25.00 25.00
11 2020-01-01T11:00:00+01:00 30.34 30.34 25.00 25.00
12 2020-01-01T12:00:00+01:00 30.59 30.59 30.59 30.59
13 2020-01-01T13:00:00+01:00 30.04 30.04 30.04 30.04
14 2020-01-01T14:00:00+01:00 30.60 30.60 30.60 30.60
15 2020-01-01T15:00:00+01:00 31.09 31.09 31.09 31.09
16 2020-01-01T16:00:00+01:00 31.53 31.53 31.53 31.53
17 2020-01-01T17:00:00+01:00 31.78 31.78 33.50 33.50
18 2020-01-01T18:00:00+01:00 31.64 31.64 33.50 33.50
19 2020-01-01T19:00:00+01:00 31.44 31.44 31.44 31.44
20 2020-01-01T20:00:00+01:00 31.35 31.35 31.35 31.35
21 2020-01-01T21:00:00+01:00 31.07 31.07 31.07 31.07
22 2020-01-01T22:00:00+01:00 30.96 30.96 25.00 25.00
23 2020-01-01T23:00:00+01:00 30.61 30.61 21.00 21.00
24 2020-01-02T00:00:00+01:00 30.78 30.78 20.00 20.00
25 2020-01-02T01:00:00+01:00 30.64 30.64 20.00 20.00
26 2020-01-02T02:00:00+01:00 30.43 30.43 20.00 20.00
27 2020-01-02T03:00:00+01:00 28.79 28.79 23.00 23.00
28 2020-01-02T04:00:00+01:00 28.42 28.42 22.73 22.73
29 2020-01-02T05:00:00+01:00 28.75 28.75 23.00 23.00
30 2020-01-02T06:00:00+01:00 33.38 34.16 22.50 22.50
31 2020-01-02T07:00:00+01:00 31.79 42.07 22.28 22.28
32 2020-01-02T08:00:00+01:00 31.83 44.89 22.50 22.50
33 2020-01-02T09:00:00+01:00 31.74 45.26 23.00 23.00
34 2020-01-02T10:00:00+01:00 31.63 45.57 24.00 24.00
35 2020-01-02T11:00:00+01:00 31.32 45.09 25.00 25.00
36 2020-01-02T12:00:00+01:00 31.07 45.16 25.00 25.00
37 2020-01-02T13:00:00+01:00 31.06 44.90 25.00 25.00
38 2020-01-02T14:00:00+01:00 31.07 44.06 26.00 26.00
39 2020-01-02T15:00:00+01:00 31.26 44.84 26.00 26.00
40 2020-01-02T16:00:00+01:00 31.41 44.40 27.50 27.50
41 2020-01-02T17:00:00+01:00 31.40 46.05 26.00 46.05
42 2020-01-02T18:00:00+01:00 31.10 46.72 26.00 26.00
43 2020-01-02T19:00:00+01:00 30.75 45.26 25.32 25.32
44 2020-01-02T20:00:00+01:00 30.47 39.32 20.25 20.25
45 2020-01-02T21:00:00+01:00 30.10 30.10 16.50 16.50
46 2020-01-02T22:00:00+01:00 29.71 29.71 16.50 16.50
47 2020-01-02T23:00:00+01:00 24.99 24.99 15.00 15.00
48 2020-01-03T00:00:00+01:00 18.93 18.93 15.00 15.00
49 2020-01-03T01:00:00+01:00 9.98 9.98 9.98 9.98
我想生成一组单独的值,其中每天的值的最大差异在其中。最好能够选择要包含的列。
因此,该系列中的第一个值应该是 12.5,因为第一天的最大值和最小值分别是 33.5 和 21.00。
预期结果:
Day Max diff
0 2020-01-01 12.50
1 2020-01-02 31.72
到目前为止,我已经尝试过这个 - 但这给了我一个滚动的 24 小时间隔,跨越天,这是我试图避免的:
cols = ['Spot DK1','Spot DK2','Ubalance DK1','Ubalance DK2']
N = 24
battery['dif'] = (battery[cols].stack(dropna=False)
.rolling(len(cols) * N)
.agg(lambda x: x.max() - x.min())
.groupby(level=0)
.max())
解决方案
想法是每天聚合最小值和最大值,因此可能 getmax
和min
bymax
和min
level indf1
和 last 减去 new df2
:
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
cols = ['Spot DK1','Spot DK2','Ubalance DK1','Ubalance DK2']
df1 = (df.set_index('Timestamp')[cols]
.groupby(pd.Grouper(freq='D', level='Timestamp'))
.agg(['min','max']))
s1 = df1.xs('max', level=1, axis=1).max(axis=1)
s2 = df1.xs('min', level=1, axis=1).min(axis=1)
df2 = s1.sub(s2).rename_axis('Day').reset_index(name='Max diff')
print (df2)
Day Max diff
0 2020-01-01 00:00:00+01:00 13.00
1 2020-01-02 00:00:00+01:00 31.72
2 2020-01-03 00:00:00+01:00 8.95
细节:
print (df1)
Spot DK1 Spot DK2 Ubalance DK1 \
min max min max min max
Timestamp
2020-01-01 00:00:00+01:00 30.00 33.42 30.00 33.42 21.00 34.0
2020-01-02 00:00:00+01:00 24.99 33.38 24.99 46.72 15.00 27.5
2020-01-03 00:00:00+01:00 9.98 18.93 9.98 18.93 9.98 15.0
Ubalance DK2
min max
Timestamp
2020-01-01 00:00:00+01:00 21.00 34.00
2020-01-02 00:00:00+01:00 15.00 46.05
2020-01-03 00:00:00+01:00 9.98 15.00
推荐阅读
- youtube - YouTube API V3 查找不在播放列表中的频道视频
- kubernetes - 如何使用 Kubernetes 服务发现进行端口发现?
- sql - Oracle Reyclebin 不释放空间
- javascript - 使用代理浏览器同步注入 css 文件
- java - java.lang.NoSuchMethodError: org.springframework.core.annotation.AnnotationUtils.clearCache()V
- checkbox - 更改时取消禁用 for 循环中的复选框
- javascript - 在 Angular 5 中访问函数
- google-apps-script - 扫描一列中的单元格以获取特定文本值,然后在另一列的同一行中输入不同的文本值
- ionic-framework - 我如何在 Model Ionic 中监听关闭事件
- go - 连续向多个通道发送数据