python - Is it possible to pivot a dataframe and get monthly totals?
Problem description
I have a dataframe of customer purchases:
customer shop amount local_date
0 John WALLMART 1.50 2019-04-10
1 John WALLMART 40.79 2019-05-10
2 John LIDL 2.64 2019-08-18
3 John WALLMART 29.17 2019-02-18
4 John LIDL 42.69 2019-07-22
5 John WALLMART 1.50 2019-09-16
6 John WALLMART 40.79 2019-09-17
7 Mary WALLMART 2.64 2019-05-08
8 Mary LIDL 29.17 2019-02-07
9 Mary WALLMART 28.23 2019-02-21
10 Mary ALDI 8.84 2019-10-15
11 Mary WALLMART 5.59 2019-03-23
12 Mary LIDL 53.09 2019-01-03
13 Mary LIDL 46.03 2019-02-03
14 Mary WALLMART 84.17 2019-10-18
15 Paul LIDL 4.63 2019-02-21
16 Paul WALLMART 19.82 2019-02-13
17 Paul ALDI 19.02 2019-12-12
18 Paul LIDL 41.88 2019-06-25
19 Paul ALDI 37.79 2019-12-18
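To reproduce the frame above, a minimal sketch along these lines should work. It assumes local_date is stored as a datetime column rather than as strings, since the .dt accessor and pd.Grouper(freq=...) used further down both rely on that; the variable name rows is only for illustration.

```python
import pandas as pd

# Sample purchases copied from the table above
rows = [
    ("John", "WALLMART", 1.50, "2019-04-10"),
    ("John", "WALLMART", 40.79, "2019-05-10"),
    ("John", "LIDL", 2.64, "2019-08-18"),
    ("John", "WALLMART", 29.17, "2019-02-18"),
    ("John", "LIDL", 42.69, "2019-07-22"),
    ("John", "WALLMART", 1.50, "2019-09-16"),
    ("John", "WALLMART", 40.79, "2019-09-17"),
    ("Mary", "WALLMART", 2.64, "2019-05-08"),
    ("Mary", "LIDL", 29.17, "2019-02-07"),
    ("Mary", "WALLMART", 28.23, "2019-02-21"),
    ("Mary", "ALDI", 8.84, "2019-10-15"),
    ("Mary", "WALLMART", 5.59, "2019-03-23"),
    ("Mary", "LIDL", 53.09, "2019-01-03"),
    ("Mary", "LIDL", 46.03, "2019-02-03"),
    ("Mary", "WALLMART", 84.17, "2019-10-18"),
    ("Paul", "LIDL", 4.63, "2019-02-21"),
    ("Paul", "WALLMART", 19.82, "2019-02-13"),
    ("Paul", "ALDI", 19.02, "2019-12-12"),
    ("Paul", "LIDL", 41.88, "2019-06-25"),
    ("Paul", "ALDI", 37.79, "2019-12-18"),
]
df = pd.DataFrame(rows, columns=["customer", "shop", "amount", "local_date"])

# Parse the date strings so that .dt and time-based grouping are available
df["local_date"] = pd.to_datetime(df["local_date"])
```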
I can pivot it and get the total per shop for each customer:
df.pivot_table(values='amount', index=['customer'], columns=['shop'], aggfunc='sum').reset_index().fillna(0)
shop customer ALDI LIDL WALLMART
0 John 0.00 45.33 113.75
1 Mary 8.84 128.29 120.63
2 Paul 56.81 46.51 19.82
How can I get the amount each customer spent at each shop per month?
I have tried a few things that I intended to reshape into the format I need:
# this makes no sense to me
df.set_index('local_date').groupby([pd.Grouper(freq='M'),'customer','shop'])['amount'].sum()
local_date  customer  shop
2019-01-31  Mary      LIDL        53.09
2019-02-28  John      WALLMART    29.17
            Mary      LIDL        75.20
                      WALLMART    28.23
            Paul      LIDL         4.63
                      WALLMART    19.82
2019-03-31  Mary      WALLMART     5.59
2019-04-30  John      WALLMART     1.50
2019-05-31  John      WALLMART    40.79
            Mary      WALLMART     2.64
2019-06-30  Paul      LIDL        41.88
2019-07-31  John      LIDL        42.69
2019-08-31  John      LIDL         2.64
2019-09-30  John      WALLMART    42.29
2019-10-31  Mary      ALDI         8.84
                      WALLMART    84.17
2019-12-31  Paul      ALDI        56.81
I also created a dataframe by grouping on dt.month and then pivoted it, but I ended up with the same pivot table I started with:
# create dataframe grouped by monthly sum
newd = df.groupby([df.local_date.dt.month,'customer','shop'])['amount'].sum().to_frame()
#pivoting
newd.pivot_table(values='amount', index=['customer'], columns=['shop'], aggfunc='sum').reset_index().fillna(0)
shop customer ALDI LIDL WALLMART
0 John 0.00 45.33 113.75
1 Mary 8.84 128.29 120.63
2 Paul 56.81 46.51 19.82
Solution
Use groupby with the month obtained from dt.to_period('M'):
df.groupby([df['local_date'].dt.to_period('M'),'customer','shop'])['amount'].sum()
Output:
local_date  customer  shop
2019-01     Mary      LIDL        53.09
2019-02     John      WALLMART    29.17
            Mary      LIDL        75.20
                      WALLMART    28.23
            Paul      LIDL         4.63
                      WALLMART    19.82
2019-03     Mary      WALLMART     5.59
2019-04     John      WALLMART     1.50
2019-05     John      WALLMART    40.79
            Mary      WALLMART     2.64
2019-06     Paul      LIDL        41.88
2019-07     John      LIDL        42.69
2019-08     John      LIDL         2.64
2019-09     John      WALLMART    42.29
2019-10     Mary      ALDI         8.84
                      WALLMART    84.17
2019-12     Paul      ALDI        56.81
Name: amount, dtype: float64
If you want shop as columns, you can unstack that level:
(df.groupby([df['local_date'].dt.to_period('M'),'customer','shop'])['amount']
.sum().unstack('shop', fill_value=0)
)
Output:
shop                  ALDI    LIDL  WALLMART
local_date customer
2019-01    Mary       0.00   53.09      0.00
2019-02    John       0.00    0.00     29.17
           Mary       0.00   75.20     28.23
           Paul       0.00    4.63     19.82
2019-03    Mary       0.00    0.00      5.59
2019-04    John       0.00    0.00      1.50
2019-05    John       0.00    0.00     40.79
           Mary       0.00    0.00      2.64
2019-06    Paul       0.00   41.88      0.00
2019-07    John       0.00   42.69      0.00
2019-08    John       0.00    2.64      0.00
2019-09    John       0.00    0.00     42.29
2019-10    Mary       8.84    0.00     84.17
2019-12    Paul      56.81    0.00      0.00
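If you would rather stay with pivot_table, as in the question, the same shape can probably be reached by putting the month into the index next to customer. The sketch below is not part of the original answer. It uses only a handful of rows from the question's data (February and October 2019) to stay short, and the helper column name month and the variable name monthly are assumptions made for illustration.

```python
import pandas as pd

# A subset of the question's rows, enough to show the shape of the result
df = pd.DataFrame({
    "customer": ["John", "Mary", "Mary", "Mary", "Paul", "Paul", "Mary", "Mary"],
    "shop": ["WALLMART", "LIDL", "WALLMART", "LIDL",
             "LIDL", "WALLMART", "ALDI", "WALLMART"],
    "amount": [29.17, 29.17, 28.23, 46.03, 4.63, 19.82, 8.84, 84.17],
    "local_date": pd.to_datetime([
        "2019-02-18", "2019-02-07", "2019-02-21", "2019-02-03",
        "2019-02-21", "2019-02-13", "2019-10-15", "2019-10-18",
    ]),
})

monthly = (
    # Add a monthly period column (hypothetical name) to pivot on
    df.assign(month=df["local_date"].dt.to_period("M"))
      # Keeping month in the index prevents the months from being summed together
      .pivot_table(values="amount", index=["month", "customer"],
                   columns="shop", aggfunc="sum", fill_value=0)
)
print(monthly)
```

The pivot in the question collapsed back to per-customer totals because month was not among the index or columns keys, so pivot_table summed across it again.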