python - Python: pivot table based on the month in a date
Problem description
I want to pivot my table based on the month of each date.
Expected example result:
jul,revenue,aug,revenue,sept,revenue,okt,revenue,nov,revenue,dec,revenue,jan,revenue,feb,revenue,mar,revenue,apr,revenue,may,revenue,jun,revenue
Jul-2017,1000,Aug-2017,1000,Sep-2017,1000,Oct-2017,1000,Nov-2017,1000,Dec-2017,1000,Jan-2018,1000,Feb-2018,1000,Mar-2018,1000,Apr-2018,1000,May-2018,1000,Jun-2018,1000
Jul-2018,1000,Aug-2018,1000,Sep-2018,1000,Oct-2018,1000,Nov-2018,1000,Dec-2018,1000,Jan-2019,1000,Feb-2019,1000,Mar-2019,1000,Apr-2019,1000,May-2019,1000,Jun-2019,1000
Jul-2019,1000,Aug-2019,1000,Sep-2019,1000,Oct-2019,1000,Nov-2019,1000,Dec-2019,1000,Jan-2020,1000,Feb-2020,1000,Mar-2020,1000,Apr-2020,0,May-2020,0,Jun-2020,0
I tried it with this code:
import pandas as pd
from datetime import date
df = pd.read_excel (r'examplefile')
ndf = df[['Saleprice', 'Date' , 'Season', 'Area', 'Place', 'Bookingsnumber']].reset_index(drop=True)
ndf['Date'] = pd.to_datetime(ndf['Date'])
#Revenue per Year/Month
ndf['my'] = ndf['Date'].map(lambda x: x.strftime('%m/%Y'))
grouped_ndf = ndf.groupby('my').sum().reset_index()
print(grouped_ndf)
The console gave me this result:
my Saleprice
0 01/2019 2000
1 01/2020 3000
2 02/2019 4000
3 02/2020 6000
Example file:
Saleprice,Date,Season,Area,Place,Bookingsnumber,Buy
1000,1-1-2019,winter 2019/2020,Zillertal,Mayrhofen,C20015243,800
1000,3-2-2019,winter 2019/2020,Zillertal,Königsleiten,B20015245,800
1000,5-2-2019,winter 2019/2020,Zell am See / Kaprun,Zell am See,C20015246,800
1000,1-1-2020,winter 2019/2020,Zell am See / Kaprun,Zell am See,D20015248,800
1000,3-2-2020,winter 2019/2020,Les Trois Vallées,Val Thorens,C20015249,800
1000,5-2-2020,winter 2019/2020,Zillertal,Königsleiten,C20015251,800
1000,1-1-2019,winter 2019/2020,Zillertal,Königsleiten,C20015252,800
1000,3-2-2019,winter 2019/2020,Espace Killy - Tignes,Tignes-Les-Brévières,C20015250,800
1000,5-2-2019,winter 2019/2020,Zillertal,Gerlos,B20015253,800
1000,1-1-2020,winter 2020/2021,Les Trois Vallées,Les Menuires,C20015254,800
1000,3-2-2020,winter 2020/2021,Zell am See / Kaprun,Zell am See,B20015255,800
1000,5-2-2020,winter 2020/2021,Zell am See / Kaprun,Zell am See,D20015256,800
1000,1-1-2020,winter 2020/2021,Les Trois Vallées,Les Menuires,E20015254,800
1000,3-2-2020,winter 2020/2021,Zell am See / Kaprun,Zell am See,B20015255,800
1000,5-2-2020,winter 2020/2021,Zell am See / Kaprun,Zell am See,B20015256,800
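What do I have to change to get my example result? Looking forward to your reply.
Solution
This code does the job. It is not the most Pythonic code, but it is all I could come up with: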
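# Parse every value in the Date column into a timestamp
df['Date'] = df['Date'].apply(lambda x: pd.to_datetime(x))
# Month abbreviations present in the data, in order of first appearance
months = df['Date'].apply(lambda x: x.strftime('%b')).unique()
# Label each row as 'Mon/YYYY' and sum the sale price per label
df['my'] = df['Date'].map(lambda x: x.strftime('%b/%Y'))
dfgrouped = df.groupby('my')['Saleprice'].sum().reset_index()
# Build one (label, revenue) frame per month and place them side by side
month_dfs = [*map(lambda x: dfgrouped[dfgrouped['my'].str[:3]==x].reset_index(drop=True).rename(columns={'my': x, 'Saleprice': 'revenue'}), months)]
new_df = pd.concat(month_dfs, axis=1)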
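To try this approach without the Excel file, here is a minimal sketch that builds a few rows of the example data in memory. The day-month-year format string is an assumption based on the console output in the question (where 3-2-2019 and 5-2-2019 both land in February), and the list comprehension is only a restyling of the `map` call, not a different method.

```python
import pandas as pd

# A few rows from the example file (assumption: dates are day-month-year strings)
df = pd.DataFrame({
    "Saleprice": [1000, 1000, 1000, 1000, 1000, 1000],
    "Date": ["1-1-2019", "3-2-2019", "5-2-2019", "1-1-2020", "3-2-2020", "5-2-2020"],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Label each row as 'Mon/YYYY' and sum the sale price per label
df["my"] = df["Date"].dt.strftime("%b/%Y")
months = df["Date"].dt.strftime("%b").unique()
dfgrouped = df.groupby("my")["Saleprice"].sum().reset_index()

# One (label, revenue) frame per month, placed side by side
month_dfs = [
    dfgrouped[dfgrouped["my"].str[:3] == m]
    .reset_index(drop=True)
    .rename(columns={"my": m, "Saleprice": "revenue"})
    for m in months
]
new_df = pd.concat(month_dfs, axis=1)
print(new_df)
```

Each month contributes a pair of columns (the month label and `revenue`), and each row holds one year, which matches the layout of the expected result.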
Note that pd.to_datetime treats the first number in the Date column as the month. If you want the second number to be the month, you have to do the following:
from datetime import datetime
df['Date'] = df['Date'].apply(lambda x: datetime.strptime(x, '%d-%m-%Y'))
months = df['Date'].apply(lambda x: x.strftime('%b')).unique()
df['my'] = df['Date'].map(lambda x: x.strftime('%b/%Y'))
dfgrouped = df.groupby('my')['Saleprice'].sum().reset_index()
month_dfs = [*map(lambda x: dfgrouped[dfgrouped['my'].str[:3]==x].reset_index(drop=True).rename(columns={'my': x, 'Saleprice': 'revenue'}), months)]
new_df = pd.concat(month_dfs, axis=1)
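The difference between the two parsing modes is easy to check on a single value. The sketch below also shows two pandas-only alternatives to `datetime.strptime`, the `dayfirst` and `format` arguments of `pd.to_datetime`; the variable names are only for illustration.

```python
import pandas as pd

raw = "3-2-2019"

# Default parsing reads an ambiguous date month-first: 2 March 2019
month_first = pd.to_datetime(raw)
# dayfirst=True reads the same string as 3 February 2019
day_first = pd.to_datetime(raw, dayfirst=True)
# An explicit format avoids any guessing
explicit = pd.to_datetime(raw, format="%d-%m-%Y")

print(month_first.date(), day_first.date(), explicit.date())
```

The same arguments work on a whole column, for example `pd.to_datetime(df['Date'], format='%d-%m-%Y')`.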
If you want to sort the months:
from datetime import date, datetime
# Parse the dates month-first; use '%d-%m-%Y' instead if the day comes first
df['Date'] = df['Date'].apply(lambda x: datetime.strptime(x, '%m-%d-%Y'))
months_in_df = df['Date'].apply(lambda x: x.strftime('%b')).unique()
# Collect the month abbreviations that occur in the data, in calendar order
months = []
for i in range(1, 13):
    abbr = date(2020, i, 1).strftime('%b')
    if abbr in months_in_df:
        months.append(abbr)
df['my'] = df['Date'].map(lambda x: x.strftime('%b/%Y'))
dfgrouped = df.groupby('my')['Saleprice'].sum().reset_index()
month_dfs = [*map(lambda x: dfgrouped[dfgrouped['my'].str[:3]==x].reset_index(drop=True).rename(columns={'my': x, 'Saleprice': 'revenue'}), months)]
new_df = pd.concat(month_dfs, axis=1)
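The expected result in the question runs from July to June rather than January to December. As an alternative to the approach above (not the answerer's method), a rough sketch with `pivot_table` could look like this. The names `season_start` and `month_order`, the July start of the season, and the day-month-year format are all assumptions made for illustration.

```python
import pandas as pd

# A few rows from the example file (assumption: dates are day-month-year strings)
df = pd.DataFrame({
    "Saleprice": [1000, 1000, 1000, 1000],
    "Date": ["1-1-2019", "3-2-2019", "5-2-2019", "1-1-2020"],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Assumed season: July to June. January-June belong to the season
# that started in the previous calendar year.
df["season_start"] = df["Date"].dt.year - (df["Date"].dt.month < 7).astype(int)
df["month"] = df["Date"].dt.strftime("%b")

month_order = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
               "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# One row per season, one column per month, summed revenue in the cells;
# months without any sales are filled with 0
revenue = (
    df.pivot_table(index="season_start", columns="month",
                   values="Saleprice", aggfunc="sum", fill_value=0)
      .reindex(columns=month_order, fill_value=0)
)
print(revenue)
```

This gives one revenue column per month instead of the label/revenue pairs shown in the expected result, so it is a different shape, but the Jul-to-Jun ordering and the zero-filled months follow the same idea.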