python-3.x - Calculating in pandas based on a date
Problem description
I have this DataFrame:
import numpy as np
import pandas as pd

a = [1, 2, 3, 4, 5]
b = ['2019-08-01', '2019-09-01', '2019-10-23', '2019-11-12', '2019-11-30']
c = [12, 0, 0, 0, 0]
d = [0, 23, 0, 0, 0]
e = [12, 24, 35, 0, 0]
f = [0, 0, 44, 56, 82]
g = [21, 22, 17, 75, 63]
df = pd.DataFrame({'ID': a, 'Date': b, 'Unit_sold_8': c,
                   'Unit_sold_9': d, 'Unit_sold_10': e, 'Unit_sold_11': f,
                   'Unit_sold_12': g})
df['Date'] = pd.to_datetime(df['Date'])
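I want to compute the average sales for each ID based on its date. For example, if an ID's opening date is in September, the average sales for that ID should be computed from September onward. I tried np.select,
but I realized that this approach makes my code extremely long.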
col = df.columns
# One mask per opening month
mask1 = (df['Date'] >= "08/01/2019") & (df['Date'] < "09/01/2019")
mask2 = (df['Date'] >= "09/01/2019") & (df['Date'] < "10/01/2019")
mask3 = (df['Date'] >= "10/01/2019") & (df['Date'] < "11/01/2019")
mask4 = (df['Date'] >= "11/01/2019") & (df['Date'] < "12/01/2019")
mask5 = (df['Date'] >= "12/01/2019")
condition2 = [mask1, mask2, mask3, mask4, mask5]
# One row-wise mean per opening month, each starting at that month's column
result2 = [df[col[2:]].mean(skipna=True, axis=1),
           df[col[3:]].mean(skipna=True, axis=1),
           df[col[4:]].mean(skipna=True, axis=1),
           df[col[5:]].mean(skipna=True, axis=1),
           df[col[6:]].mean(skipna=True, axis=1)]
df.loc[:, 'Mean'] = np.select(condition2, result2, default=np.nan)
Is there a faster way to solve this, especially as the time range grows (12 months, 24 months, and so on)?
Solution
Does this help?
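from datetime import datetime
import numpy as np
from dateutil import relativedelta

# The reference date is "today", so n_months and avg depend on when this is run.
check_date = datetime.today()

def months_between(start, end):
    # Whole months elapsed; years are included so spans over 12 months are not truncated.
    delta = relativedelta.relativedelta(end, start)
    return delta.years * 12 + delta.months

df['n_months'] = df['Date'].apply(lambda x: months_between(x, check_date))
# Sum the Unit_sold_* columns: positions 2 up to, but not including, the new n_months column.
df['total'] = df.iloc[:, range(2, df.shape[1] - 1)].sum(axis=1)
df['avg'] = df['total'] / df['n_months']
print(df)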
   ID       Date  Unit_sold_8  ...  n_months  total    avg
0   1 2019-08-01           12  ...         5     45   9.00
1   2 2019-09-01            0  ...         4     69  17.25
2   3 2019-10-23            0  ...         3     96  32.00
3   4 2019-11-12            0  ...         2    131  65.50
4   5 2019-11-30            0  ...         2    145  72.50
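If the goal is exactly what the question's np.select code computes (the mean of the monthly columns from the opening month onward, independent of today's date), one possible sketch is to mask out the columns that come before each row's opening month and take a row-wise mean. This is an alternative to the snippet above, not the same method. It assumes that every sales column is named Unit_sold_<month number> and that all columns fall within a single calendar year; the names unit_cols, col_months, open_months and keep are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'Date': pd.to_datetime(['2019-08-01', '2019-09-01', '2019-10-23',
                            '2019-11-12', '2019-11-30']),
    'Unit_sold_8': [12, 0, 0, 0, 0],
    'Unit_sold_9': [0, 23, 0, 0, 0],
    'Unit_sold_10': [12, 24, 35, 0, 0],
    'Unit_sold_11': [0, 0, 44, 56, 82],
    'Unit_sold_12': [21, 22, 17, 75, 63],
})

# Assumption: the month number is the suffix of each sales column name.
unit_cols = [c for c in df.columns if c.startswith('Unit_sold_')]
col_months = np.array([int(c.rsplit('_', 1)[1]) for c in unit_cols])

# Opening month of each ID (assumes the same year as the columns).
open_months = df['Date'].dt.month.to_numpy()

# keep[i, j] is True when column j is on or after row i's opening month.
keep = col_months[None, :] >= open_months[:, None]

# Cells before the opening month become NaN and are skipped by mean().
df['Mean'] = df[unit_cols].where(keep).mean(axis=1)
print(df[['ID', 'Date', 'Mean']])
```

Because the mask is built by broadcasting, the amount of code stays the same when more monthly columns are added. For ranges that span several years (24 months, for example), the column names would also need to carry the year, and the comparison would have to be made on year and month together rather than on the month number alone.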