python - Cumsum with Pandas groupby on two columns
Problem description
I have this DataFrame:
H = Home win
D = Draw
A = Away win
              Datetime          HomeTeam          AwayTeam  HG  AG FT
0  2021-02-17 22:00:00         Colo Colo  U. De Concepcion   1   0  H
1  2021-02-15 14:30:00          Cobresal       U. Espanola   4   1  H
2  2021-02-14 22:00:00  Deportes Iquique      S. Wanderers   2   0  H
3  2021-02-14 22:00:00         La Serena       A. Italiano   0   2  A
4  2021-02-14 22:00:00         O'Higgins         Colo Colo   1   1  D
..                 ...               ...               ...  ..  ..  ..
For the match in each row, I want to add up the team's home wins and away wins.
Code
#Creating Bool columns for cumsum
df['HomeWin'] = df['HG'] > df['AG']
df['Draw'] = df['HG'] == df['AG']
df['HomeLoss'] = df['HG'] < df['AG']
#Calculating previous wins of home team except current row
home_sum = df.groupby('HomeTeam')['HomeWin'].apply(lambda x: x.shift(fill_value=0).rolling(99,min_periods=1).sum())
#Calc previous matches of home team except current row
# Note: the boolean column is named 'HomeWin'; 'Win' does not exist and raises a KeyError
home_count = (
    df.groupby('HomeTeam')['HomeWin'].apply(lambda x: x.shift(fill_value=0).rolling(99, min_periods=1).sum())
    + df.groupby('HomeTeam')['Draw'].apply(lambda x: x.shift(fill_value=0).rolling(99, min_periods=1).sum())
    + df.groupby('HomeTeam')['HomeLoss'].apply(lambda x: x.shift(fill_value=0).rolling(99, min_periods=1).sum())
)
#Calculating previous wins of away team
away_sum = df.groupby('AwayTeam')['HomeLoss'].cumsum()
#Calc previous matches of away team
away_count = df.groupby('AwayTeam')['HomeLoss'].cumsum() + df.groupby('AwayTeam')['Draw'].cumsum() + df.groupby('AwayTeam')['HomeWin'].cumsum()
print(away_count)
df['SUM'] = (home_sum + away_sum) / (home_count + away_count)
Output
              Datetime   HomeTeam          AwayTeam  HG  AG FT     1     X     2       SUM
0  2021-02-17 22:00:00  Colo Colo  U. De Concepcion   1   0  H  2.53  3.01  2.80  0.285714
Home_sum = 6
Home_count = 17
Away_sum = 4
Away_count = 18
df['SUM'] = (6 + 4) / (17 + 18)
Expected output
Home_sum = 6
Home_count = 17
Away_sum = 3
Away_count = 17
df['SUM'] = (6 + 3) / (17 + 17)
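My problem is that the calculation does not count the matches of one and the same team; it combines the two different teams that appear in the same row. In the example, the error is that the away figures are taken from U. De Concepcion, the team in the AwayTeam column of that row, instead of from Colo Colo's own away matches.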
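A minimal illustration of that misalignment, using a made-up two-match frame (not the data above): `groupby('AwayTeam')[...].cumsum()` returns one value per row, and that value belongs to the row's away team. Row 0 therefore receives U. De Concepcion's running total, even though its home team is Colo Colo.

```python
import pandas as pd

# Hypothetical two-match sample, only to show how the result is aligned.
df = pd.DataFrame({
    "HomeTeam": ["Colo Colo", "U. De Concepcion"],
    "AwayTeam": ["U. De Concepcion", "Colo Colo"],
    "HG": [1, 0],
    "AG": [0, 3],
})
df["HomeLoss"] = (df["HG"] < df["AG"]).astype(int)  # 1 when the away team won

# Running count of away wins, grouped by the away team of each row.
away_sum = df.groupby("AwayTeam")["HomeLoss"].cumsum()

# Row 0: the away team is U. De Concepcion, so this is ITS away-win count,
# not Colo Colo's. Colo Colo's away win (row 1) is attached to row 1 only.
print(pd.concat([df[["HomeTeam", "AwayTeam"]], away_sum.rename("away_sum")], axis=1))
```

Adding `home_sum + away_sum` row by row thus mixes the home team's home record with the opponent's away record.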
Solution
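One possible approach, sketched under the assumption that the goal is the home team's win ratio over all of its earlier matches (home and away combined, current match excluded): reshape the frame so that every match contributes one row per team, compute the running totals per team, and map the home team's totals back onto the original rows. The sample data and the helper names `per_team`, `prev_wins` and `prev_played` are made up for illustration.

```python
import pandas as pd

# Made-up sample with the same column layout as the question's frame.
df = pd.DataFrame({
    "Datetime": pd.to_datetime(
        ["2021-02-01", "2021-02-05", "2021-02-10", "2021-02-14"]
    ),
    "HomeTeam": ["Colo Colo", "Cobresal", "Colo Colo", "O'Higgins"],
    "AwayTeam": ["Cobresal", "Colo Colo", "O'Higgins", "Colo Colo"],
    "HG": [1, 0, 2, 1],
    "AG": [0, 2, 2, 1],
})

# One row per (match, team): the home view and the away view of each match.
home = pd.DataFrame({
    "match": df.index,
    "Datetime": df["Datetime"],
    "team": df["HomeTeam"],
    "side": "home",
    "win": (df["HG"] > df["AG"]).astype(int),
})
away = pd.DataFrame({
    "match": df.index,
    "Datetime": df["Datetime"],
    "team": df["AwayTeam"],
    "side": "away",
    "win": (df["AG"] > df["HG"]).astype(int),
})
per_team = pd.concat([home, away], ignore_index=True)

# Process matches chronologically so "previous" means "earlier in time".
per_team = per_team.sort_values(["Datetime", "match"])

grouped = per_team.groupby("team")
# Number of earlier matches of the same team (0 for its first match).
per_team["prev_played"] = grouped.cumcount()
# Wins in earlier matches: running total minus the current match's result.
per_team["prev_wins"] = grouped["win"].cumsum() - per_team["win"]

# Map the home team's totals back to the original rows via the match id.
home_rows = per_team[per_team["side"] == "home"].set_index("match")
df["SUM"] = home_rows["prev_wins"] / home_rows["prev_played"]

print(df)
```

With this sample, the first Colo Colo row has no earlier matches, so its ratio is NaN; in its second home match Colo Colo has already won one home and one away game, giving 2 / 2 = 1.0. If the away team's ratio is wanted as well, the same lookup can be repeated with `side == "away"`.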