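python - How do I create a new column holding the sum of a value grouped by two other columns?
Problem description
First, apologies if my question is not worded well.
I will use a sample DataFrame to illustrate it.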
import pandas as pd

medals = pd.DataFrame({
    'Year': [2010, 2010, 2010, 2010, 2010, 2010,
             2014, 2014, 2014, 2014, 2014, 2014,
             2018, 2018, 2018, 2018, 2018, 2018],
    'Country': ['Canada', 'Canada', 'USA', 'USA', 'Germany', 'Germany',
                'Canada', 'Canada', 'USA', 'USA', 'Germany', 'Germany',
                'Canada', 'Canada', 'USA', 'USA', 'Germany', 'Germany'],
    'Sex': ['female', 'male', 'female', 'male', 'female', 'male',
            'female', 'male', 'female', 'male', 'female', 'male',
            'female', 'male', 'female', 'male', 'female', 'male'],
    'No. of medals': [2, 4, 2, 0, 3, 0, 1, 1, 3, 2, 4, 4, 1, 3, 2, 2, 1, 3],
})
Suppose I have this DataFrame of countries and the number of medals each won at the Olympics:
Year Country Sex No. of medals
0 2010 Canada female 2
1 2010 Canada male 4
2 2010 USA female 2
3 2010 USA male 0
4 2010 Germany female 3
5 2010 Germany male 0
6 2014 Canada female 1
7 2014 Canada male 1
8 2014 USA female 3
9 2014 USA male 2
10 2014 Germany female 4
11 2014 Germany male 4
12 2018 Canada female 1
13 2018 Canada male 3
14 2018 USA female 2
15 2018 USA male 2
16 2018 Germany female 1
17 2018 Germany male 3
I want to add a column showing the total number of medals each country won in that year:
Year Country Sex No. of medals Total medals
0 2010 Canada female 2 6
1 2010 Canada male 4 6
2 2010 USA female 2 2
3 2010 USA male 0 2
4 2010 Germany female 3 3
5 2010 Germany male 0 3
6 2014 Canada female 1 2
7 2014 Canada male 1 2
8 2014 USA female 3 5
9 2014 USA male 2 5
10 2014 Germany female 4 8
11 2014 Germany male 4 8
12 2018 Canada female 1 4
13 2018 Canada male 3 4
14 2018 USA female 2 4
15 2018 USA male 2 4
16 2018 Germany female 1 4
17 2018 Germany male 3 4
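How can I do this? I have already grouped by year and country and computed the sum, but I am not sure how to map the result back onto the rows using the Year and Country columns.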
medals.groupby(['Year','Country'])['No. of medals'].sum()
This gives me:
Year Country
2010 Canada 6
Germany 3
USA 2
2014 Canada 2
Germany 8
USA 5
2018 Canada 4
Germany 4
USA 4
Name: No. of medals, dtype: int64
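Any hints or pointers are much appreciated. Thanks!
Solution
Use groupby with transform. transform('sum') computes the sum for each (Country, Year) group and broadcasts it back to every row of that group, so the result has the same index as the original DataFrame and can be assigned directly as a new column: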
medals['Total medals']=medals.groupby(['Country','Year'])['No. of medals'].transform('sum')
print(medals)
Year Country Sex No. of medals Total medals
0 2010 Canada female 2 6
1 2010 Canada male 4 6
2 2010 USA female 2 2
3 2010 USA male 0 2
4 2010 Germany female 3 3
5 2010 Germany male 0 3
6 2014 Canada female 1 2
7 2014 Canada male 1 2
8 2014 USA female 3 5
9 2014 USA male 2 5
10 2014 Germany female 4 8
11 2014 Germany male 4 8
12 2018 Canada female 1 4
13 2018 Canada male 3 4
14 2018 USA female 2 4
15 2018 USA male 2 4
16 2018 Germany female 1 4
17 2018 Germany male 3 4
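For comparison, the grouped sum from the question can also be mapped back with a merge. The following is a minimal sketch on a shortened version of the sample data (the variable names `via_transform`, `totals`, and `via_merge` are only illustrative, not from the original answer); it suggests that both routes yield the same column, with transform skipping the intermediate table.

```python
import pandas as pd

# Shortened sample in the same shape as the question's data (assumption: subset for brevity).
medals = pd.DataFrame({
    'Year': [2010, 2010, 2010, 2010, 2014, 2014],
    'Country': ['Canada', 'Canada', 'USA', 'USA', 'Canada', 'Canada'],
    'Sex': ['female', 'male', 'female', 'male', 'female', 'male'],
    'No. of medals': [2, 4, 2, 0, 1, 1],
})

# Route 1: transform returns one value per original row, aligned on the original index.
via_transform = medals.groupby(['Country', 'Year'])['No. of medals'].transform('sum')

# Route 2: aggregate to one row per (Year, Country), then left-merge back onto the rows.
totals = (
    medals.groupby(['Year', 'Country'])['No. of medals']
    .sum()
    .reset_index(name='Total medals')
)
via_merge = medals.merge(totals, on=['Year', 'Country'], how='left')

print(via_transform.tolist())
print(via_merge['Total medals'].tolist())
```

The merge route can be handy when the aggregated table is needed on its own as well; when only the extra column is wanted, transform is the shorter option.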