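python - How to count elements with a specific attribute across different Series
Question
I need to calculate, from athlete_events.csv, the percentage of athletes who took part in both the Summer and the Winter Olympics.
I have tried assigning a value to each athlete, but I keep ending up in what looks like an infinite loop. The data looks like this: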
Name Sex Age Height Weight Team NOC Games Year Season City Sport Event Medal
A Dijiang M 24 180 80 China CHN 1992 Summer 1992 Summer Barcelona Basketball Basketball Men's Basketball NA
There is no actual error message, just a loop that never seems to finish.
import pandas as pd

df = pd.read_csv(r"C:\Users\Rorro\Desktop\desafio latam\athlete_events.csv")
pjt = df.loc[:, "Name"]
pjt = pjt.drop_duplicates()
temp = df.loc[:, ["Name", "Season"]]
total = 0
for i in pjt:
    for l, r in temp.iterrows():
        if i == r["Name"] and r["Season"] == "Winter":
            for n, m in temp.iterrows():
                if i == m["Name"] and m["Season"] == "Summer":
                    total += 1
                else:
                    pass
        elif i == r["Name"] and r["Season"] == "Summer":
            for n, m in temp.iterrows():
                if i == m["Name"] and m["Season"] == "Winter":
                    total += 1
                else:
                    pass
        else:
            continue
print(total)
Answer
How about this?
import pandas as pd

df = pd.DataFrame({'Season': ['winter', 'summer', 'winter', 'summer'],
                   'Name': ['a', 'b', 'c', 'a'],
                   'Year': [1992, 1996, 2004, 2000]})
print(df)
# Select the wanted seasons
selection = (df['Season'] == 'summer') | (df['Season'] == 'winter')
# Narrow the selection to the wanted years
selection = selection & (df['Year'].isin([1992, 1996, 2000]))
names = df[selection]['Name']
print(names)
# Count the distinct names among the selected rows
unique_count = len(names.unique())
print("\nDistinct items: {}".format(unique_count))
This prints:
   Season Name  Year
0  winter    a  1992
1  summer    b  1996
2  winter    c  2004
3  summer    a  2000
0    a
1    b
3    a
Name: Name, dtype: object

Distinct items: 2
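The snippet above counts distinct names among the rows that match a filter. For the percentage asked about in the question (athletes who appear in both seasons), one possible approach is to count the distinct seasons per athlete with groupby. This is only a minimal sketch on made-up toy data: it assumes, as the question's code does, that Name identifies an athlete, and that Season only ever takes the values "Summer" and "Winter".

```python
import pandas as pd

# Hypothetical toy data using the question's column names.
df = pd.DataFrame({
    "Name": ["a", "b", "c", "a", "b"],
    "Season": ["Winter", "Summer", "Winter", "Summer", "Summer"],
})

# Number of distinct seasons each athlete appears in (1 or 2 under the assumption above).
seasons_per_athlete = df.groupby("Name")["Season"].nunique()

# Athletes who appear in both Summer and Winter rows.
both = seasons_per_athlete[seasons_per_athlete == 2]

percentage = 100 * len(both) / len(seasons_per_athlete)
print("{} of {} athletes ({:.1f}%) competed in both seasons".format(
    len(both), len(seasons_per_athlete), percentage))
```

This makes a single pass over the data instead of nesting iterrows() loops, which is probably why the original code appears to hang on a large file: it re-scans every row for each athlete and again for each matching row.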