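python - How do I get values, and the number of rows with a given value, based on multiple conditions on a pandas DataFrame?
Problem description
I have a pandas DataFrame: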
Id       drove        swimmed      walked       winPerc
0        247.3        1050         782.4        1
1        37.65        1072         119.6        0.04
2        93.73        1404         3248         1
3        95.88        1069         21.49        0.1146
4        0            1034         640.8        0
5        128.1        1000         1016         0.9368
average  100.4433333  1104.833333  971.3816667
Min      0            1000         21.49
max      247.3        1404         3248
winPerc = 1 means the player won in first place; likewise, winPerc = 0 tells us the player finished last.
print("The person who ends up winning the match usually drives {:.2f} , swims {:.2f} meters, has a walked {} meters".format(df.set_index('drove')['winPerc'].idxmax(),df.set_index('swimmed')['winPerc'].idxmax(),df.set_index('walked')['winPerc'].idxmax()))
For this, I get:
IndexError: tuple index out of range
What I want: as you can see in the DataFrame above, the rows with Id 0 and 2 have winPerc = 1, so I should get a response like this:
The person who ends up winning the match usually drives 170.52 , swims 1227 meters, has a walked 2015.2 meters
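If several records have winPerc = 1, I should get the corresponding values (averaged, as in the example above).
It is also possible that a player who did not drive at all (drove = 0) won the match (winPerc = 1):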
print("{} number of confident Players won without driving".format(len(df['drove'].min()['winPerc'].idxmax())))
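For this, I get this error:
IndexError: invalid index to scalar variable.
If no row has a column value equal to min(), max(), or mean(), I should take the value closest to it for that particular case.
Thanks in advance, and let me know if I need to explain more. :)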
Solution
I copied the first print statement without changing anything, and it works fine for me:
The person who ends up winning the match usually drives 247.30 , swims 1050.00 meters, has a walked 782.4 meters
When you use .format() and get IndexError: tuple index out of range, it means you called it with fewer arguments than the string has placeholders.
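To illustrate the point about .format(), here is a minimal sketch that triggers the error on purpose. The template string and the numbers are shortened stand-ins for the ones in the question, and the exact message text varies between Python versions.

```python
# Hypothetical, shortened version of the template from the question.
template = "drives {:.2f}, swims {:.2f} meters, walked {} meters"

# Three placeholders and three arguments: formatting succeeds.
ok_message = template.format(247.3, 1050, 782.4)
print(ok_message)

# Three placeholders but only two arguments: .format() raises IndexError.
error_message = None
try:
    template.format(247.3, 1050)
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)
```

If the original print statement raises this error, it is worth checking that the closing parenthesis of .format() really encloses all three arguments.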
For the second question, you need to filter your DataFrame. This can be done in different ways; using boolean masks is a common one.
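>> drove_is_0 = df["drove"] == df['drove'].min()
>> is_winner = df['winPerc'] == df['winPerc'].max()
Note that the winner mask compares against .max(), the highest winPerc value; .idxmax() would return the index label of that row instead.
Then apply your filters to your DataFrame:
>> filtered = df[drove_is_0 & is_winner]
Finally, print:
>> print("{} number of confident Players won without driving".format(len(filtered)))
0 number of confident Players won without driving
With the sample data above the count is 0, because the only player with drove = 0 (Id 4) has winPerc = 0.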
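The filter can also be reproduced end to end. The sketch below rebuilds the sample table by hand, which is an assumption about how df was created (the summary rows average/Min/max are left out), and counts matches by summing the combined mask instead of calling len() on the filtered frame.

```python
import pandas as pd

# Sample rows copied from the question; summary rows are omitted (assumption).
df = pd.DataFrame({
    "drove": [247.3, 37.65, 93.73, 95.88, 0, 128.1],
    "swimmed": [1050, 1072, 1404, 1069, 1034, 1000],
    "walked": [782.4, 119.6, 3248, 21.49, 640.8, 1016],
    "winPerc": [1, 0.04, 1, 0.1146, 0, 0.9368],
})

# Compare against the extreme values themselves, not the label from .idxmax().
drove_is_min = df["drove"] == df["drove"].min()
is_winner = df["winPerc"] == df["winPerc"].max()

# Summing a boolean Series counts its True entries.
winners_without_driving = int((drove_is_min & is_winner).sum())
print("{} number of confident Players won without driving".format(winners_without_driving))
```

On these six rows the two masks do not overlap: the only row with drove = 0 is Id 4, whose winPerc is 0.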
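OP clarified that the first question is not about the IndexError that was raised, but about filtering. They want to filter df on the rows where the winPerc column equals 1 and then compute the mean of the values in the other columns. For consistency, I will use a boolean mask as shown above: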
>> is_winner = df["winPerc"] == 1
>> mean_driven_winner = df[is_winner]["drove"].mean()
>> mean_swimmed_winner = df[is_winner]["swimmed"].mean()
>> mean_walked_winner = df[is_winner]["walked"].mean()
>> print("The person who ends up winning the match usually drives {:.2f} , swims {:.2f} meters, has a walked {} meters".format(
mean_driven_winner, mean_swimmed_winner, mean_walked_winner)
)
The person who ends up winning the match usually drives 170.52 , swims 1227.00 meters, has a walked 2015.2 meters
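As a possible shortening, the three means can be computed in one step by selecting the masked rows and the columns of interest with .loc. This is only a sketch: the DataFrame is rebuilt by hand from the question's sample rows, and the name winner_means is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question; summary rows are omitted (assumption).
df = pd.DataFrame({
    "drove": [247.3, 37.65, 93.73, 95.88, 0, 128.1],
    "swimmed": [1050, 1072, 1404, 1069, 1034, 1000],
    "walked": [782.4, 119.6, 3248, 21.49, 640.8, 1016],
    "winPerc": [1, 0.04, 1, 0.1146, 0, 0.9368],
})

is_winner = df["winPerc"] == 1

# Select winner rows and the three distance columns, then average each column.
winner_means = df.loc[is_winner, ["drove", "swimmed", "walked"]].mean()

print("The person who ends up winning the match usually drives {:.2f} , "
      "swims {:.2f} meters, has a walked {} meters".format(
          winner_means["drove"], winner_means["swimmed"], winner_means["walked"]))
```

Using a single .loc call avoids chained indexing such as df[is_winner]["drove"] and keeps the three means together in one Series.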