How do I get a value, and the number of rows with that value, based on multiple conditions on a pandas DataFrame?

Problem description

I have a pandas DataFrame:

Id  drove   swimmed walked  winPerc
0   247.3   1050    782.4   1
1   37.65   1072    119.6   0.04
2   93.73   1404    3248    1
3   95.88   1069    21.49   0.1146
4   0       1034    640.8   0
5   128.1   1000    1016    0.9368

average 100.4433333 1104.833333 971.3816667
min     0           1000        21.49
max     247.3       1404        3248

winPerc = 1 means the player finished in first place; likewise, winPerc = 0 means the player finished last.

print("The person who ends up winning the match usually drives {:.2f} , swims {:.2f} meters, has a walked {} meters".format(df.set_index('drove')['winPerc'].idxmax(),df.set_index('swimmed')['winPerc'].idxmax(),df.set_index('walked')['winPerc'].idxmax()))

For this, I get:

IndexError: tuple index out of range

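What I want: as you can see in the DataFrame above, the rows with Id 0 and 2 have winPerc = 1, so I should get a response like this:

The person who ends up winning the match usually drives 170.52 , swims 1227 meters, has a walked 2015.2 meters

If several records have winPerc = 1, I should get the corresponding values.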

It is also possible that a player did not drive at all (drove = 0) and still won the match (winPerc = 1):

print("{} number of confident Players won without driving".format(len(df['drove'].min()['winPerc'].idxmax())))

For this, I get this error:

IndexError: invalid index to scalar variable.

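If no row has a column value exactly equal to min(), max() or mean(), I should take the value closest to it for that case.

Thanks in advance, and let me know if I need to explain more. :)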

Tags: python, pandas, dataframe

Solution


I copied the first print statement without changing anything, and it works fine for me:

The person who ends up winning the match usually drives 247.30 , swims 1050.00 meters, has a walked 782.4 meters.

When you use .format() and get IndexError: tuple index out of range, it means you called it with fewer arguments than the format string has placeholders.
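To illustrate the point about str.format, here is a minimal sketch. The template string and the numbers are made up for the illustration (they are not the asker's exact code), and the exact wording of the error message differs between Python versions, so only the exception type is checked.

```python
# A template with two positional placeholders (hypothetical example string).
template = "drives {:.2f}, swims {:.2f} meters"

# Passing only one argument leaves the second placeholder unfilled,
# so str.format raises IndexError.
try:
    template.format(247.3)
    raised = False
except IndexError:
    raised = True

# Supplying one argument per placeholder works as expected.
message = template.format(247.3, 1050)

print(raised)
print(message)
```

The print statement in the question has three placeholders and three arguments, which is consistent with it running without this error.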


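For the second problem, you need to filter your DataFrame. There are several ways to do this; boolean masks are a common one. Note that the winner mask must compare against the maximum value, .max(), and not against .idxmax(), which returns the index label of the row holding the maximum.

>> drove_is_0 = df["drove"] == df['drove'].min()
>> is_winner = df['winPerc'] == df['winPerc'].max()

Then apply the filters to your DataFrame:

>> filtered = df[drove_is_0 & is_winner]

Finally, print the result:

>> print("{} number of confident Players won without driving".format(len(filtered)))
0 number of confident Players won without driving

With the sample data the result is 0, because the only player who did not drive (Id 4) has winPerc = 0.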
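The filtering steps above can be tried end to end with the following self-contained sketch. It rebuilds the sample table from the question by hand (the question never shows how df is created, so this construction is an assumption), and it builds the winner mask from .max(), the largest value, instead of .idxmax(), which would return a row label.

```python
import pandas as pd

# Sample data typed in from the table in the question.
df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "drove": [247.3, 37.65, 93.73, 95.88, 0, 128.1],
    "swimmed": [1050, 1072, 1404, 1069, 1034, 1000],
    "walked": [782.4, 119.6, 3248, 21.49, 640.8, 1016],
    "winPerc": [1, 0.04, 1, 0.1146, 0, 0.9368],
})

# One boolean mask per condition; each is a Series of True/False per row.
drove_is_0 = df["drove"] == 0
is_winner = df["winPerc"] == df["winPerc"].max()

# Combine the masks with & (element-wise AND) and keep the matching rows.
winners_without_driving = df[drove_is_0 & is_winner]
count = len(winners_without_driving)

print("{} number of confident Players won without driving".format(count))
```

With this sample the count is 0: the only player with drove = 0 (Id 4) has winPerc = 0, so no row satisfies both conditions.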

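The OP clarified that the first problem was not about the IndexError that was raised but about filtering. They want to filter df to the rows where the winPerc column has the value 1 and then compute the mean of several other columns. For consistency, I will use a boolean mask as shown above: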

>> is_winner = df["winPerc"] == 1

>> mean_driven_winner = df[is_winner]["drove"].mean()
>> mean_swimmed_winner = df[is_winner]["swimmed"].mean()
>> mean_walked_winner = df[is_winner]["walked"].mean()

>> print("The person who ends up winning the match usually drives {:.2f} , swims {:.2f} meters, has a walked {} meters".format(
    mean_driven_winner, mean_swimmed_winner, mean_walked_winner)
)

The person who ends up winning the match usually drives 170.52 , swims 1227.00 meters, has a walked 2015.2 meters
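As a more compact variant of the same idea, the three means can be computed in one step by selecting the rows and columns together with .loc. This is only a sketch: the DataFrame is rebuilt by hand from the question's table, and the name winner_means is introduced here for illustration.

```python
import pandas as pd

# Sample data typed in from the table in the question.
df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "drove": [247.3, 37.65, 93.73, 95.88, 0, 128.1],
    "swimmed": [1050, 1072, 1404, 1069, 1034, 1000],
    "walked": [782.4, 119.6, 3248, 21.49, 640.8, 1016],
    "winPerc": [1, 0.04, 1, 0.1146, 0, 0.9368],
})

# Boolean mask selecting the rows of players who finished first.
is_winner = df["winPerc"] == 1

# Select winner rows and the three distance columns, then average each column.
# The result is a Series indexed by column name.
winner_means = df.loc[is_winner, ["drove", "swimmed", "walked"]].mean()

print(
    "The person who ends up winning the match usually drives {:.2f} , "
    "swims {:.2f} meters, has a walked {} meters".format(
        winner_means["drove"], winner_means["swimmed"], winner_means["walked"]
    )
)
```

Using .loc with a mask and a column list avoids filtering the DataFrame three separate times and keeps the averages together in one object.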
