python - Pandas - get values from a list of tuples and map them to values in new columns based on a condition
Problem description
I have this dataframe, df_match:
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 match_id 680 non-null int64
1 league_id 680 non-null object
2 from_home_player_1_to_home_player_11 680 non-null object
Each row of the column from_home_player_1_to_home_player_11 holds a list of tuples, as shown by df_match.sample(1):
...
None match_id league_id from_home_player_1_to_home_player_11
167 243221 26 [(79066, GKP), (82634, MID), (79578, FWD), (34765, DEF), (23476, WING), (32456, MID),(55897, DEF),(45675, MID),(32345, FWD),(45765,FWD),(12354, WING)]
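Goal
Now I want to set X/Y coordinates for every player on the pitch (only the X coordinate is used here to keep things simple), for each match (row).
Every player in from_home_player_1_to_home_player_11 needs an X value, so I need a list of newly created X columns, like this:
X_columns = ["home_player_X1", "home_player_X2", "home_player_X3","home_player_X4", "home_player_X5",
"home_player_X6", "home_player_X7", "home_player_X8", "home_player_X9","home_player_X10", "home_player_X11"]
Finally, each position has an arbitrary set of X values. (When there is more than one option, any of them may be used, chosen at random.)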
GKP = 1
DEF = [3,4]
WING = [2,5]
MID = [6,7,8]
FWD = [9,10,11]
My goal is to map each player's position to an X coordinate on every row, ending up with:
None match_id league_id from_home_player_1_to_home_player_11 /
167 243221 26 [(79066, GKP), (82634, MID), (79578, FWD), (34765, DEF), (23476, WING), (32456, MID),(55897, DEF),(45675, MID),(32345, FWD),(45765,FWD),(12354, WING)] /
home_player_X1 home_player_X2 home_player_X3 home_player_X4
1 7 10 3
home_player_X5 home_player_X6 home_player_X7 home_player_X8
5 7 4 7
home_player_X9 home_player_X10 home_player_X11
10 10 2
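How can I do this mapping in pandas, based on the position values?
I started by thinking about iterating over the dataframe like this:
for index, value in df_match.iterrows():
    pos = value.from_home_player_1_to_home_player_11[1][1]
    print(index, value)
But I did not get very far.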
Solution
Data similar to yours:
import pandas as pd
df_match = pd.DataFrame( { "match_id" : [243221, 234251], 'league_id' : [26, 11],
'from_home_player_1_to_home_player_11' : [ [(79066, 'GKP'), (82634, 'MID'), (79578, 'FWD'), (34765, 'DEF'), (23476, 'WING'),
(32456, 'MID'), (55897, 'DEF'), (45675, 'MID'), (32345, 'FWD'), (45765,'FWD'),
(12354, 'WING')],
[(14825, 'GKP'), (82634, 'MID'), (79578, 'FWD'), (34765, 'DEF'), (23476, 'WING'),
(32456, 'MID'), (55897, 'MID'), (45675, 'MID'), (32345, 'DEF'), (45765,'FWD'),
(12354, 'WING')],
] }, index=[167, 1999])
Build a position mapping; note that every value is a list:
pmap = {'GKP' : [1], 'DEF': [3,4], 'WING' : [2,5], 'MID' : [6,7,8], 'FWD' : [9,10,11] }
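To see what the lookup does for a single row before applying it to the whole column, here is a minimal sketch. The helper name positions_to_x and the seeded random.Random instance are assumptions made for illustration only; they are not part of the original answer.

```python
import random

# Position -> candidate X values. Every value is a list so that
# random.choice always receives a sequence.
pmap = {'GKP': [1], 'DEF': [3, 4], 'WING': [2, 5], 'MID': [6, 7, 8], 'FWD': [9, 10, 11]}

def positions_to_x(players, mapping, rng=random):
    """Return one X value per (player_id, position) tuple.

    Unknown positions fall back to the one-element list [-1],
    so they produce -1 instead of raising an error.
    """
    return [rng.choice(mapping.get(pos, [-1])) for _player_id, pos in players]

# A shortened row, in the same shape as the column in the question.
row = [(79066, 'GKP'), (82634, 'MID'), (79578, 'FWD'), (34765, 'DEF'), (23476, 'WING')]

# Seeded only so that the demo is repeatable (hypothetical choice).
xs = positions_to_x(row, pmap, random.Random(0))
print(xs)
```

Each element of the result lies in the candidate list for that player's position, and the goalkeeper always gets 1 because its list has a single entry.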
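Look each position up in the dictionary, pick one of its options at random, then expand the result into separate columns. Rename the columns: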
import random
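tmp = df_match['from_home_player_1_to_home_player_11'].apply(lambda x: [random.choice(pmap.get(pos, [-1])) for n, pos in x]).apply(pd.Series)
tmp.columns = [f"home_player_X{i}" for i in range(1,12)]
Note that the default is the list [-1], so -1 is placed in that position if the key is not found (random.choice needs a sequence, not a bare -1). Then pd.concat() them together: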
df2 = pd.concat([df_match, tmp], axis=1)
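As an alternative to .apply(pd.Series), which can be slow on larger frames, the same columns can be built from a plain list of lists. This is a sketch under stated assumptions: every row holds exactly 11 (player_id, position) tuples, and the function name add_x_columns and its parameters are hypothetical, not something from pandas or from the answer above.

```python
import random
import pandas as pd

pmap = {'GKP': [1], 'DEF': [3, 4], 'WING': [2, 5], 'MID': [6, 7, 8], 'FWD': [9, 10, 11]}

def add_x_columns(df, col, mapping, prefix="home_player_X", n_players=11, rng=random):
    """Return a copy of df with one X column per player appended.

    Assumes each cell of df[col] is a list of n_players (id, position) tuples.
    Unknown positions get -1.
    """
    rows = [
        [rng.choice(mapping.get(pos, [-1])) for _player_id, pos in players]
        for players in df[col]
    ]
    # Reuse the original index so the new columns line up row by row.
    x = pd.DataFrame(
        rows,
        index=df.index,
        columns=[f"{prefix}{i}" for i in range(1, n_players + 1)],
    )
    return pd.concat([df, x], axis=1)

positions = ['GKP', 'MID', 'FWD', 'DEF', 'WING', 'MID', 'DEF', 'MID', 'FWD', 'FWD', 'WING']
demo = pd.DataFrame(
    {
        "match_id": [243221, 234251],
        "league_id": [26, 11],
        # Player ids here are made up for the demo.
        "from_home_player_1_to_home_player_11": [
            [(1000 + i, p) for i, p in enumerate(positions)],
            [(2000 + i, p) for i, p in enumerate(positions)],
        ],
    },
    index=[167, 1999],
)

out = add_x_columns(demo, "from_home_player_1_to_home_player_11", pmap)
print(out.filter(like="home_player_X"))
```

Passing a seeded random.Random instance as rng makes the assignment repeatable, which helps when comparing runs.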