python - Pandas ValueError: can only convert an array of size 1 to a Python scalar
Problem description
Using the following code:
#Bring in the 'player matches' dataframe
df_pm = sql('select * from PlayerMatchesDetail', c).drop('TableIndex', axis=1)
df_pm['GoalInv'] = df_pm['Goals']+df_pm['GoalAssists']
df_pm.head(3) # THIS PRINTS FINE (see below)
# We need to associate a match ID to each row here, so that we can groupby properly.
def MatchIDLookup(gw, ht, at):
    '''
    Takes a gameweek, hometeam, and awayteam,
    and returns the matchID of the game
    '''
    return int(df_fixtures.loc[(df_fixtures['GameWeek']==gw)
                               &(((df_fixtures['HomeTeam']==ht)
                                  &(df_fixtures['AwayTeam']==at))
                                 |((df_fixtures['HomeTeam']==at)
                                   &(df_fixtures['AwayTeam']==ht))),'MatchID'].item())
#Apply the function to insert the matchID
df_pm['MatchID'] = df_pm.apply(lambda x: MatchIDLookup(x['GameWeek'],
                                                       x['ForTeam'],
                                                       x['AgainstTeam']), axis=1)
#Create a multi-index
df_pm.set_index(['MatchID','Player'], inplace=True)
#We now create columns in the player match dataframe, describing their expected goals, assists, and goal involvement.
#Goals
df_pm['XG'] = df.groupby(['MatchID','Player']).sum()[['XG']]
#Assists
df_pm['XA'] = df.groupby(['MatchID','AssistedBy']).sum()[['XG']]
#Fill NAs with 0s
df_pm.fillna(0, inplace=True)
#Calculate goal Involvement
df_pm['XGI'] = df_pm['XG'] + df_pm['XA']
# Let's see how player gameweeks are distributed...
plt.figure(figsize=(10,3))
plt.hist(df_pm['XG'], label='XG', bins=30)
plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XG in each match')
plt.figure(figsize=(10,3))
plt.hist(df_pm['XA'], label='XGA', bins=30, color=color_list[1])
plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XA in each match')
plt.figure(figsize=(10,3))
plt.hist(df_pm['XGI'], label='XGI', bins=30, color=color_list[2])
plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XGI in each match');
plt.show()
I get the following traceback:
Traceback (most recent call last):
File "expected_goals.py", line 365, in <module>
x['AgainstTeam']), axis=1)
File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/frame.py", line 6878, in apply
return op.get_result()
File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/apply.py", line 186, in get_result
return self.apply_standard()
File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/apply.py", line 296, in apply_standard
values, self.f, axis=self.axis, dummy=dummy, labels=labels
File "pandas/_libs/reduction.pyx", line 620, in pandas._libs.reduction.compute_reduction
File "pandas/_libs/reduction.pyx", line 128, in pandas._libs.reduction.Reducer.get_result
File "expected_goals.py", line 365, in <lambda>
x['AgainstTeam']), axis=1)
File "expected_goals.py", line 360, in MatchIDLookup
&(df_fixtures['AwayTeam']==ht))),'MatchID'].item())
File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/base.py", line 652, in item
return self.values.item()
ValueError: can only convert an array of size 1 to a Python scalar
Note: df_fixtures prints fine:
                 MatchID  GameWeek        Date        HomeTeam                 AwayTeam
FixturesBasicID
1                  46605         1  2019-08-09       Liverpool             Norwich City
2                  46606         1  2019-08-10     Bournemouth         Sheffield United
3                  46607         1  2019-08-10         Burnley              Southampton
4                  46608         1  2019-08-10  Crystal Palace                  Everton
5                  46609         1  2019-08-11  Leicester City  Wolverhampton Wanderers
Also, before MatchIDLookup() is applied, df_pm.head(3) prints fine too:
                                Player  GameWeek  Minutes    ForTeam  ...  CreatedCentre  CreatedLeft  CreatedRight  GoalInv
PlayerMatchesDetailID                                                 ...
1                              Alisson         1       90  Liverpool  ...              0            0             0        0
2                      Virgil van Dijk         1       90  Liverpool  ...              0            0             0        1
3                         Joseph Gomez         1       90  Liverpool  ...              0            0             0        0
How can I fix this?
Solution
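Without having run it, I believe the problem is in the value returned by MatchIDLookup(). The traceback ends inside .item(), which only works when the selection holds exactly one element, so it raises for any row whose lookup matches no fixture (or more than one). Wrapping that call in int() inside the function does not help. Instead, return the value without converting it to int, then add this line below: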
df_pm['MatchID'] = df_pm['MatchID'].astype(int)
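The traceback in the question ends inside `.item()`, so it may help to see when that call raises. The following is a minimal sketch on a made-up Series (not the real `df_fixtures`); the helper `try_item` is hypothetical and only there to capture the error.

```python
import pandas as pd

# Toy data (hypothetical): a few MatchID values
match_ids = pd.Series([46605, 46606, 46607])

one = match_ids[match_ids == 46605]    # exactly one element
none = match_ids[match_ids == 99999]   # empty selection
many = match_ids[match_ids > 46605]    # two elements

def try_item(s):
    """Return s.item(), or a short error string if the size is not 1."""
    try:
        return s.item()
    except ValueError as exc:
        return f"ValueError: {exc}"

result_one = try_item(one)
result_none = try_item(none)
result_many = try_item(many)

print(result_one)
print(result_none)
print(result_many)
```

Only the first selection yields a scalar. The empty and the two-element selections both raise, which suggests that at least one row of `df_pm` has no unique match in `df_fixtures`.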
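PS: I also generally advise against converting any kind of ID to an integer; keep it as a string instead. If an ID starts with zeros (0654 or 0012), converting it to an integer loses the 4-digit format.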
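As a small illustration of the point about leading zeros, here is a sketch with made-up zero-padded IDs (the match IDs in the question, such as 46605, are not padded, so this is only an assumption about what other ID columns might look like).

```python
import pandas as pd

# Hypothetical zero-padded IDs stored as strings
ids = pd.Series(["0654", "0012", "1203"])

as_int = ids.astype(int)             # the padding is dropped here
round_trip = as_int.astype(str)      # converting back does not restore it
restored = round_trip.str.zfill(4)   # recoverable only if the width is known

print(as_int.tolist())
print(round_trip.tolist())
print(restored.tolist())
```

The padding can be put back with `str.zfill` only when the original width is known in advance, which is one reason to leave such IDs as strings.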
Edit:
def MatchIDLookup(gw, ht, at):
    res = df_fixtures.loc[(df_fixtures['GameWeek']==gw)
                          &(((df_fixtures['HomeTeam']==ht)
                             &(df_fixtures['AwayTeam']==at))
                            |((df_fixtures['HomeTeam']==at)
                              &(df_fixtures['AwayTeam']==ht))),'MatchID']
    return res.item() if len(res) > 0 else 'not found'
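To see the edited function at work, here is a minimal, self-contained sketch. The two-row `df_fixtures` and three-row `df_pm` are made-up stand-ins for the question's tables; "Player B", "Player C" and "Unknown FC" are hypothetical and exist only to exercise the reversed and the missing case. The name `match_id_lookup` is mine, and the boolean masks are split into named variables purely for readability.

```python
import pandas as pd

# Toy fixtures table (values taken from the question's printout)
df_fixtures = pd.DataFrame({
    "MatchID": [46605, 46606],
    "GameWeek": [1, 1],
    "HomeTeam": ["Liverpool", "Bournemouth"],
    "AwayTeam": ["Norwich City", "Sheffield United"],
})

# Toy player-match table; the last row has no matching fixture (hypothetical)
df_pm = pd.DataFrame({
    "Player": ["Alisson", "Player B", "Player C"],
    "GameWeek": [1, 1, 1],
    "ForTeam": ["Liverpool", "Norwich City", "Unknown FC"],
    "AgainstTeam": ["Norwich City", "Liverpool", "Liverpool"],
})

def match_id_lookup(gw, ht, at):
    """Return the MatchID for a gameweek and team pair, or 'not found'."""
    same_week = df_fixtures["GameWeek"] == gw
    home_away = (df_fixtures["HomeTeam"] == ht) & (df_fixtures["AwayTeam"] == at)
    away_home = (df_fixtures["HomeTeam"] == at) & (df_fixtures["AwayTeam"] == ht)
    res = df_fixtures.loc[same_week & (home_away | away_home), "MatchID"]
    return res.item() if len(res) > 0 else "not found"

df_pm["MatchID"] = df_pm.apply(
    lambda x: match_id_lookup(x["GameWeek"], x["ForTeam"], x["AgainstTeam"]),
    axis=1,
)

# Rows that could not be matched, worth inspecting before any further steps
unmatched = df_pm[df_pm["MatchID"] == "not found"]

print(df_pm[["Player", "MatchID"]])
print(unmatched["Player"].tolist())
```

Two caveats: a lookup that matches more than one fixture would still raise in `.item()`, and rows holding 'not found' would need to be filtered out or fixed before calling `astype(int)` on the column.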