Trying to pass columns to a function, getting KeyError (Pandas)

Problem description

I have this block of code:

def euc_dist(x,y):
    return ((x[0] - y[0])**2 +(x[1] - y[1])**2 )**(1/2)

def dist(s1,s2):    
    distances = [euc_dist(s1[i],s2[i]) for i in range(s1.shape[0])]
    return pd.Series(distances)

distances_df = tracking_data.loc[:,tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(tuple, axis = 1)].apply(dist, args = (tracking_data["ball_point"]))
tracking_data["closest"] = distances_df.idxmin(axis = 1).apply(lambda x: str(x)[:-6])

Running it produces this error:

KeyError: "None of [Index([
((-22.06, -8.32), (-0.12, 21.38), (-1.49, -9.62), (-0.26, -28.52), (-19.32, 16.22), (-15.11, 0.43), (-7.69, 32.87), (0.45, -0.25), (-9.88, 7.67), (-47.29, -0.14), (-18.1, -25.42), (0.46, -19.84), (7.58, 4.82), (15.33, -23.38), (21.08, 6.57), (14.98, 20.7), (8.14, -4.27), (21.36, -9.06), (46.92, 0.01), (0.29, 9.88), (0.67, 22.24), (-0.06, -9.07)),\n
((-22.06, -8.32), (-0.07, 21.39), (-1.47, -9.64), (-0.23, -28.51), (-19.31, 16.22), (-15.1, 0.42), (-7.68, 32.88), (0.46, -0.26), (-9.87, 7.7), (-47.3, -0.15), (-18.09, -25.41), (0.43, -19.83), (7.5600000000000005, 4.83), (15.31, -23.38), (21.06, 6.57), (14.97, 20.72), (8.12, -4.28), (21.33, -9.04), (46.91, 0.02), (0.25, 9.85), (0.67, 22.24), (-0.11, -9.05)),\n
((-22.06, -8.33), (-0.03, 21.39), (-1.43, -9.67), (-0.2, -28.5), (-19.29, 16.24), (-15.09, 0.42), (-7.66, 32.9), (0.47000000000000003, -0.27), (-9.85, 7.72), (-47.31, -0.16), (-18.08, -25.4), (0.39, -19.83), (7.55, 4.85), (15.28, -23.38), (21.03, 6.57), (14.95, 20.74), (8.09, -4.28), (21.28, -9.02), (46.91, 0.03), (0.2, 9.82), (0.66, 22.24), (-0.16, -9.02)),\n
((-22.06, -8.34), (0.01, 21.39), (-1.3900000000000001, -9.7), (-0.16, -28.5), (-19.28, 16.25), (-15.08, 0.42), (-7.64, 32.92), (0.49, -0.27), (-9.84, 7.75), (-47.32, -0.16), (-18.07, -25.4), (0.3500000000000...

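Please refer to this notebook to see my table, since it is too large to include here. The work related to this question is at the bottom. https://github.com/piercepatrick/Articles_EDA/blob/main/nashSCProject.ipynb

I have been trying to solve this in a previous question: Pandas Dataframe: Find the column with most coordinates point to another columns coordinate point

I have a hunch that the problem lies in the source data, since I originally loaded it from JSON?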

Tags: python, pandas

Solution

First, reset the index:

tracking_data = tracking_data.reset_index()
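
Resetting the index matters because dist looks rows up with s1[i] for i in range(s1.shape[0]), and on a Series with an integer index that lookup goes by label, not by position. The sketch below illustrates this with made-up data; the index labels 5, 6, 7 are assumptions for illustration and are not taken from the notebook. It uses drop=True only to discard the old labels, whereas the plain reset_index() above keeps them as an extra column.

```python
import pandas as pd

# Hypothetical Series of (x, y) points whose integer index does not start at 0.
s = pd.Series([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)], index=[5, 6, 7])

try:
    s[0]  # label-based lookup: there is no label 0 in this index
    lookup_failed = False
except KeyError:
    lookup_failed = True

# After resetting, the labels run 0..n-1, so s_reset[i] is the i-th row.
s_reset = s.reset_index(drop=True)
first = s_reset[0]
```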

Then change

distances_df = tracking_data.loc[:,tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(tuple, axis = 1)].apply(dist, args = (tracking_data["ball_point"]))

to

distances_df = tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(dist, args = (tracking_data["ball_point"],))
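
The corrected line differs from the original in two ways. It drops the .loc[:, ....apply(tuple, axis = 1)] wrapper, which builds one tuple of coordinates per row and then hands those tuples to .loc as column labels; that would explain why the KeyError lists coordinate tuples. It also adds a trailing comma inside args, since (x) is just x in parentheses while (x,) is a one-element tuple. Below is a minimal sketch of the corrected approach on a small stand-in table; the two-row, two-player data and the player_cols name are hypothetical, chosen only to keep the example short.

```python
import pandas as pd

def euc_dist(x, y):
    # Euclidean distance between two (x, y) points.
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** (1 / 2)

def dist(s1, s2):
    # Row-by-row distance between two Series of points.
    distances = [euc_dist(s1[i], s2[i]) for i in range(s1.shape[0])]
    return pd.Series(distances)

# Hypothetical stand-in for tracking_data: two frames, two players.
tracking_data = pd.DataFrame({
    "away_player1_point": [(0.0, 0.0), (5.0, 5.0)],
    "home_player1_point": [(3.0, 4.0), (1.0, 1.0)],
    "ball_point": [(0.0, 1.0), (1.0, 2.0)],
})

player_cols = ["away_player1_point", "home_player1_point"]

# Select the player columns directly; args is a one-element tuple (note the comma).
distances_df = tracking_data[player_cols].apply(
    dist, args=(tracking_data["ball_point"],)
)

# Column with the smallest distance in each row, minus the "_point" suffix.
tracking_data["closest"] = distances_df.idxmin(axis=1).apply(lambda x: str(x)[:-6])
```

With this stand-in data, distances_df has one column per player and one row per frame, and the closest column names the player nearest the ball in each frame.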
