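python - KeyError, pandas doing strange things
Problem description
I have been searching for this error for several days without making any progress. It looks as if pandas iterates over the DataFrame once and then starts over again. During one particular iteration over the keys, however, a KeyError is raised. Is this an interpreter problem, or is there a bug in my code? Any help would be appreciated.
More background:
Subset of features_df: https://www.transfernow.net/3yS7pE092020
Argument passed to the function: an np.array of IDs (dtype=int) that will be looked up in the dataset
Here is the code: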
def extract_features(id_arr):
    features_df = pd.read_csv(r'D:\fma_metadata\features.csv', index_col=0, na_values=['NA'], encoding='utf-8')
    features = np.array(features_df.columns)
    id_arr = np.asarray(id_arr, dtype=int)
    for id in id_arr:
        row_features = []
        for key, value in features_df.iteritems():
            number = float(features_df[key][id])
            row_features.append(round(number, 6))
        row_features = np.asarray(row_features)
        features = np.vstack((features, row_features))
    features = np.delete(features, 0, 0)
    return features

random_id = get_random_id()
extract_features(random_id)
Error:
Traceback (most recent call last):
  File "C:/Users/*****/PycharmProjects/****/emotions-nn/deep-learning/input.py", line 65, in <module>
    print(extract_features(random_id))
  File "C:/Users/*****/PycharmProjects/****/emotions-nn/deep-learning/input.py", line 51, in extract_features
    number = float(features_df[key][id])
  File "C:\Users\*****\anaconda3\envs\tensorflow\lib\site-packages\pandas\core\series.py", line 882, in __getitem__
    return self._get_value(key)
  File "C:\Users\*****\anaconda3\envs\tensorflow\lib\site-packages\pandas\core\series.py", line 991, in _get_value
    loc = self.index.get_loc(label)
  File "C:\Users\*****\anaconda3\envs\tensorflow\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc
    raise KeyError(key) from err
KeyError: 800
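Solution
My guess is that the cause is your multi-level index: the file appears to have several header rows, and they are not being read as headers.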
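To illustrate that guess, here is a minimal sketch using a made-up, in-memory CSV (the column names and values in `CSV_TEXT` are hypothetical, chosen only to mimic a file with three header rows followed by an index-name row). It assumes the real file has a similar layout, which I cannot confirm. Read with the default single header row, the extra header rows end up in the data and the index labels become strings, so the integer 800 is not found. Declaring all three header rows gives an integer index.

```python
import io

import pandas as pd

# Hypothetical miniature of the file: three header rows, then a row
# holding the index name, then the data rows.
CSV_TEXT = """feature,chroma,chroma,mfcc
statistics,mean,std,mean
number,01,01,01
track_id,,,
2,0.1234567,0.2,0.3
800,0.4,0.5,0.6
"""

# Default header handling: only the first row becomes the header, so the
# remaining header rows land in the data and the index holds strings.
naive = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)
print(list(naive.index))
print(800 in naive.index)

# Declaring all three header rows keeps them out of the data, and the
# index is parsed as integers.
fixed = pd.read_csv(io.StringIO(CSV_TEXT), header=[0, 1, 2], index_col=0)
print(list(fixed.index))
print(800 in fixed.index)
print(fixed.loc[[800]].round(6).to_numpy())
```

Printing `df.index` right after loading your own file is a quick way to check whether this is what is happening.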
import numpy as np
import pandas as pd

# ids can be a list of integers too
def extract(ids: np.ndarray):
    # assuming the first 3 rows are "headers"
    df = pd.read_csv(r"C:\Users\danie\Downloads\features - subset.csv", header=[0, 1, 2], index_col=0, na_values=['NA'])
    # you can set a breakpoint here to see the current column order
    # print(df.columns)
    # and reorganize the way you want it

    # this is basically what you're trying to do if I'm not mistaken
    return df.loc[ids].round(6).to_numpy()

    # alternatively, if there's a column order (`order` is a list of column labels you define yourself):
    # return df.loc[ids, order].round(6).to_numpy()
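One more thing to watch for: `df.loc[ids]` raises a KeyError if any requested ID is missing from the index, which can easily happen when working with a subset of the data. Below is a rough sketch of one way to guard against that. The helper name `extract_present` and the small example frame are made up for illustration and are not part of your code.

```python
import numpy as np
import pandas as pd


def extract_present(df, ids):
    """Return rounded feature rows for the IDs found in df.index,
    plus an array of the IDs that were not found."""
    ids = np.asarray(ids, dtype=int)
    # Boolean mask: True where the requested ID exists in the index.
    mask = np.isin(ids, df.index.to_numpy())
    found, missing = ids[mask], ids[~mask]
    return df.loc[found].round(6).to_numpy(), missing


# Hypothetical stand-in for the features table.
example_df = pd.DataFrame(
    {"a": [0.1234567, 0.5], "b": [1.0, 2.0]},
    index=[2, 800],
)

features, missing = extract_present(example_df, [800, 5, 2])
print(features)
print(missing)
```

The rows come back in the order the IDs were requested, and the missing IDs are returned separately so they can be logged or skipped.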