Python KeyError when selecting a single row from a pandas DataFrame in Jupyter Notebook

Question

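I've managed to solve many problems with the help of StackOverflow, but this is the first time I've hit one that I can't find anywhere else and can't solve on my own...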

I'm working in a Jupyter notebook with a pandas DataFrame that contains text reviews and scores of Amazon products. Here is my code:

import pandas as pd
data = pd.read_csv("AmazonSampleForStudentOffice.csv")
reviews = data[['reviewText', 'score', 'len_text']]
reviews.head(5)

This is the result:

reviewText  score   len_text
0   Wow! Do I consider myself lucky! I got this CX...   5   274
1   The Optima 45 Electric Stapler has a sleek mod...   5   108
2   This tape does just what it's supposed to.And ...   5   18
3   It is rare that I look for a more expensive pr...   5   104
4   I know of no printer that makes such great pri...   5   34

Slicing the DataFrame works fine:

reviews[0:2]


reviewText  score   len_text
0   Wow! Do I consider myself lucky! I got this CX...   5   274
1   The Optima 45 Electric Stapler has a sleek mod...   5   108

However, if I try to select a single row, Jupyter raises a KeyError on the selected index:

reviews[0]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\users\robin\appdata\local\programs\python\python38-32\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-7-a635d1333a53> in <module>
----> 1 reviews[0]

c:\users\robin\appdata\local\programs\python\python38-32\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2993             if self.columns.nlevels > 1:
   2994                 return self._getitem_multilevel(key)
-> 2995             indexer = self.columns.get_loc(key)
   2996             if is_integer(indexer):
   2997                 indexer = [indexer]

c:\users\robin\appdata\local\programs\python\python38-32\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

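Does anyone know what is causing this? I find it strange that slicing works fine but selecting a single index raises an error...

As you can see, I tried different ways of selecting certain rows from the DataFrame, and they all work fine. I also tried reinstalling pandas and Jupyter Notebook, but it still throws the error...

Thanks in advance!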

Tags: python-3.x, pandas, dataframe, jupyter-notebook

Answer


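The index operator on its own, as in reviews[...], selects rows only when you give it a slice such as reviews[:2] (the 0 in your reviews[0:2] is redundant) or a boolean expression. With anything else it selects columns, as in reviews['score']. That is why reviews[0] fails: pandas looks for a column labelled 0, which does not exist, and the traceback shows the lookup going through self.columns.get_loc(key). If you want to index by position, you need the .iloc attribute, as in reviews.iloc[0, :], which gives you only the first row, but all columns.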
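The behaviour described above can be reproduced with a minimal sketch. The small DataFrame here is a made-up stand-in for the CSV in the question (only the column names are taken from it), so treat the rows as illustrative:

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question (made-up rows).
reviews = pd.DataFrame(
    {
        "reviewText": ["Great stapler", "Tape works", "Prints well"],
        "score": [5, 4, 5],
        "len_text": [13, 10, 11],
    }
)

first_two = reviews[0:2]                    # slice: rows by position
score_col = reviews["score"]                # label: a column
top_rated = reviews[reviews["score"] == 5]  # boolean mask: rows

raised_key_error = False
try:
    reviews[0]  # bare integer: treated as a column label, not a row position
except KeyError:
    raised_key_error = True

first_row = reviews.iloc[0]  # positional row selection, returns a Series
```

Here reviews.iloc[0] is shorthand for reviews.iloc[0, :]; both return the first row as a Series indexed by column name.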

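If you want to learn pandas indexing, focus on the .loc and .iloc attributes, which both work in two dimensions. The index operator on its own can only select along one dimension and comes with quite a few restrictions.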
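As a rough illustration of the two-dimensional form, the sketch below puts .loc (label-based) and .iloc (position-based) side by side. The DataFrame and its string index labels are assumptions made for the example and are not from the question's data:

```python
import pandas as pd

# Hypothetical data; the string index labels are illustrative only.
df = pd.DataFrame(
    {"score": [5, 4, 3], "len_text": [274, 108, 18]},
    index=["a", "b", "c"],
)

row_by_position = df.iloc[0, :]        # first row, all columns
row_by_label = df.loc["a", :]          # row labelled "a", all columns

cell_by_position = df.iloc[1, 0]       # second row, first column
cell_by_label = df.loc["b", "score"]   # the same cell, addressed by labels

# Positional slices exclude the end; label slices include it.
block_by_position = df.iloc[0:2, 0:1]
block_by_label = df.loc["a":"b", ["score"]]
```

Both attributes take a row selector first and a column selector second, which is what the index operator on its own cannot do.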

