Pandas KeyError when using ';' as the delimiter

Problem description

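I'm new to Python and not sure why this is happening. I'm trying to import a small CSV whose columns are separated by ";" and then plot the Close column. Here is my code: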

import pandas as pd
import matplotlib.pyplot as pp
import numpy as np
import csv

eur5m = pd.read_csv("eurusd-5m.csv", delimiter=';', parse_dates=['Date'])

eur5m = eur5m.sort_values(by="Date")

eur5m['Close'].plot(figsize=(16,12))

When I run this, I get the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.5/site-packages/pandas/core/indexes/base.py in 
get_loc(self, key, method, tolerance)
   2524             try:
-> 2525                 return self._engine.get_loc(key)
   2526             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in 
pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in 
pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Close'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-32-5789795df4bb> in <module>()
      8 eur5m = eur5m.sort_values(by="Date")
      9 
---> 10 eur5m['Close'].plot(figsize=(16,12))

~/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in 
__getitem__(self, key)
   2137             return self._getitem_multilevel(key)
   2138         else:
-> 2139             return self._getitem_column(key)
   2140 
   2141     def _getitem_column(self, key):

~/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in 
_getitem_column(self, key)
   2144         # get column
   2145         if self.columns.is_unique:
-> 2146             return self._get_item_cache(key)
   2147 
   2148         # duplicate columns & possible reduce dimensionality

~/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py in 
_get_item_cache(self, item)
   1840         res = cache.get(item)
   1841         if res is None:
-> 1842             values = self._data.get(item)
   1843             res = self._box_item_values(item, values)
   1844             cache[item] = res

~/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py in get(self, 
item, fastpath)
   3841 
   3842             if not isna(item):
-> 3843                 loc = self.items.get_loc(item)
   3844             else:
   3845                 indexer = np.arange(len(self.items)) 
[isna(self.items)]

~/anaconda3/lib/python3.5/site-packages/pandas/core/indexes/base.py in 
get_loc(self, key, method, tolerance)
   2525                 return self._engine.get_loc(key)
   2526             except KeyError:
-> 2527                 return 
self._engine.get_loc(self._maybe_cast_indexer(key))
   2528 
   2529         indexer = self.get_indexer([key], method=method, 
tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in 
pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in 
pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Close'

Here is a sample of my CSV:

Date; Time; Open; High; Low; Close; Volume
07/08/2018;20:15:00;1.16006;1.16012;1.15995;1.16;300
07/08/2018;20:20:00;1.16001;1.16023;1.16;1.16016;337
07/08/2018;20:25:00;1.16014;1.16023;1.16011;1.16012;225
07/08/2018;20:30:00;1.16012;1.16031;1.15997;1.16029;333
07/08/2018;20:35:00;1.16027;1.16095;1.16024;1.16082;509

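When I sort by date, the rows do not come out in the right order. It only seems to work when I use eur5m.iloc[::-1]. Is that a hint about what is going wrong?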

Tags: python, python-3.x, pandas, matplotlib

Solution


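I think the problem is the space after each ";" in the header row. pandas keeps that space as part of the column name, so the column is called ' Close' rather than 'Close', and eur5m['Close'] raises a KeyError. One possible solution is the parameter skipinitialspace=True: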

eur5m = pd.read_csv("eurusd-5m.csv", 
                    delimiter=';', 
                    parse_dates=['Date'], 
                    skipinitialspace=True)
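
To check that the header spaces really are the cause, it can help to print the column names with and without skipinitialspace. The sketch below is only an illustration: it reads two rows copied from the sample in the question out of an in-memory string (io.StringIO) instead of the real eurusd-5m.csv, and the variable names are made up for the example.

```python
import io

import pandas as pd

# Two rows copied from the sample in the question.
# Note the space after each ';' in the header line only.
sample = (
    "Date; Time; Open; High; Low; Close; Volume\n"
    "07/08/2018;20:15:00;1.16006;1.16012;1.15995;1.16;300\n"
    "07/08/2018;20:20:00;1.16001;1.16023;1.16;1.16016;337\n"
)

# Without skipinitialspace, every name after the first keeps its leading space.
raw = pd.read_csv(io.StringIO(sample), delimiter=";")
raw_columns = list(raw.columns)
print(raw_columns)

# With skipinitialspace=True, the space following each delimiter is dropped.
fixed = pd.read_csv(io.StringIO(sample), delimiter=";", skipinitialspace=True)
fixed_columns = list(fixed.columns)
print(fixed_columns)
```

Printing list(eur5m.columns) on the real DataFrame is a quick way to spot this kind of problem, because the quotes make stray whitespace visible.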

Alternatively, strip the whitespace from the column names with str.strip after reading the file:

eur5m.columns = eur5m.columns.str.strip()
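
On the sorting question, one possible explanation (a guess based on the sample, not something confirmed in the question) is that every row shares the same Date, so sorting by Date alone says nothing about the order within a day. A minimal sketch, assuming the dates are day/month/year and using a hypothetical Timestamp column built from Date and Time, could look like the following. The rows are taken from the sample but deliberately shuffled for the demonstration.

```python
import io

import pandas as pd

# Rows from the sample in the question, deliberately shuffled for this demo.
sample = (
    "Date; Time; Open; High; Low; Close; Volume\n"
    "07/08/2018;20:25:00;1.16014;1.16023;1.16011;1.16012;225\n"
    "07/08/2018;20:15:00;1.16006;1.16012;1.15995;1.16;300\n"
    "07/08/2018;20:20:00;1.16001;1.16023;1.16;1.16016;337\n"
)

df = pd.read_csv(io.StringIO(sample), delimiter=";")

# Remove the leading spaces from the header names.
df.columns = df.columns.str.strip()

# Assumption: dates are day/month/year. "Timestamp" is a made-up column name.
df["Timestamp"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%d/%m/%Y %H:%M:%S"
)

# Sort on the full timestamp so rows within the same day are ordered by time.
df = df.sort_values(by="Timestamp").reset_index(drop=True)
sorted_times = df["Time"].tolist()
print(df[["Timestamp", "Close"]])
```

If the file actually uses month/day/year, the format string would need to be "%m/%d/%Y %H:%M:%S" instead.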
