首页 > 解决方案 > 不能在数据框中使用索引

问题描述

我想将一个excel文件读入数据帧并在str时间重新索引并使用loc对其一部分进行切片,但在jupyter笔记本环境中它一步一步没有错误但如果我在一个单元格中运行它得到一个错误

In[2]: fileS = open('my path to file','rb')
       for sheet in wbS.worksheets:
           df = pd.DataFrame(sheet.values)
       df
Out[2]:     0   1   2   3   4   5   6   7   8   ...

      0   None  28S     NaN     NaN     ...     
      1 2017/10/21  850     860.0   870.0   =1600-AVERAGE(BB2:DD2)  
      2 018/08/11   None    NaN     NaN     693     
      3 2018/10/17  925     901.0   923.0   683.667
      4 2018/11/12  959     960.0   966.0   638.333  
      5 2018/12/27  None    NaN     NaN     818.333     
 In[3]: df1 = df.set_index(0)
        df1
 Out[3]:        1   2   3   4   5   6   7   8   ...

           0
         NaN    28S     NaN     NaN     ...     
       2017/10/21   850     860.0   870.0   =1600-AVERAGE(BB2:DD2)  
       018/08/11    None    NaN     NaN     693     
       2018/10/17   925     901.0   923.0   683.667
       2018/11/12   959     960.0   966.0   638.333  
  In[4]:rdfS = df1.loc[['2018/08/11','2018/10/17','2018/11/12','2018/12/27'],[4]]
  Out[4]:            4
           0    
       2018/08/11   676.5
       2018/10/17   667.3
       2018/11/12   624.3
       2018/12/27   799

顺便说一句,时间 str 是我写入使用的 excel 文件的 str

wb['Sheet1']['A4'] = '2018/10/17'

但我在一个单元格中运行它会出错

In[1] import openpyxl
      import numpy as np
      import pandas as pd
      fileS = open('my path to file','rb')
      for sheet in wbS.worksheets:
          df = pd.DataFrame(sheet.values)
          df1 = df.set_index(0)
          rdfS = df1.loc[['2018/08/11','2018/10/17','2018/11/12','2018/12/27'],[4]]
 KeyError
         14     df = pd.DataFrame(sheet.values)
         15     df1 = df.set_index(0,drop = True)
--->     16     rdfS = df1.loc[['2018/08/11','2018/10/17','2018/11/12','2018/12/27'],[4]]
      in __getitem__(self, key)
    1365             except (KeyError, IndexError):
    1366                 pass
 -> 1367             return self._getitem_tuple(key)
    1368         else:
    1369             # we by definition only have the 0th axis

     in _getitem_tuple(self, tup)
    861 
    862         # no multi-index, so validate all of the indexers
--> 863         self._has_valid_tuple(tup)
    864 
    865         # ugly hack for GH #836
  in _has_valid_tuple(self, key)
     202             if i >= self.obj.ndim:
     203                 raise IndexingError('Too many indexers')
 --> 204             if not self._has_valid_type(k, i):
     205                 raise ValueError("Location based indexing can         only have "
     206                                  "[{types}] types"
in _has_valid_type(self, key, axis)
     1470                         raise KeyError(
     1471                             u"None of [{key}] are in the [{axis}]".format(
  -> 1472                                 key=key,   axis=self.obj._get_axis_name(axis)))
     1473                     else:
     1474 

     KeyError: "None of [['2018/08/11', '2018/10/17', '2018/11/12', '2018/12/27']] are in the [index]"

我该如何纠正它?是我的环境错误还是什么?

标签: python-3.xpandas

解决方案


当您在 jupyter 中运行时,您在 excel 表的最后一张表(for 循环中的最后一张表)上执行操作(set_index, loc等)。当您在新 for 循环中的所有工作表上运行操作时,其中一张工作表没有您要尝试的日期键.loc[]


推荐阅读