Unsupported format, or corrupt file: Expected BOF record

Problem description

I run into a problem when calling read_excel on .xlsx files while iterating over a directory with a for loop. When I do the following:

df1 = pd.read_excel('***.xlsx')
df2 = pd.read_excel('***.xlsx')

df = pd.concat([df1,df2], ignore_index=True, sort=False)

everything works. But when I try to read the same files in a loop:

from os import listdir
from os.path import isdir, join

import pandas as pd

directories_to_check = [
        "C:\\Users\\***"
        ]
files = []
for directory in directories_to_check:
    # immediate subdirectories of each top-level directory
    directories = [d for d in listdir(directory) if isdir(join(directory,d))]
    print(directories)
    for root in directories:
        path = join(directory,root)
        print(path)
        for file in listdir(path):
            print(file)
            # both branches append, so every file in the folder is collected
            if file[0:5].lower() == 'name':
                files.append(join(directory,root,file))
            else:
                files.append(join(directory,root,file))

for filetoopen in files:
    print(filetoopen)
    df = pd.concat([pd.read_excel(filetoopen, header=1)], ignore_index=True, sort=False)

I get the following error:

Traceback (most recent call last):
  File "C:/Users/***/Sample.py", line 30, in <module>
    df = pd.concat([pd.read_excel(filetoopen, header=1)], ignore_index=True, sort=False)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\io\excel\_base.py", line 310, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\io\excel\_base.py", line 819, in __init__
    self._reader = self._engines[engine](self._io)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\io\excel\_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\io\excel\_base.py", line 359, in __init__
    self.book = self.load_workbook(filepath_or_buffer)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\pandas\io\excel\_xlrd.py", line 36, in load_workbook
    return open_workbook(filepath_or_buffer)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\xlrd\__init__.py", line 157, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Users\***\Python\Python37-32\lib\site-packages\xlrd\book.py", line 92, in open_workbook_xls
    biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
  File "C:\Users\***\Python\Python37-32\lib\site-packages\xlrd\book.py", line 1278, in getbof
    bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
  File "C:\Users\***\Python\Python37-32\lib\site-packages\xlrd\book.py", line 1272, in bof_error
    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'C:\\Users'

Any help would be greatly appreciated, thanks!

Tags: python, excel, python-3.x, xlrd

Solution
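
One plausible reading of the traceback (not a confirmed diagnosis): xlrd found the bytes b'C:\\Users' where a workbook header should be, which suggests the loop picked up a file that is not an Excel workbook at all. Both branches of the if statement append, so every file in each subfolder ends up in files, and xlrd then falls through to its .xls parser (open_workbook_xls in the traceback) and fails on the first unrecognized file. A real .xlsx file is a ZIP archive and starts with b'PK\x03\x04'; a legacy .xls file starts with the OLE2 signature. Below is a minimal sketch, using only the standard library, that keeps only paths that look like workbooks and skips the ~$ lock files Excel creates while a workbook is open. The helper names looks_like_workbook and find_workbooks are hypothetical, and unlike the original loop the sketch descends into all nested subfolders via os.walk.

```python
import os

ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx/.xlsm files are ZIP archives
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # legacy .xls compound files
EXCEL_SUFFIXES = (".xlsx", ".xlsm", ".xls")


def looks_like_workbook(path):
    """Return True if the name and the leading bytes both look like an Excel file."""
    name = os.path.basename(path)
    if name.startswith("~$"):
        # temporary lock file written by Excel while the workbook is open
        return False
    if not name.lower().endswith(EXCEL_SUFFIXES):
        return False
    with open(path, "rb") as fh:
        head = fh.read(8)
    return head.startswith(ZIP_MAGIC) or head.startswith(OLE2_MAGIC)


def find_workbooks(directories):
    """Collect workbook paths under each directory (hypothetical helper)."""
    found = []
    for directory in directories:
        for root, _dirs, names in os.walk(directory):
            for name in sorted(names):
                path = os.path.join(root, name)
                if looks_like_workbook(path):
                    found.append(path)
    return found
```

Each returned path can then be passed to pd.read_excel(path, header=1), with the resulting frames collected in a list and handed to pd.concat once after the loop. The original loop reassigns df on every iteration, so only the last file read would survive.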

