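Error: module 'pandas' has no attribute 'read_pdf'

Problem description

When I try to use the read_pdf method from pandas,

import pandas as pd

as shown in the example below,

it shows the following error message:

AttributeError: module 'pandas' has no attribute 'read_pdf'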

Environment

python --version: Python 3.8.8
OS and its version: Windows 10
Anaconda (version 1.7.2)

I am trying to read files of type .pdf / .docx / .txt from a file system that already exists.

Sample code:

import pandas as pd
import os  # os module for operating-system operations
import mimetypes

# To change the current working directory to a new directory we use
# os.chdir("Directory path")
directory = "C:\\Users\\adity\\Documents\\Parent"  # parent folder, reused in the loop below
os.chdir(directory)

# To list the files and folders in the current working directory
# os.listdir() returns their names as a list
fid = os.listdir()  # files in directory

# To check whether Child1, Child2 and Child3 all exist
def checkChild(d):
    return 'Child1' in d and 'Child2' in d and 'Child3' in d

# dc1 Directory Child One; dof Dictionary File System
# nof name of file , ext Extension

if checkChild(fid):  # if the folders are there, then read the respective files
    for folder in fid:
        fileDir = folder
        os.chdir(directory+f"\\{fileDir}") # Changing directory to respective child
        dc = os.listdir()[0] # dc contains the name of file with extension
        nof,ext = os.path.splitext(dc)        
        if ext =='':
            ext = mimetypes.guess_extension(os.getcwd())
        
        if ext == '.pdf' and fileDir == 'Child1':
            child1Pdf = pd.read_pdf(f'{dc}')  #**Error Line**
            
            

Error output:

AttributeError Traceback (most recent call last) in
      9 print(dc)
     10 if ext == '.pdf' and fileDir == 'Child1':
---> 11 child1Pdf = pd.read_pdf(f'{dc}')
     12
     13

~\anaconda3\lib\site-packages\pandas\__init__.py in __getattr__(name)
    242 return _SparseArray
    243
--> 244 raise AttributeError(f"module 'pandas' has no attribute '{name}'")
    245
    246

AttributeError: module 'pandas' has no attribute 'read_pdf'

I have not found any way to resolve this error.

Tags: python, python-3.x, pandas, attributeerror

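Solution

pandas has no read_pdf function, which is why the AttributeError is raised. If you want to import the PDF data as tabular data, you can use the tabula-py package (it needs a Java runtime), whose read_pdf returns the tables as pandas DataFrames: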

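import tabula  # provided by the tabula-py package
import pandas as pd

# Declare the path of your file
file_path = "/path/to/pdf_file/data.pdf"

# Convert the tables in your file; recent tabula-py versions return a list of DataFrames
dfs = tabula.read_pdf(file_path)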
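
The question also mentions .docx and .txt files, so here is a minimal sketch of picking a reader by file extension, without calling os.chdir. The helper names (read_txt, read_pdf_tables, read_docx, read_first_file) and the READERS mapping are hypothetical and not part of the original code. The sketch assumes tabula-py is installed for PDFs and python-docx for .docx files; both are imported lazily, so plain-text files work without them.

```python
from pathlib import Path


def read_txt(path):
    """Return the contents of a plain-text file as a string."""
    return Path(path).read_text(encoding="utf-8")


def read_pdf_tables(path):
    """Return the tables found in a PDF as a list of DataFrames.

    Assumption: tabula-py and a Java runtime are installed.
    """
    import tabula  # imported lazily so other file types work without it
    return tabula.read_pdf(str(path), pages="all")


def read_docx(path):
    """Return the paragraphs of a .docx file joined by newlines.

    Assumption: the python-docx package is installed.
    """
    import docx  # python-docx, imported lazily
    return "\n".join(p.text for p in docx.Document(str(path)).paragraphs)


# Hypothetical mapping from file extension to reader function.
READERS = {".txt": read_txt, ".pdf": read_pdf_tables, ".docx": read_docx}


def read_first_file(folder):
    """Read the first supported file in `folder`, taken in sorted name order."""
    for path in sorted(Path(folder).iterdir()):
        reader = READERS.get(path.suffix.lower())
        if path.is_file() and reader is not None:
            return reader(path)
    raise FileNotFoundError(f"no supported file in {folder}")
```

With the folder layout from the question, a call might look like read_first_file(r"C:\Users\adity\Documents\Parent\Child1"). Building full paths with pathlib means the working directory never changes, so a missing directory variable cannot break the loop.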
