How do I resolve this warning in Python pdfminer3k?

Problem description

When I run the code below in Python IDLE, I get this warning. How can I resolve it?

WARNING:root:Cannot locate objid=nnn

# -*- coding: utf-8 -*- 
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from io import StringIO
from io import open

def readPDF(pdfFile):
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, laparams=laparams)

    process_pdf(rsrcmgr, device, pdfFile)
    device.close()

    content = retstr.getvalue()
    retstr.close()
    return content


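def saveTxt(txt):
    with open("xxx.txt", "w", encoding="utf-8") as f:
        f.write(txt)


# Open the PDF in a with-block so the file handle is closed after parsing.
with open('xxx.pdf', 'rb') as pdfFile:
    txt = readPDF(pdfFile)
saveTxt(txt)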

When I add STRICT = True to psparser.py and run the program, it fails with the following traceback:

Traceback (most recent call last):
  File "F:\Users\IceSun\Documents\py\pdfminer3kTest.py", line 30, in <module>
    txt = readPDF(pdfAhh)
  File "F:\Users\IceSun\Documents\py\pdfminer3kTest.py", line 17, in readPDF
    process_pdf(rsrcmgr, device, pdfFile)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfinterp.py", line 695, in process_pdf
    doc.set_parser(parser)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 434, in set_parser
    self.info.append(dict_value(trailer['Info']))
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 92, in typecheck_value
    x = resolve1(x)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 58, in resolve1
    x = x.resolve()
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 47, in resolve
    return self.doc.getobj(self.objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 532, in getobj
    result = self._getobj(objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 345, in _getobj
    handle_error(PDFSyntaxError, 'Cannot locate objid=%r' % objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\psparser.py", line 20, in handle_error
    raise exctype(msg)
pdfminer.pdfparser.PDFSyntaxError: Cannot locate objid=1875

Added at 16:46 GMT+08 on 2018/06/08

Tags: python, pdfminer

Solution
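
One possible approach, offered as a sketch and not as a confirmed fix: the `WARNING:root:` prefix suggests the message is emitted through Python's root logger, and the traceback shows its text begins with `Cannot locate objid=`. Under those two assumptions, a standard `logging.Filter` attached to the root logger can drop just that message. The names `ObjidWarningFilter` and `silence_objid_warnings` are hypothetical helpers introduced here for illustration; they are not part of pdfminer3k.

```python
import logging


class ObjidWarningFilter(logging.Filter):
    """Drop log records whose message starts with a given prefix.

    Assumption: the prefix matches the text seen in the traceback,
    'Cannot locate objid=...'.
    """

    def __init__(self, prefix="Cannot locate objid="):
        super().__init__()
        self.prefix = prefix

    def filter(self, record):
        # Returning False discards the record; all other records pass through.
        return not record.getMessage().startswith(self.prefix)


def silence_objid_warnings():
    """Attach the filter to the root logger and return it for later removal.

    Assumption: the warning is logged directly on the root logger,
    as the 'WARNING:root:' prefix suggests.
    """
    flt = ObjidWarningFilter()
    logging.getLogger().addFilter(flt)
    return flt
```

Calling `silence_objid_warnings()` once before `readPDF(...)` should hide the message while leaving other warnings visible. This only suppresses the log output; the PDF still references an object that cannot be found, so text depending on that object may be missing from the result.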
