python - How do I resolve this warning in Python pdfminer3k?
Problem description
When I run the code below in Python IDLE, I get the following warning. How can I resolve it?
WARNING:root:Cannot locate objid=nnn
# -*- coding: utf-8 -*-
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from io import StringIO
from io import open


def readPDF(pdfFile):
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, laparams=laparams)
    process_pdf(rsrcmgr, device, pdfFile)
    device.close()
    content = retstr.getvalue()
    retstr.close()
    return content


def saveTxt(txt):
    with open("xxx.txt", "w", encoding="utf-8") as f:
        f.write(txt)


txt = readPDF(open('xxx.pdf', 'rb'))
saveTxt(txt)
When I add STRICT = True to psparser.py and run the program, it fails with the following traceback:
Traceback (most recent call last):
  File "F:\Users\IceSun\Documents\py\pdfminer3kTest.py", line 30, in <module>
    txt = readPDF(pdfAhh)
  File "F:\Users\IceSun\Documents\py\pdfminer3kTest.py", line 17, in readPDF
    process_pdf(rsrcmgr, device, pdfFile)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfinterp.py", line 695, in process_pdf
    doc.set_parser(parser)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 434, in set_parser
    self.info.append(dict_value(trailer['Info']))
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 92, in typecheck_value
    x = resolve1(x)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 58, in resolve1
    x = x.resolve()
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdftypes.py", line 47, in resolve
    return self.doc.getobj(self.objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 532, in getobj
    result = self._getobj(objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\pdfparser.py", line 345, in _getobj
    handle_error(PDFSyntaxError, 'Cannot locate objid=%r' % objid)
  File "D:\Program_Files\Python\Python36\lib\site-packages\pdfminer\psparser.py", line 20, in handle_error
    raise exctype(msg)
pdfminer.pdfparser.PDFSyntaxError: Cannot locate objid=1875
Added at 16:46 GMT+08, 2018/06/08
Solution
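The `WARNING:root:` prefix is the default format Python's `logging` module uses for messages sent to the root logger, so one way to hide the message might be to raise the root logger's threshold around the call to `readPDF`. The following is only a minimal sketch under the assumption that pdfminer3k reports the problem through `logging.warning`. The helper name `quiet_warnings` and the `demo` function are hypothetical and are not part of pdfminer3k.

```python
import io
import logging
from contextlib import contextmanager


@contextmanager
def quiet_warnings(level=logging.ERROR):
    """Temporarily raise the root logger's threshold (hypothetical helper)."""
    root = logging.getLogger()
    previous = root.level
    root.setLevel(level)
    try:
        yield
    finally:
        # Restore the earlier threshold even if the wrapped code raises.
        root.setLevel(previous)


def demo():
    """Show which root-logger warnings reach a handler."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    root = logging.getLogger()
    previous = root.level
    root.setLevel(logging.WARNING)  # make the starting threshold explicit
    root.addHandler(handler)
    try:
        logging.warning("Cannot locate objid=1")      # recorded
        with quiet_warnings():
            logging.warning("Cannot locate objid=2")  # suppressed
    finally:
        root.removeHandler(handler)
        root.setLevel(previous)
    return stream.getvalue()
```

In the script from the question, this would presumably be used as `with quiet_warnings(): txt = readPDF(open('xxx.pdf', 'rb'))`. Note that this only hides the message: the object the PDF refers to is still missing from the file, which is also why enabling `STRICT` turns the same condition into a `PDFSyntaxError`, so the extracted text may be incomplete.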