python-3.x - Poppler in PATH for pdf2image
Problem description
I am trying to use pdf2image, and it seems I need something called propeller:
(sum_env) C:\Users\antoi\Documents\Programming\projects\summarizer>python ocr.py -i fr13_idf.pdf
Traceback (most recent call last):
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 165, in __page_count
    proc = Popen(["pdfinfo", pdf_path], stdout=PIPE, stderr=PIPE)
  File "C:\Python37\lib\subprocess.py", line 769, in __init__
    restore_signals, start_new_session)
  File "C:\Python37\lib\subprocess.py", line 1172, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ocr.py", line 53, in <module>
    pdfspliterimager(image_path)
  File "ocr.py", line 32, in pdfspliterimager
    pages = convert_from_path("document-page%s.pdf" % i, 500)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 30, in convert_from_path
    page_count = __page_count(pdf_path, userpw)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 169, in __page_count
    raise Exception('Unable to get page count. Is poppler installed and in PATH?')
Exception: Unable to get page count. Is poppler installed and in PATH?
I tried this link, but what I downloaded did not solve my problem.
Solution
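pdf2image is only a wrapper around poppler (not propeller!). To use the module, you need poppler-utils installed on your machine and available in your PATH. The traceback shows why: pdf2image runs the pdfinfo executable through Popen, and Windows raises FileNotFoundError because it cannot find that program.
The installation procedure is linked in the project's README, in the "How to install" section.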
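To confirm whether Poppler is actually reachable before calling pdf2image, a quick check along these lines may help. This is only a sketch: the helper name find_poppler_tool and its extra_dir parameter are made up for illustration and are not part of pdf2image. It relies on the standard library's shutil.which to look for pdfinfo, the tool named in the traceback.

```python
import os
import shutil


def find_poppler_tool(tool="pdfinfo", extra_dir=None):
    """Return the full path to a Poppler tool, or None if it is not found.

    Hypothetical helper, not part of pdf2image. `extra_dir` is an optional
    folder (for example, the bin directory of an unpacked Poppler download)
    that is searched before the regular PATH entries.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(tool, path=search_path)


if __name__ == "__main__":
    location = find_poppler_tool()
    if location is None:
        print("pdfinfo was not found; add Poppler's bin folder to PATH.")
    else:
        print("pdfinfo found at:", location)
```

If the check reports that pdfinfo is missing, add the folder that contains the Poppler executables to PATH and open a new terminal. Newer pdf2image releases also accept a poppler_path argument to convert_from_path that points at that folder; the version shown in the traceback is older and may not support it, so treat that option as an assumption about your installed version.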