Unable to get page count. Is poppler installed and in PATH?

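Problem description

I have been trying to get this part of the code to run, but the error keeps popping up. I have already added poppler to the PATH environment variable. Is there anything else I can do?

Code: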

!pip install pdf2image
!pip install opencv-python
!pip install PILLOW
!pip install pytesseract
!pip install poppler-utils
from pdf2image import convert_from_bytes, convert_from_path
import cv2
from PIL import Image
import pytesseract

pdfs = r"C:\Users\sreeh\OneDrive\Desktop\OCR\Invoice/pdf"
# (Above statements executed well)

pages = convert_from_path('Invoice.pdf', 350)
Error after executing this statement:
FileNotFoundError                         Traceback (most recent call last)
~\anaconda3\lib\site-packages\pdf2image\pdf2image.py in pdfinfo_from_path(pdf_path, userpw, poppler_path, rawdates, timeout)
    444             env["LD_LIBRARY_PATH"] = poppler_path + ":" + env.get("LD_LIBRARY_PATH", "")
--> 445         proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
    446 

~\anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    799                                 errread, errwrite,
--> 800                                 restore_signals, start_new_session)
    801         except:

~\anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1206                                          os.fspath(cwd) if cwd is not None else None,
-> 1207                                          startupinfo)
   1208             finally:

FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

PDFInfoNotInstalledError                  Traceback (most recent call last)
<ipython-input-6-6623e242222a> in <module>
----> 1 pages = convert_from_path('Invoice.pdf', poppler_path = 'C:\Program Files\poppler-0.68.0\bin')
      2 #convert_from_bytes(open('Invoice.pdf').read())

~\anaconda3\lib\site-packages\pdf2image\pdf2image.py in convert_from_path(pdf_path, dpi, output_folder, first_page, last_page, fmt, jpegopt, thread_count, userpw, use_cropbox, strict, transparent, single_file, output_file, poppler_path, grayscale, size, paths_only, use_pdftocairo, timeout)
     95         poppler_path = poppler_path.as_posix()
     96 
---> 97     page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
     98 
     99     # We start by getting the output format, the buffer processing function and if we need pdftocairo

~\anaconda3\lib\site-packages\pdf2image\pdf2image.py in pdfinfo_from_path(pdf_path, userpw, poppler_path, rawdates, timeout)
    470     except OSError:
    471         raise PDFInfoNotInstalledError(
--> 472             "Unable to get page count. Is poppler installed and in PATH?"
    473         )
    474     except ValueError:

PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Tags: python, ocr, poppler, poppler-utils

Solution
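One likely cause, judging only from the traceback above: the poppler_path argument is written as a normal string literal, so the "\b" in "\bin" is read by Python as a backspace character and the folder passed to pdf2image does not exist. The sketch below shows that effect and checks for the pdfinfo executable before converting. It assumes poppler is unpacked in the folder shown in the traceback; locate_pdfinfo and convert_invoice are hypothetical helper names, not part of pdf2image.

```python
import shutil
from pathlib import Path
from typing import Optional

# "\b" inside a normal string literal is the backspace character (0x08),
# so this folder name no longer ends in "\bin".
BROKEN = "poppler-0.68.0\bin"
# A raw string keeps the backslash literally.
RAW = r"poppler-0.68.0\bin"


def locate_pdfinfo(poppler_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return the path to pdfinfo, or None if it is missing."""
    if poppler_path is None:
        # No explicit folder given: fall back to searching PATH.
        return shutil.which("pdfinfo")
    for name in ("pdfinfo.exe", "pdfinfo"):
        candidate = Path(poppler_path) / name
        if candidate.is_file():
            return str(candidate)
    return None


def convert_invoice(pdf_path: str, poppler_path: Optional[str] = None, dpi: int = 350):
    """Hypothetical wrapper: fail early with a clear message, then convert."""
    if locate_pdfinfo(poppler_path) is None:
        raise FileNotFoundError(f"pdfinfo not found (poppler_path={poppler_path!r})")
    # Imported here so the check above can run even where pdf2image is absent.
    from pdf2image import convert_from_path
    return convert_from_path(pdf_path, dpi=dpi, poppler_path=poppler_path)
```

With this in place, the call from the question would become convert_invoice("Invoice.pdf", poppler_path=r"C:\Program Files\poppler-0.68.0\bin"). If locate_pdfinfo() still returns None with no argument after PATH has been edited, the notebook kernel may need a restart to pick up the new environment.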

