
Problem description

I have this .py file that is supposed to convert an image to a string.

img2str.py:

from PIL import Image
from pytesseract import image_to_string

image = Image.open('image.png', mode='r')
print(image_to_string(image))

I ran:

python3 img2str.py

I got:

Traceback (most recent call last):
  File "/home/linux/.local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 223, in run_tesseract
    proc = subprocess.Popen(cmd_args, **subprocess_args())
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'tesseract': 'tesseract'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "img2str.py", line 4, in <module>
    results = tes.image_to_string(Image.open('image.png'))
  File "/home/linux/.local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 345, in image_to_string
    }[output_type]()
  File "/home/linux/.local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 344, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/home/linux/.local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 253, in run_and_get_output
    run_tesseract(**kwargs)
  File "/home/linux/.local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 225, in run_tesseract
    raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

Installing pytesseract:

sudo pip install pytesseract --user
[sudo] heslo pro linux:    
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    from pip import main
ImportError: cannot import name main

Another way: the second install command, without sudo, seems to have worked.

pip install pytesseract --user
Requirement already satisfied: pytesseract in ./.local/lib/python3.6/site-packages (0.3.0)
Requirement already satisfied: Pillow in /usr/lib/python3/dist-packages (from pytesseract) (5.1.0)

Tags: python-3.x

Solution


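The tesseract-ocr program itself is probably missing from your machine. pytesseract is only a Python wrapper that calls the tesseract executable, so installing it with pip does not install the OCR engine. See the installation instructions here: https://github.com/tesseract-ocr/tesseract/wiki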

Alternatively, on Linux distributions that use apt (such as Debian or Ubuntu), you can install it with:

sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

If these commands do not solve the problem, you will have to refer to the wiki linked above.
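To confirm whether the install worked before running img2str.py again, you can check if the tesseract executable is visible on your PATH. The following is a minimal sketch using only the standard library; the helper names find_tesseract and describe_tesseract are made up for illustration and are not part of pytesseract.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH.

    Hypothetical helper: it only wraps shutil.which from the standard library.
    """
    return shutil.which(binary)


def describe_tesseract(binary="tesseract"):
    """Build a short human-readable status message for the lookup."""
    path = find_tesseract(binary)
    if path is None:
        return f"{binary} not found on PATH; install the tesseract-ocr package"
    return f"{binary} found at {path}"


if __name__ == "__main__":
    print(describe_tesseract())
```

If the executable exists but lives outside your PATH, pytesseract also lets you point at it directly by assigning the full path to pytesseract.pytesseract.tesseract_cmd before calling image_to_string.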

