
Question

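I want to train Tesseract on my own new font, but I haven't found any way to do it. I can't create box files from my images. I'm new to programming; someone suggested LabelImg, but it is of no use for Tesseract OCR.

Please suggest a tool for labeling the text in images for use with Tesseract OCR, which is new to me.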

Tags: python-3.x, computer-vision, tesseract, text-recognition

Solution


You can write your own script to label the images. Here is some sample code that does this; you can customize it as needed:

import sys
import os

import cv2


def isImage(filepath) -> bool:
    '''
    checks if file is an image
    '''

    lowercasePath = filepath.lower()

    # you can add more formats here; the leading dot keeps a name
    # such as "notjpg" from being treated as an image
    extensions = ('.jpg', '.jpeg', '.png')

    return lowercasePath.endswith(extensions)



def getPaths(imgdir, condition=lambda x: True):
    '''
    given the path to an image folder, returns a list of full paths
    to the files that this folder contains

    :param condition: a function used as a filter; only files for which
    it returns True are kept
    '''

    files = map(lambda x: os.path.join(imgdir, x).strip(),
        os.listdir(imgdir))

    filtered = filter(condition, files)

    return list(filtered)



def labelingProcess(imgdir):
    print("Welcome to the labeling tool")
    print("if you want to stop labeling just close the program or press ctrl+C")

    WIDTH = 640
    HEIGHT = 480

    WINDOWNAME = "frame"
    cv2.namedWindow(WINDOWNAME, cv2.WINDOW_NORMAL)
    cv2.resizeWindow(WINDOWNAME, WIDTH, HEIGHT)
    cv2.moveWindow(WINDOWNAME, 10, 10)


    pathsToImages = getPaths(imgdir, isImage)

    if not len(pathsToImages):
        print("couldn't find any images")
        return

    for pathtoimage in pathsToImages:
        imageName = os.path.basename(pathtoimage)

        # the label file has the same name as the image, but ends with .gt.txt
        # (splitext keeps any extra dots that are part of the file name)
        labelName = os.path.splitext(imageName)[0] + '.gt.txt'
        labelPath = os.path.join(imgdir, labelName)

        # skip labeled images
        if os.path.exists(labelPath):
            continue

        # read image
        image = cv2.imread(pathtoimage)
        if image is None:
            print("couldn't open the image")
            continue

        h, w = image.shape[:2]

        # resize to fixed size (only for visualization)
        hnew = HEIGHT
        wnew = int(w * hnew / h)

        image = cv2.resize(image, (wnew, hnew))

        cv2.imshow(WINDOWNAME, image)
        cv2.waitKey(1)

        print("enter what is written on the image, "
              "or press enter to skip")
        label = input()

        if not len(label):
            continue


        with open(labelPath, 'w', encoding='utf-8') as labelfile:
            labelfile.write(label)

    cv2.destroyAllWindows()


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("usage: python3 labelingtool.py <path to your folder with images>")
        sys.exit(1)

    labelingProcess(sys.argv[1])

This particular script requires OpenCV.

Usage:

python3 labelingtool.py <path to your folder with images>

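It reads the images from your folder and creates a corresponding .gt.txt file with the transcription for each one. During labeling, you type the transcription in the terminal.

To go on and train your own model, you can use, for example, this repo: https://github.com/thongvm/ocrd-train

It expects the dataset as images with corresponding transcriptions, laid out like this: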

image1.tif
image1.gt.txt 

image2.tif
image2.gt.txt 

...
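
Before training, it can help to confirm that every image really has its transcription file. The following is only a minimal sketch of such a check: the function names `missing_ground_truth` and `check_folder` and the list of image extensions are assumptions made for illustration and are not part of the labeling script or of the training repo.

```python
import os
from typing import Iterable, List

# Assumed set of image extensions; adjust it to whatever your dataset uses.
IMAGE_EXTENSIONS = (".tif", ".tiff", ".png", ".jpg", ".jpeg")


def missing_ground_truth(filenames: Iterable[str]) -> List[str]:
    """Return the image file names that have no matching <name>.gt.txt."""
    names = set(filenames)
    missing = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTENSIONS:
            continue  # not an image (for example the .gt.txt files themselves)
        if stem + ".gt.txt" not in names:
            missing.append(name)
    return missing


def check_folder(imgdir: str) -> List[str]:
    """Hypothetical convenience wrapper: run the check on a real folder."""
    return missing_ground_truth(os.listdir(imgdir))
```

Calling `check_folder("path/to/images")` returns the images that still need a label; an empty list means every image is paired.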

To convert the images to .tif, you can use mogrify. For example, this command converts all jpg files in the current directory to tif files:

mogrify -format tif *.jpg
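
If ImageMagick is not installed, the same conversion could probably be done with OpenCV, which the labeling script already needs. This is a rough sketch rather than a drop-in replacement for mogrify: the function names `tif_path_for` and `convert_jpgs_to_tif` are made up for illustration, it also picks up .jpeg files, and it assumes your OpenCV build can write TIFF files.

```python
import os
from typing import List


def tif_path_for(image_path: str) -> str:
    """Map 'dir/name.jpg' to 'dir/name.tif' (only the last extension changes)."""
    stem, _ext = os.path.splitext(image_path)
    return stem + ".tif"


def convert_jpgs_to_tif(imgdir: str) -> List[str]:
    """Write a .tif copy next to every .jpg/.jpeg in imgdir; return the new paths."""
    import cv2  # imported here so tif_path_for can be used without OpenCV

    written = []
    for name in sorted(os.listdir(imgdir)):
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue
        source = os.path.join(imgdir, name)
        image = cv2.imread(source)
        if image is None:
            print("couldn't open", source)
            continue
        target = tif_path_for(source)
        # imwrite picks the output format from the file extension
        if cv2.imwrite(target, image):
            written.append(target)
    return written
```

Because the .gt.txt name is derived from the image name without its extension, labels created for the jpg files still match the converted tif files.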
