Draw a bounding box around an entire text block in an image using Python

Problem description

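I have an image from which I have already removed the noise (dots in the background), and I want to draw a bounding box around the block of text in the image. How can I do this with Python and OpenCV?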

Input image

Denoised image

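This is the code I use to remove the background noise. What can I change in it so that it saves the image with a bounding box around the text?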

import cv2
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
def remove_dots(image_path,outdir):
    image = cv2.imread(image_path)
    mask = np.zeros(image.shape, dtype=np.uint8)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,51,9)

    # Create a 5x5 rectangular kernel, then dilate to connect text contours
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
    dilate = cv2.dilate(thresh, kernel, iterations=2)

    # Find contours and filter out noise by bounding-box area and aspect ratio
    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.04 * peri, True)
        x,y,w,h = cv2.boundingRect(c)
        area = w * h
        ar = w / float(h)
        if area > 1200 and area < 50000 and ar <8:
            cv2.drawContours(mask, [c], -1, (255,255,255), -1)
    # Bitwise-and input image and mask to get result
    mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    result = cv2.bitwise_and(image, image, mask=mask)
    result[mask==0] = (255,255,255) # Color background white


    cv2.imwrite(os.path.join(outdir,os.path.basename(image_path)),result)
    
for jpgfile in glob.glob(r'C:\custom\TableDetectionWork\text_detection_dataset/*'):
    print(jpgfile)
    remove_dots(jpgfile,r'C:\custom\TableDetectionWork\textdetect/')

Tags: python, opencv, image-processing, computer-vision, opencv-contour

Solution


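You can do this by applying a horizontal morphological filter to merge the letters in the thresholded mask image, then finding the contours of the merged regions and taking their bounding boxes.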

Input:


import cv2
import numpy as np

img = cv2.imread("john.jpg")

# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]

# invert
thresh = 255 - thresh

# apply horizontal morphology close
kernel = np.ones((5 ,191), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# get external contours
contours = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# draw a padded bounding box around each contour
result = img.copy()
pad = 10
for cntr in contours:
    x, y, w, h = cv2.boundingRect(cntr)
    cv2.rectangle(result, (x-pad, y-pad), (x+w+pad, y+h+pad), (0, 0, 255), 4)

# save result
cv2.imwrite("john_bbox.png",result)

# display result
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Morphological close image:


Bounding box image:


