Shrinking bounding boxes in an image

Problem description

I have an image of a table (Image.size = (328, 231)), and the bounding boxes of the text in that image are stored in a dictionary.

[Image: the table]

The contents of the dictionary are:

{"bounding_boxes": [{"class": "text", "box": [0.0, 0.0, 53.94900093453785, 18.455579291781646]}, {"class": "text", "box": [53.94900093453785, 0.0, 184.96800320412976, 51.735807141009545]}, {"class": "text", "box": [184.96800320412976, 0.0, 256.199368074407, 51.735807141009545]}, {"class": "text", "box": [256.199368074407, 0.0, 328.0, 51.735807141009545]}, {"class": "text", "box": [0.0, 18.455579291781646, 53.94900093453785, 76.14130756377666]}, {"class": "text", "box": [53.94900093453785, 51.735807141009545, 184.96800320412976, 76.14130756377666]}, {"class": "text", "box": [184.96800320412976, 51.735807141009545, 256.199368074407, 76.14130756377666]}, {"class": "text", "box": [256.199368074407, 51.735807141009545, 328.0, 76.14130756377666]}, {"class": "text", "box": [0.0, 76.14130756377666, 53.94900093453785, 105.33448988766077]}, {"class": "text", "box": [53.94900093453785, 76.14130756377666, 184.96800320412976, 115.66887643031575]}, {"class": "text", "box": [184.96800320412976, 76.14130756377666, 256.199368074407, 115.66887643031575]}, {"class": "text", "box": [256.199368074407, 76.14130756377666, 328.0, 115.66887643031575]}, {"class": "text", "box": [0.0, 105.33448988766077, 53.94900093453785, 146.28279243942194]}, {"class": "text", "box": [53.94900093453785, 115.66887643031575, 184.96800320412976, 146.28279243942194]}, {"class": "text", "box": [184.96800320412976, 115.66887643031575, 256.199368074407, 146.28279243942194]}, {"class": "text", "box": [256.199368074407, 115.66887643031575, 328.0, 146.28279243942194]}, {"class": "text", "box": [0.0, 146.28279243942194, 53.94900093453785, 175.9430656804882]}, {"class": "text", "box": [53.94900093453785, 146.28279243942194, 184.96800320412976, 201.86661158409726]}, {"class": "text", "box": [184.96800320412976, 146.28279243942194, 256.199368074407, 201.86661158409726]}, {"class": "text", "box": [256.199368074407, 146.28279243942194, 328.0, 201.86661158409726]}, {"class": "text", "box": [0.0, 175.9430656804882, 
53.94900093453785, 231.0]}, {"class": "text", "box": [53.94900093453785, 201.86661158409726, 184.96800320412976, 219.67445280166658]}, {"class": "text", "box": [184.96800320412976, 201.86661158409726, 256.199368074407, 219.67445280166658]}, {"class": "text", "box": [256.199368074407, 201.86661158409726, 328.0, 219.67445280166658]}, {"class": "text", "box": [53.94900093453785, 219.67445280166658, 184.96800320412976, 231.0]}, {"class": "text", "box": [184.96800320412976, 219.67445280166658, 256.199368074407, 231.0]}, {"class": "text", "box": [256.199368074407, 219.67445280166658, 328.0, 231.0]}]}

After drawing the bounding boxes from the given JSON data, the image looks like this:

[Image: the table with the bounding boxes drawn]

However, I want to shrink these boxes so that they tightly cover the text, as shown in the image below.

[Image: the desired result, with each box fitted tightly around its text]

I have tried some image thresholding combined with cv2.boundingRect(), but without success. (Source: How to remove extra whitespace from image in opencv?)

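import cv2
import numpy as np

data = {"bounding_boxes": [...]}  # placeholder for the dictionary shown above
new_boxes = []
img = cv2.imread('sam.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize: dark pixels (text, and any grid lines) become 255, background becomes 0
gray = 255*(gray < 128).astype(np.uint8)
for item in data["bounding_boxes"]:
    # Truncate the float box coordinates to integer pixel indices
    xmin = int(item['box'][0])
    ymin = int(item['box'][1])
    xmax = int(item['box'][2])
    ymax = int(item['box'][3])
    # Crop the cell and take the bounding rectangle of its non-zero pixels
    crop_img = gray[ymin:ymax, xmin:xmax]
    coords = cv2.findNonZero(crop_img)
    # Note: x and y are relative to the crop, not to the full image
    x, y, w, h = cv2.boundingRect(coords)
    new_box = [x, y, w+x, h+y]
    new_boxes.append(new_box)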

Any suggestions are welcome. Thanks!

Tags: python-3.x, computer-vision, python-imaging-library, cv2, bounding-box

Solution
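What follows is a minimal sketch of one possible approach, not a confirmed fix. Two things in the attempt above may explain the failure. First, cv2.boundingRect() is called on a cropped cell, so the x and y it returns are relative to that crop, and the cell's xmin and ymin have to be added back to get full-image coordinates. Second, the cells appear to touch the table's grid lines, which would also count as dark pixels and stretch the rectangle to the full cell. The sketch below uses plain NumPy instead of cv2.boundingRect(). The helper names shrink_box and shrink_boxes and the parameters dark_threshold and margin are hypothetical, and the margin of 2 pixels is an assumption about the grid line thickness.

```python
import numpy as np


def shrink_box(gray, box, dark_threshold=128, margin=2):
    """Tighten one [xmin, ymin, xmax, ymax] box around the dark pixels inside it.

    gray is a 2D uint8 grayscale image; box is in full-image coordinates.
    margin (an assumed value) trims each side of the cell so that grid lines
    lying on the cell border are ignored. Returns None if no dark pixel is found.
    """
    height, width = gray.shape
    # Truncate to pixel indices, trim the margin, and clamp to the image
    xmin = max(int(box[0]) + margin, 0)
    ymin = max(int(box[1]) + margin, 0)
    xmax = min(int(box[2]) - margin, width)
    ymax = min(int(box[3]) - margin, height)
    if xmax <= xmin or ymax <= ymin:
        return None

    # True where the pixel is dark, i.e. likely part of the text
    mask = gray[ymin:ymax, xmin:xmax] < dark_threshold
    if not mask.any():
        return None

    rows = np.flatnonzero(mask.any(axis=1))  # crop rows containing dark pixels
    cols = np.flatnonzero(mask.any(axis=0))  # crop columns containing dark pixels

    # Add the crop origin back to get full-image coordinates;
    # the +1 makes xmax and ymax exclusive, like x + w and y + h in the question
    return [
        xmin + int(cols[0]),
        ymin + int(rows[0]),
        xmin + int(cols[-1]) + 1,
        ymin + int(rows[-1]) + 1,
    ]


def shrink_boxes(gray, data, **kwargs):
    """Apply shrink_box to every entry, keeping the original box for empty cells."""
    new_boxes = []
    for item in data["bounding_boxes"]:
        tight = shrink_box(gray, item["box"], **kwargs)
        if tight is None:
            tight = [int(v) for v in item["box"]]
        new_boxes.append(tight)
    return new_boxes
```

With the setup from the question, gray would be the result of cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) before the binarization step, and the call would be new_boxes = shrink_boxes(gray, data). The margin probably needs tuning: too small and the grid lines are picked up again, too large and text close to the cell border gets clipped.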

