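python-3.x - Shrinking bounding boxes in an image
Problem description
I have an image of a table (Image.size = (328, 231)), and the bounding boxes of the text in that image are stored in a dictionary.
The contents of the dictionary are: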
{"bounding_boxes": [{"class": "text", "box": [0.0, 0.0, 53.94900093453785, 18.455579291781646]}, {"class": "text", "box": [53.94900093453785, 0.0, 184.96800320412976, 51.735807141009545]}, {"class": "text", "box": [184.96800320412976, 0.0, 256.199368074407, 51.735807141009545]}, {"class": "text", "box": [256.199368074407, 0.0, 328.0, 51.735807141009545]}, {"class": "text", "box": [0.0, 18.455579291781646, 53.94900093453785, 76.14130756377666]}, {"class": "text", "box": [53.94900093453785, 51.735807141009545, 184.96800320412976, 76.14130756377666]}, {"class": "text", "box": [184.96800320412976, 51.735807141009545, 256.199368074407, 76.14130756377666]}, {"class": "text", "box": [256.199368074407, 51.735807141009545, 328.0, 76.14130756377666]}, {"class": "text", "box": [0.0, 76.14130756377666, 53.94900093453785, 105.33448988766077]}, {"class": "text", "box": [53.94900093453785, 76.14130756377666, 184.96800320412976, 115.66887643031575]}, {"class": "text", "box": [184.96800320412976, 76.14130756377666, 256.199368074407, 115.66887643031575]}, {"class": "text", "box": [256.199368074407, 76.14130756377666, 328.0, 115.66887643031575]}, {"class": "text", "box": [0.0, 105.33448988766077, 53.94900093453785, 146.28279243942194]}, {"class": "text", "box": [53.94900093453785, 115.66887643031575, 184.96800320412976, 146.28279243942194]}, {"class": "text", "box": [184.96800320412976, 115.66887643031575, 256.199368074407, 146.28279243942194]}, {"class": "text", "box": [256.199368074407, 115.66887643031575, 328.0, 146.28279243942194]}, {"class": "text", "box": [0.0, 146.28279243942194, 53.94900093453785, 175.9430656804882]}, {"class": "text", "box": [53.94900093453785, 146.28279243942194, 184.96800320412976, 201.86661158409726]}, {"class": "text", "box": [184.96800320412976, 146.28279243942194, 256.199368074407, 201.86661158409726]}, {"class": "text", "box": [256.199368074407, 146.28279243942194, 328.0, 201.86661158409726]}, {"class": "text", "box": [0.0, 175.9430656804882, 
53.94900093453785, 231.0]}, {"class": "text", "box": [53.94900093453785, 201.86661158409726, 184.96800320412976, 219.67445280166658]}, {"class": "text", "box": [184.96800320412976, 201.86661158409726, 256.199368074407, 219.67445280166658]}, {"class": "text", "box": [256.199368074407, 201.86661158409726, 328.0, 219.67445280166658]}, {"class": "text", "box": [53.94900093453785, 219.67445280166658, 184.96800320412976, 231.0]}, {"class": "text", "box": [184.96800320412976, 219.67445280166658, 256.199368074407, 231.0]}, {"class": "text", "box": [256.199368074407, 219.67445280166658, 328.0, 231.0]}]}
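However, I want to shrink these boxes so that they tightly enclose the text, as shown in the image below.
I have tried image thresholding combined with cv2.boundingRect(), but without success. (Source: How to remove extra whitespace from image in opencv?)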
import cv2
import numpy as np

data = ...  # the dictionary shown above
new_boxes = list()
img = cv2.imread('sam.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize: dark pixels (text) become 255, the background becomes 0
gray = 255*(gray < 128).astype(np.uint8)
for item in data["bounding_boxes"]:
    # Truncate the float box coordinates to integer pixel indices
    xmin = int(item['box'][0])
    ymin = int(item['box'][1])
    xmax = int(item['box'][2])
    ymax = int(item['box'][3])
    crop_img = gray[ymin:ymax, xmin:xmax]
    coords = cv2.findNonZero(crop_img)
    # x and y are measured from the top-left corner of crop_img
    x, y, w, h = cv2.boundingRect(coords)
    new_box = [x, y, w+x, h+y]
    new_boxes.append(new_box)
Any suggestions are welcome. Thanks!
Solution
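One plausible reason the attempt in the question fails, assuming the thresholding itself picks up the text (dark text on a light background): the rectangle found inside crop_img is expressed relative to the crop, so xmin and ymin have to be added back before the box is stored. A cell with no text pixels also needs handling, since there is then nothing to fit a rectangle to. The following is a minimal sketch of that idea, not a confirmed answer. It uses NumPy only, the helper name shrink_box is hypothetical, and the mask is synthetic so that the example runs without the original sam.png.

```python
import numpy as np


def shrink_box(mask, box):
    """Shrink box [xmin, ymin, xmax, ymax] to the nonzero pixels of mask inside it.

    mask is a 2D array in which text pixels are nonzero (for example, the
    thresholded `gray` array from the question). The result is given in
    full-image coordinates, with xmax and ymax exclusive.
    """
    xmin, ymin, xmax, ymax = (int(v) for v in box)
    crop = mask[ymin:ymax, xmin:xmax]

    rows = np.flatnonzero(crop.any(axis=1))  # crop rows that contain text
    cols = np.flatnonzero(crop.any(axis=0))  # crop columns that contain text
    if rows.size == 0 or cols.size == 0:
        # Empty cell: nothing to shrink to, keep the (integer-truncated) box.
        return [xmin, ymin, xmax, ymax]

    # Indices are relative to the crop, so add the crop origin back.
    return [
        xmin + int(cols[0]),
        ymin + int(rows[0]),
        xmin + int(cols[-1]) + 1,
        ymin + int(rows[-1]) + 1,
    ]


# Synthetic example: a 10x10 mask with "text" at rows 4-5, columns 6-8.
mask = np.zeros((10, 10), dtype=np.uint8)
mask[4:6, 6:9] = 255
tight = shrink_box(mask, [5.0, 2.0, 10.0, 8.0])
print(tight)  # [6, 4, 9, 6]
```

With the real image, the same function could be called as shrink_box(gray, item['box']) for each entry of data["bounding_boxes"]. If grid lines of the table fall inside a cell's box, they would also count as nonzero pixels and keep the box from shrinking, so trimming a pixel or two from each side before cropping may be necessary.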