Why are MaskRCNN's segm IoU metrics all 0?

Problem description

When training MaskRCNN on my custom multi-class instance segmentation dataset, the input for one image looks like this:

image   -)  shape: torch.Size([3, 850, 600]),   dtype: torch.float32, min: tensor(0.0431),               max: tensor(0.9137)
boxes   -)  shape: torch.Size([4, 4]),          dtype: torch.float32, min: tensor(47.),                  max: tensor(807.)
masks   -)  shape: torch.Size([850, 600, 600]), dtype: torch.uint8,   min: tensor(0, dtype=torch.uint8), max: tensor(1, dtype=torch.uint8)
areas   -)  shape: torch.Size([4]),             dtype: torch.float32, min: tensor(1479.),                max: tensor(8014.)
labels  -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(1),                    max: tensor(1)
iscrowd -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(0),                    max: tensor(0)

I consistently get 0 for every segmentation (segm) IoU metric, as shown below:

DONE (t=0.03s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.004
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.004
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
IoU metric: segm
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

How should I reason about, debug, and fix this problem?

Tags: pytorch, torch

Solution


Your input image has size (850, 600) (H, W), and this image contains 4 objects, not 850 objects with (600, 600) masks. The mask tensor should have dimensions (number of objects, 850, 600), so the input should be:

image   -)  shape: torch.Size([3, 850, 600]),   dtype: torch.float32, min: tensor(0.0431),               max: tensor(0.9137)
boxes   -)  shape: torch.Size([4, 4]),          dtype: torch.float32, min: tensor(47.),                  max: tensor(807.)
masks   -)  shape: torch.Size([4, 850, 600]),   dtype: torch.uint8,   min: tensor(0, dtype=torch.uint8), max: tensor(1, dtype=torch.uint8)
areas   -)  shape: torch.Size([4]),             dtype: torch.float32, min: tensor(1479.),                max: tensor(8014.)
labels  -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(1),                    max: tensor(1)
iscrowd -)  shape: torch.Size([4]),             dtype: torch.int64,   min: tensor(0),                    max: tensor(0)
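
One way to catch this kind of layout mistake before training is to compare each target against its image. The sketch below is only an illustration: `check_target` is a hypothetical helper (not part of torchvision), it assumes the target is a dict with `boxes`, `labels`, and `masks` keys, and the tensors are scaled down from the sizes in the question to keep the example light.

```python
import torch


def check_target(image, target):
    """Return a list of shape problems found in a Mask R-CNN style target.

    Hypothetical helper: assumes image is (C, H, W) and target is a dict
    with "boxes" (N, 4), "labels" (N,), and "masks" (N, H, W).
    """
    problems = []
    _, height, width = image.shape
    boxes = target["boxes"]
    if boxes.ndim != 2 or boxes.shape[1] != 4:
        problems.append(f"boxes should be (N, 4), got {tuple(boxes.shape)}")
        return problems
    num_objects = boxes.shape[0]

    expected_masks = (num_objects, height, width)
    if tuple(target["masks"].shape) != expected_masks:
        problems.append(
            f"masks should be {expected_masks}, got {tuple(target['masks'].shape)}"
        )
    if tuple(target["labels"].shape) != (num_objects,):
        problems.append(
            f"labels should be ({num_objects},), got {tuple(target['labels'].shape)}"
        )
    return problems


# Scaled-down stand-ins for the tensors in the question (850x600 -> 85x60).
image = torch.rand(3, 85, 60)
bad_target = {
    "boxes": torch.zeros(4, 4),
    "labels": torch.ones(4, dtype=torch.int64),
    "masks": torch.zeros(85, 60, 60, dtype=torch.uint8),  # wrong layout
}
good_target = dict(bad_target, masks=torch.zeros(4, 85, 60, dtype=torch.uint8))

bad_problems = check_target(image, bad_target)
good_problems = check_target(image, good_target)
print(bad_problems)
print(good_problems)
```

Running a check like this on a few samples from the dataset is cheap and points at a mismatched first dimension directly, instead of leaving it to surface as a zero metric.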

How to fix it: because you are solving an instance segmentation problem, make sure the per-object (850, 600) masks are stacked into a single tensor of shape (number of masks, 850, 600).
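
The stacking step could look roughly like the following. This is a minimal sketch that assumes you already have one binary (H, W) mask per object; the rectangular masks and the name `per_object_masks` are made up for illustration and would come from your own dataset code.

```python
import torch

height, width = 850, 600

# Hypothetical per-object binary masks: one (H, W) uint8 tensor per instance.
per_object_masks = []
for i in range(4):
    mask = torch.zeros(height, width, dtype=torch.uint8)
    mask[100 * i + 50 : 100 * i + 120, 200:300] = 1  # a 70 x 100 rectangle
    per_object_masks.append(mask)

# Stack along a new leading dimension so that dim 0 indexes the objects.
masks = torch.stack(per_object_masks, dim=0)

# Per-object pixel area, computed from the stacked masks.
areas = masks.flatten(1).sum(dim=1).to(torch.float32)

print(masks.shape)  # torch.Size([4, 850, 600])
print(areas)
```

With the objects on the leading dimension, `masks.shape[0]` matches the number of rows in `boxes` and the length of `labels`, which is the layout shown in the corrected input above.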

