Clarification on R-CNN

Problem description

I am learning to use R-CNN for object detection...

I have images and annotation files that give the bounding boxes of the objects.

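I understand these steps in R-CNN:

  1. Use selective search to get the proposed regions.
  2. Resize all regions to the same size.
  3. Feed these images into a CNN.
  4. Save the feature maps and feed them to an SVM for classification.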


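In training, I feed all the objects (only the objects from the images, not the background) into the CNN, and then train an SVM on the feature maps for classification.

Every blog says that R-CNN has three parts: 1st, selective search; 2nd, CNN; 3rd, BBox regression.

However, I have not found an in-depth explanation of BBox regression.

I understand IoU (Intersection over Union) as a way to check the accuracy of a BBox.

Could you help me understand how this BBox regression is used to get the coordinates of the object?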

Tags: python, tensorflow, keras, conv-neural-network

Solution


BBox regression works as described below.

As you mentioned, R-CNN works in multiple stages.

  1. Selective Search:

    1. Generate an initial sub-segmentation of the image, which yields many small candidate regions.
    2. Use a greedy algorithm to recursively combine similar regions into larger ones.
    3. Use the resulting regions to produce the final candidate region proposals.
  2. CNN and BBox Regression:

    The regressor is a CNN with convolutional layers and fully connected layers, but its last fully connected layer does not apply sigmoid or softmax. Those activations are typical for classification, where the outputs correspond to probabilities. Instead, this CNN outputs four values (x, y, h, w), where (x, y) specify the position of the left corner of the window and (h, w) its height and width. To train this network, the loss function penalizes outputs that are far from the labelled (x, y, h, w) in the training set.
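To make the regression idea concrete, here is a minimal pure-Python sketch. It is not the exact R-CNN implementation. The names `linear_head`, `mse_loss` and `iou` are hypothetical helpers, the box values are made up, and mean squared error is only one possible choice of loss (the explanation above just says the loss penalizes large differences). The `iou` helper assumes boxes in the same (x, y, h, w) layout, so you can check a predicted box with the metric you already know.

```python
def linear_head(features, weights, bias):
    """Final layer of a hypothetical regressor: four raw outputs, no sigmoid/softmax."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def mse_loss(predicted, labelled):
    """Mean squared error over the four box values (x, y, h, w)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, labelled)) / len(labelled)


def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x, y, h, w)."""
    ax, ay, ah, aw = box_a
    bx, by, bh, bw = box_b
    # Overlap along each axis, clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = ah * aw + bh * bw - inter
    return inter / union if union > 0 else 0.0


labelled = (10.0, 20.0, 50.0, 80.0)    # made-up ground-truth box
predicted = (12.0, 20.0, 50.0, 80.0)   # made-up network output
print(mse_loss(predicted, labelled))   # grows as the prediction drifts away
print(iou(predicted, labelled))        # approaches 1.0 as the boxes align
```

In Keras, the same idea would roughly correspond to a final `Dense(4)` layer with a linear activation, trained with a regression loss instead of cross-entropy.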

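Going back to the selective search stage, the greedy merging step can be sketched as a toy example. This is only an illustration under simplifying assumptions: regions are hypothetical dicts with a `box` as (x1, y1, x2, y2), a `mean` intensity and a pixel `size`, and similarity is just the difference in mean intensity. Real selective search combines several similarity measures and only merges neighbouring regions, which this sketch ignores.

```python
from itertools import combinations


def merge(a, b):
    """Merge two regions: union bounding box, size-weighted mean intensity."""
    size = a["size"] + b["size"]
    return {
        "box": (min(a["box"][0], b["box"][0]), min(a["box"][1], b["box"][1]),
                max(a["box"][2], b["box"][2]), max(a["box"][3], b["box"][3])),
        "mean": (a["mean"] * a["size"] + b["mean"] * b["size"]) / size,
        "size": size,
    }


def similarity(a, b):
    """Toy similarity: closer mean intensity means more similar."""
    return -abs(a["mean"] - b["mean"])


def greedy_proposals(regions):
    """Repeatedly merge the most similar pair; every region seen becomes a proposal."""
    regions = list(regions)
    proposals = [r["box"] for r in regions]
    while len(regions) > 1:
        # Pick the pair of current regions with the highest similarity.
        i, j = max(combinations(range(len(regions)), 2),
                   key=lambda p: similarity(regions[p[0]], regions[p[1]]))
        merged = merge(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged["box"])
    return proposals
```

Each proposal box would then be cropped, resized to a fixed size, and passed to the CNN.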
