YOLOv3 detects nothing, but YOLOv2 works fine

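Problem description

The purpose of my application is to detect people. When I load the YOLOv2 weights and config, everything works fine. When I load the YOLOv3 weights and config, the net returns 0 for every class confidence on every bounding box. I tried both the regular YOLOv3-416 and YOLOv3-tiny from https://pjreddie.com/darknet/yolo/. As far as I know, the required input and output are the same for YOLOv2 and YOLOv3. Please help me find what I am doing wrong that keeps YOLOv3 from working. I am using OpenCV 4.01 with the Java wrapper, on CPU only. I searched for similar issues but found nothing comparable.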

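import java.util.ArrayList;
import java.util.List;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.opencv.core.Mat;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.core.Size;
import org.opencv.dnn.Dnn;
import org.opencv.dnn.Net;

// StopWatch appears to be the asker's own timing helper; its source is not shown.
public class YoloAnalizer {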
private Net net;
private StopWatch stopWatch = new StopWatch();
private Logger logger = LogManager.getLogger();

private final double threshold = 0.5;
private final double scaleFactor = 1.0 / 255.000;
private final Size imageSize = new Size(416, 416);
private final Scalar mean = new Scalar(0,0,0);
private final boolean swapRB = true;
private final boolean crop = false;

private final String[] classes = new String[] {"person", "bicycle", "car", "motorcycle",
                                             "airplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant",
                                             "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse",
                                             "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack",
                                             "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis",
                                             "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard",
                                             "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife",
                                             "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog",
                                             "pizza", "donut", "cake", "chair", "couch", "potted plant", "bed", "dining table",
                                             "toilet", "tv", "laptop", "mouse", "remote", "keyboard",
                                             "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator",
                                             "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"};

public YoloAnalizer(String pathToYoloDarknetConfig, String pathToYoloDarknetWeights) {
    net = Dnn.readNetFromDarknet(pathToYoloDarknetConfig, pathToYoloDarknetWeights);
}

public List<Rect> AnalizeImage(Mat image) {
    logger.debug("Starting analisic image using yolo");
    stopWatch.StartTime();
    Mat blob = Dnn.blobFromImage(image, scaleFactor, imageSize, mean, swapRB, crop);
    net.setInput(blob);

    Mat prediction = net.forward();
    List<Rect> rects = ConvertPredictionToRoundingBox(prediction, image);
    logger.debug(String.format("Analising frame took: %s", stopWatch.GetElapsedMiliseconds()));
    return rects;
}

private List<Rect> ConvertPredictionToRoundingBox(Mat prediction, Mat image) {
    List<Rect> listOfPredictedObjects = new ArrayList<>();
    for (int i = 0; i < prediction.size().height; i++) {
        float[] row = new float[85];
        prediction.get(i, 0, row);

        float confidenceOnBox = row[4];
        int predictedClassConfidence = getTableIndexWithMaxValue(row, 5);
        double score = confidenceOnBox * row[predictedClassConfidence];
        if (score > threshold) {
            double x_center   = row[0] * image.width();
            double y_center   = row[1] * image.height();
            double width = row[2] * image.width();
            double height = row[3] * image.height();

            double left  = x_center - width * 0.5;
            double top  = y_center - height * 0.5;

            listOfPredictedObjects.add(new Rect((int)left, (int)top, (int)width, (int)height));
            logger.info(String.format("Found %s(%s) with confidence %s", classes[predictedClassConfidence-5],predictedClassConfidence, score));
        }
    }
    return listOfPredictedObjects;
}

private int getTableIndexWithMaxValue(float[] array, int startFrom) {
    double maxValue = -1;
    int maxIndex = -1;
    for (int i = startFrom; i < array.length; i++) {
        if (maxValue < array[i]) {
            maxIndex = i;
            maxValue = array[i];
        }
    }
    return maxIndex;
}

}

Tags: java, opencv, yolo

Solution


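Here is what I found in v3:

In the function fill_truth_region,

the truth table is created in the format "1-classes-xywh", i.e. each entry in the truth table holds 1 + number of classes + 4 values.

In forward_yolo_layer, however, the ground-truth box appears to be read as x, y, w, h from the start of each entry.

I think if you change this line in forward_yolo_layer: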

box truth=float_to_box(net.truth + t * 5 + b * l.truths, 1);

to this:

box truth=float_to_box(net.truth + t * (5+l.classes) + b * l.truths + l.classes+1, 1);

you will then get a truth box with the correct x, y, w, h.
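
The index arithmetic behind that change can be sketched in plain Java. This is only an illustration that assumes the "1-classes-xywh" layout described above; TruthLayout, entrySize, boxOffset, readBox and truthsPerImage are hypothetical names (truthsPerImage stands in for l.truths) and none of them are part of darknet.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; none of these names exist in darknet.
final class TruthLayout {

    // Floats per ground-truth entry in the assumed "1-classes-xywh" layout:
    // one leading value, one slot per class, then x, y, w, h.
    static int entrySize(int classes) {
        return 1 + classes + 4;
    }

    // Index of x for object t of image b in a flat truth buffer,
    // mirroring the pointer arithmetic of the proposed change.
    static int boxOffset(int b, int t, int classes, int truthsPerImage) {
        return b * truthsPerImage + t * entrySize(classes) + classes + 1;
    }

    // Copies x, y, w, h for object t of image b.
    static float[] readBox(float[] truth, int b, int t, int classes, int truthsPerImage) {
        int start = boxOffset(b, t, classes, truthsPerImage);
        return Arrays.copyOfRange(truth, start, start + 4);
    }

    public static void main(String[] args) {
        int classes = 2;
        int truthsPerImage = 3 * entrySize(classes); // room for 3 objects per image
        float[] truth = new float[truthsPerImage];

        // Fill the second object (t = 1) of the first image (b = 0).
        int base = 1 * entrySize(classes);
        truth[base] = 1f;       // leading value
        truth[base + 2] = 1f;   // slot for class 1
        truth[base + 3] = 0.5f; // x
        truth[base + 4] = 0.4f; // y
        truth[base + 5] = 0.2f; // w
        truth[base + 6] = 0.1f; // h

        System.out.println(Arrays.toString(readBox(truth, 0, 1, classes, truthsPerImage)));
    }
}
```

With 80 classes, entrySize gives 85 values per entry, so reading with a stride of 5 from the start of each entry would pick up the leading value and class slots instead of the box, which is the mismatch the answer points at.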

