Yolov3-tiny inference on Raspberry Pi with NCS2 draws a large number of random boxes


I followed the tutorial on the official website and set up an OpenVino environment on Windows 10, using the R1 release of the toolkit. I downloaded yolov3-tiny.weights (trained on the COCO dataset) and converted the TensorFlow model to an IR model on Windows without any errors (TensorFlow version 1.12.0). However, when I loaded the resulting .bin and .xml files on a Raspberry Pi with an NCS2 and ran inference on a single picture, a large number of boxes were drawn on the picture in no recognizable pattern. I've spent a lot of time looking for the cause, and I really don't know where the problem lies.

The test program on the Raspberry Pi is as follows:

import cv2 as cv
frame = cv.imread('ch2_1.jpg')
for detection in out.reshape(-1,7):
    if confidence>0.5:

Dear hongwang, cao,

First, rather than writing your own code, why not try our Yolo V3 sample? It also works with Tiny Yolo V3.

Also please read this; messed-up detection boxes are a common problem with darknet:

This is likely an image-preprocessing issue. What did the Model Optimizer log say the input shape was? 416x416 for W and H? What size image was passed in? When you run the Model Optimizer command, you have these options. You have to understand how Yolo V3 Tiny was trained in the first place (this has nothing to do with OpenVino) to understand the pre-processing which needs to happen. There is no way for Model Optimizer to know these settings on its own; you must pass them to Model Optimizer yourself.

--input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.


--scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.




--reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.


--mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.

--scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
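
To make the arithmetic behind these options concrete, here is a minimal pure-Python sketch of what happens to a single input pixel once mean values, scale values and channel reversal are baked into the IR. The function name normalize_pixel is hypothetical and exists only for illustration. The ordering shown (reverse the channels first, then subtract the mean and divide by the scale in the original model's channel order) is my reading of the help text above, so treat it as an assumption and verify it against your own model.

```python
def normalize_pixel(pixel,
                    mean_values=(0.0, 0.0, 0.0),
                    scale_values=(1.0, 1.0, 1.0),
                    reverse_input_channels=False):
    """Mimic, for one 3-channel pixel, the input arithmetic of a converted model.

    mean_values and scale_values are given in the channel order the
    original model was trained with (assumed here to be R, G, B).
    """
    if reverse_input_channels:
        # e.g. a BGR pixel from OpenCV becomes RGB before normalization
        pixel = pixel[::-1]
    return tuple((value - mean) / scale
                 for value, mean, scale in zip(pixel, mean_values, scale_values))


# A BGR pixel (blue=255, green=128, red=0) fed to a model trained on RGB in [0, 1]
result = normalize_pixel((255, 128, 0),
                         scale_values=(255.0, 255.0, 255.0),
                         reverse_input_channels=True)
```

If the frozen graph already divides by 255 internally, passing a scale again would normalize twice, which is one plausible way to end up with meaningless boxes.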

Also please read this document:

Important Notes About Feeding Input Images
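
As a rough illustration of the layout side of feeding images, the sketch below converts an image stored as height x width x channels (the layout cv.imread returns) into the N,C,H,W layout the Inference Engine expects, and refuses images that have not been resized to the network input size. It uses plain nested lists so it runs without any dependencies; the helper name hwc_to_nchw and the default 416x416 size are assumptions for illustration. In real code this is typically done with cv.resize followed by a NumPy transpose((2, 0, 1)).

```python
def hwc_to_nchw(image, expected_hw=(416, 416)):
    """Return a 1 x C x H x W nested list built from an H x W x C nested list.

    Raises ValueError when the image size differs from expected_hw, because
    the image must be resized to the network input size beforehand.
    """
    height, width = len(image), len(image[0])
    if (height, width) != tuple(expected_hw):
        raise ValueError(
            "image is %dx%d (HxW) but the network expects %dx%d; resize it first"
            % (height, width, expected_hw[0], expected_hw[1]))
    channels = len(image[0][0])
    planes = [[[image[y][x][c] for x in range(width)]
               for y in range(height)]
              for c in range(channels)]
    return [planes]  # leading batch dimension N = 1


# Toy 2x2 image with 3 channels, checked against a 2x2 "network" input
toy_image = [[[1, 2, 3], [4, 5, 6]],
             [[7, 8, 9], [10, 11, 12]]]
blob = hwc_to_nchw(toy_image, expected_hw=(2, 2))
```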

Lastly, as the Model Optimizer document explains, you may have to change some of the values in yolo_v3_tiny.json. They are not set in stone, and you may have to tweak them. If you google darknet "mask" and "anchors", you will find answers on the Internet.
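
To show why the anchors and mask values matter, here is a minimal sketch of the usual darknet box decoding for one grid cell. If I read your snippet correctly, a loop over out.reshape(-1,7) assumes SSD-style detection rows, whereas a Yolo output layer normally holds per-cell values that still need decoding like this. The anchor list is the one commonly distributed with Tiny Yolo V3 for COCO and is an assumption here, as are the names decode_box and TINY_ANCHORS; check them against your own .cfg and .json. Depending on how the output layer was converted, the logistic function may already have been applied to the x and y values, in which case the sigmoid below must be skipped.

```python
import math

# Assumption: anchor pairs (width, height, in pixels at 416x416) commonly
# distributed with Tiny Yolo V3 for COCO. Verify against your own files.
TINY_ANCHORS = [10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(raw, col, row, grid, anchor_index,
               anchors=TINY_ANCHORS, net_size=416):
    """Decode one raw (tx, ty, tw, th) prediction into a normalized box.

    anchor_index is one entry of the output layer's "mask"; it selects the
    anchor pair this prediction is relative to.
    Returns (cx, cy, w, h) as fractions of the network input size.
    """
    tx, ty, tw, th = raw
    anchor_w = anchors[2 * anchor_index]
    anchor_h = anchors[2 * anchor_index + 1]
    cx = (col + sigmoid(tx)) / grid
    cy = (row + sigmoid(ty)) / grid
    w = math.exp(tw) * anchor_w / net_size
    h = math.exp(th) * anchor_h / net_size
    return cx, cy, w, h


# Centre cell of a 13x13 grid, decoded against the anchor at index 3 (81x82)
box = decode_box((0.0, 0.0, 0.0, 0.0), col=6, row=6, grid=13, anchor_index=3)
```

With a wrong mask or anchor list, the same raw values decode to boxes of the wrong size, which looks very much like random boxes on the picture.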

Thanks for using OpenVino and I hope it helps,


