Beginner
130 Views

Does Inference Engine support non-square input images?

Hi everyone,

 

Does anyone know how to apply a keep-aspect-ratio resize to the input of a Faster R-CNN model?

I converted a Faster R-CNN model with Model Optimizer. In the generated *.xml file, the input shape appears to be fixed:

<layer id="0" name="image_tensor" precision="FP32" type="Input">
            <output>
                <port id="0">
                    <dim>1</dim>
                    <dim>3</dim>
                    <dim>600</dim>
                    <dim>600</dim>
                </port>
            </output>
        </layer>

Running the Faster R-CNN demo, I get completely wrong detection results:

[ INFO ] Loading model to the device
[ INFO ] Create infer request
[ INFO ] Batch size is 1
[ INFO ] Start inference
[ INFO ] Processing output blobs
[0,1] element, prob = 0.00185027    (0,-2147483648)-(0,-2147483648) batch id : 0
[1,1] element, prob = 0.00121647    (0,-2147483648)-(0,-2147483648) batch id : 0
[2,1] element, prob = 0.0011118    (0,-2147483648)-(0,-2147483648) batch id : 0
[3,1] element, prob = 0.00109558    (0,-2147483648)-(0,-2147483648) batch id : 0
[4,1] element, prob = 0.000270341    (0,-2147483648)-(0,-2147483648) batch id : 0
[5,1] element, prob = 0.000115265    (0,-2147483648)-(0,-2147483648) batch id : 0
[6,1] element, prob = 0.000103648    (0,-2147483648)-(0,-2147483648) batch id : 0
[7,1] element, prob = 0.000100398    (0,-2147483648)-(0,-2147483648) batch id : 0
[8,1] element, prob = 9.22909e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[9,1] element, prob = 8.88288e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[10,1] element, prob = 6.75321e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[11,1] element, prob = 6.50572e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[12,1] element, prob = 5.01635e-05    (0,-2147483648)-(0,-2147483648) batch id : 0

I then reconverted the model with --input_shape [1,600,1024,3], but I still got wrong results. Moreover, my test images do not all have the same shape, so I cannot commit to one fixed input shape.

 

Best regards,

Zhang Chunyan

 

 

24 Replies
Beginner
20 Views

Dear Shubha,

 

I have solved the keep-aspect-ratio resize problem by preprocessing the data before feeding it to the Inference Engine input blob.

However, the recall in my tests drops compared to the TensorFlow demo. I compared all the output bounding boxes from the TF demo and the OpenVINO SSD demo: the coordinates of the detected boxes are identical for every image except one. I think the reason may be that this image contains many objects, while every other image has fewer than 10. For this image, the TF demo detected 52 objects but the OpenVINO SSD demo detected only 37. Is there a threshold in the OpenVINO SSD demo, or in the inference function, that limits the number of detections? I have gone through the SSD demo code and could not find one.

Thank you.

Best regards,

zhang chunyan

Beginner
20 Views

Dear Shubha,

 

The four attached text files contain the detection results for 2 images from the TF demo and the OpenVINO SSD demo. I'm sure the inputs are exactly the same.

For ScanTest00000003_03_tf.txt and ScanTest00000003_03_vino.txt, the detected bounding boxes are exactly the same, so I'm sure the OpenVINO demo does not reduce the detection accuracy of the trained model.

But for ScanTest00000003_01_tf.txt and ScanTest00000003_01_vino.txt, the OpenVINO SSD demo detects fewer boxes than the TF demo. The only explanation I can think of is that some threshold inside the OpenVINO Inference Engine limits the number of detections, or perhaps there is a bug in a for loop. I cannot step into the Infer() function of the Inference Engine, so I would appreciate it if you could check the code.

Thank you.

Best regards,

zhang chunyan

Employee
20 Views

Dear zhang, chunyan,

Actually, you can step through Infer(). All you have to do is build a Debug configuration of the open-source Inference Engine and step through the code. But the main reason you are suffering accuracy loss is most likely a wrong Model Optimizer command. Some pre-processing goes into your model before training, such as mean subtraction, scaling, and image resizing. Run mo_tf.py --help and you will see all of the pre-processing options. Most likely you didn't properly tell Model Optimizer about the pre-processing used when your model was trained.
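As an illustration only, a re-conversion command passing that pre-processing might look like the sketch below. Every path and every numeric value here is a placeholder; substitute the mean/scale values and the pipeline config that were actually used to train your model:

```shell
# Hypothetical mo_tf.py invocation -- paths and values are placeholders.
python3 mo_tf.py \
    --input_model frozen_inference_graph.pb \
    --tensorflow_object_detection_api_pipeline_config pipeline.config \
    --input_shape [1,600,600,3] \
    --mean_values [127.5,127.5,127.5] \
    --scale_values [127.5] \
    --reverse_input_channels
```

The --mean_values/--scale_values pair must match the training-time normalization, and --reverse_input_channels is only needed if the model was trained on RGB while your demo feeds BGR.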

Also, we have a great deal of documentation under C:\Program Files (x86)\IntelSWTools\openvino_2019.2.275\deployment_tools\open_model_zoo\models\public for many public models. We also document the mean and scale values for many TensorFlow models on the Model Optimizer TensorFlow page.

I suspect that you're not telling Model Optimizer about your pre-processing, which is why you're getting bad results on your custom-trained model.

Hope it helps,

Thanks,

Shubha

Beginner
20 Views

Dear Shubha,

 

I'm sure there's no problem with my pre-processing; maybe you misunderstood me. Almost all my results are identical, even the confidences of the TF demo and the OpenVINO demo: as you can see from the *.txt files, the confidences match to four decimal places. Only the results for one image (the one with more than 50 objects) are not fully correct. That's why I suspect some truncation is happening.

 

Thank you.

Best regards,

zhang chunyan

Employee
20 Views

Dear zhang, chunyan,

OK, thanks for clarifying; I understand your problem better now. Are you getting these results from running the object_detection_sample_ssd sample? If so, keep in mind that it is just a sample: it's not intended for production use, it's meant as a tutorial. That said, in the sample I see this code:

        const SizeVector outputDims = outputInfo->getTensorDesc().getDims();

        const int maxProposalCount = outputDims[2];
        const int objectSize = outputDims[3];

 

Later I see this code:

 slog::info << "Processing output blobs" << slog::endl;

        const Blob::Ptr output_blob = infer_request.GetBlob(outputName);
        const float* detection = static_cast<PrecisionTrait<Precision::FP32>::value_type*>(output_blob->buffer());

        std::vector<std::vector<int> > boxes(batchSize);
        std::vector<std::vector<int> > classes(batchSize);

        /* Each detection has image_id that denotes processed image */
        for (int curProposal = 0; curProposal < maxProposalCount; curProposal++) {
            auto image_id = static_cast<int>(detection[curProposal * objectSize + 0]);
            if (image_id < 0) {
                break;
            }

            float confidence = detection[curProposal * objectSize + 2];
            auto label = static_cast<int>(detection[curProposal * objectSize + 1]);
            auto xmin = static_cast<int>(detection[curProposal * objectSize + 3] * imageWidths[image_id]);
            auto ymin = static_cast<int>(detection[curProposal * objectSize + 4] * imageHeights[image_id]);
            auto xmax = static_cast<int>(detection[curProposal * objectSize + 5] * imageWidths[image_id]);
            auto ymax = static_cast<int>(detection[curProposal * objectSize + 6] * imageHeights[image_id]);

            std::cout << "[" << curProposal << "," << label << "] element, prob = " << confidence <<
                "    (" << xmin << "," << ymin << ")-(" << xmax << "," << ymax << ")" << " batch id : " << image_id;

            if (confidence > 0.5) {
                /** Drawing only objects with >50% probability **/
                classes[image_id].push_back(label);
                boxes[image_id].push_back(xmin);
                boxes[image_id].push_back(ymin);
                boxes[image_id].push_back(xmax - xmin);
                boxes[image_id].push_back(ymax - ymin);
                std::cout << " WILL BE PRINTED!";
            }
            std::cout << std::endl;
        }

 

The only threshold I see is the confidence > 0.5 check, and your detections well exceed that, so it shouldn't be the problem. The key question is what is happening here:

const int maxProposalCount = outputDims[2];

And here:

const Blob::Ptr output_blob = infer_request.GetBlob(outputName);
const float* detection = static_cast<PrecisionTrait<Precision::FP32>::value_type*>(output_blob->buffer());

Without debugging the code (stepping through it in a debugger), it would be very hard to figure out. I doubt that the Inference Engine places an artificial constraint on the number of detections, though.

Hope it helps,

Shubha
