Beginner
130 Views

Does Inference Engine support non-square input images?

Hi everyone,

 

Does anyone know how to apply a keep-aspect-ratio resize to the input of a Faster R-CNN model?

I converted a Faster R-CNN model with Model Optimizer. In the generated *.xml file, the input shape appears to be fixed:

<layer id="0" name="image_tensor" precision="FP32" type="Input">
            <output>
                <port id="0">
                    <dim>1</dim>
                    <dim>3</dim>
                    <dim>600</dim>
                    <dim>600</dim>
                </port>
            </output>
        </layer>

Running the Faster R-CNN demo, I get completely wrong detection results:

[ INFO ] Loading model to the device
[ INFO ] Create infer request
[ INFO ] Batch size is 1
[ INFO ] Start inference
[ INFO ] Processing output blobs
[0,1] element, prob = 0.00185027    (0,-2147483648)-(0,-2147483648) batch id : 0
[1,1] element, prob = 0.00121647    (0,-2147483648)-(0,-2147483648) batch id : 0
[2,1] element, prob = 0.0011118    (0,-2147483648)-(0,-2147483648) batch id : 0
[3,1] element, prob = 0.00109558    (0,-2147483648)-(0,-2147483648) batch id : 0
[4,1] element, prob = 0.000270341    (0,-2147483648)-(0,-2147483648) batch id : 0
[5,1] element, prob = 0.000115265    (0,-2147483648)-(0,-2147483648) batch id : 0
[6,1] element, prob = 0.000103648    (0,-2147483648)-(0,-2147483648) batch id : 0
[7,1] element, prob = 0.000100398    (0,-2147483648)-(0,-2147483648) batch id : 0
[8,1] element, prob = 9.22909e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[9,1] element, prob = 8.88288e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[10,1] element, prob = 6.75321e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[11,1] element, prob = 6.50572e-05    (0,-2147483648)-(0,-2147483648) batch id : 0
[12,1] element, prob = 5.01635e-05    (0,-2147483648)-(0,-2147483648) batch id : 0

I then reconverted the model with --input_shape [1,600,1024,3], but I still got wrong results. Moreover, my test images do not all have the same shape, so I cannot commit to one fixed input shape.

 

Best regards,

Zhang Chunyan

 

 

24 Replies
Beginner
20 Views

Dear Shubha,

 

I have solved the keep-aspect-ratio resize problem by preprocessing the data before feeding it to the Inference Engine input blob.

However, the recall in my tests drops compared to the TensorFlow demo. I compared all the output bounding boxes from the TF demo and the OpenVINO SSD demo: the coordinates of the detected boxes are identical for every image except one. I think the reason may be that this image contains many objects, while every other image has fewer than 10. For this image, the TF demo detected 52 objects but the OpenVINO SSD demo detected only 37. Is there a threshold in the OpenVINO SSD demo, or in the inference function, that limits the number of detections? I have gone through the SSD demo code and could not find one.

Thank you.

Best regards,

zhang chunyan

Beginner
20 Views

Dear Shubha,

 

The four attached text files contain the detection results for 2 images from the TF demo and the OpenVINO SSD demo. I'm sure the inputs are exactly the same.

For ScanTest00000003_03_tf.txt and ScanTest00000003_03_vino.txt, the detected bounding boxes are exactly the same, so I'm sure the OpenVINO demo does not reduce the detection accuracy of the trained model.

But for ScanTest00000003_01_tf.txt and ScanTest00000003_01_vino.txt, the OpenVINO SSD demo detects fewer boxes than the TF demo. The only explanation I can think of is that some threshold inside the OpenVINO Inference Engine limits the number of detections, or perhaps there is a bug in a for loop. I cannot step into the Infer() function of the Inference Engine, so I would appreciate it if you could check the code.

Thank you.

Best regards,

zhang chunyan

Employee
20 Views

Dear zhang, chunyan,

Actually, you can step through Infer(). All you have to do is build a Debug configuration of the open-source Inference Engine and step through the code. But the main reason you are suffering accuracy loss is most likely a wrong Model Optimizer command. Some pre-processing goes into your model before training, such as mean subtraction, scaling, and image resizing. Run mo_tf.py --help and you will see all of the pre-processing options. Most likely you didn't properly tell Model Optimizer about the pre-processing used when your model was trained.
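As an illustration only, a re-conversion command passing that pre-processing might look like the sketch below. Every path and every numeric value here is a placeholder; substitute the mean/scale values and the pipeline config that were actually used to train your model:

```shell
# Hypothetical mo_tf.py invocation -- paths and values are placeholders.
python3 mo_tf.py \
    --input_model frozen_inference_graph.pb \
    --tensorflow_object_detection_api_pipeline_config pipeline.config \
    --input_shape [1,600,600,3] \
    --mean_values [127.5,127.5,127.5] \
    --scale_values [127.5] \
    --reverse_input_channels
```

The --mean_values/--scale_values pair must match the training-time normalization, and --reverse_input_channels is only needed if the model was trained on RGB while your demo feeds BGR.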

Also, we have a great deal of documentation under C:\Program Files (x86)\IntelSWTools\openvino_2019.2.275\deployment_tools\open_model_zoo\models\public for many public models. We also document the mean and scale values for many TensorFlow models on the Model Optimizer TensorFlow page.

I suspect that you're not telling Model Optimizer about your pre-processing, which is why you're getting bad results on your custom-trained model.

Hope it helps,

Thanks,

Shubha

Beginner
20 Views

Dear Shubha,

 

I'm sure there's no problem with my pre-processing; maybe you misunderstood me. Almost all my results are identical, even the confidences of the TF demo and the OpenVINO demo: as you can see from the *.txt files, the confidences match to four decimal places. Only the results for one image (the one with more than 50 objects) are not fully correct. That's why I suspect some truncation is happening.

 

Thank you.

Best regards,

zhang chunyan

Employee
20 Views

Dear zhang, chunyan,

OK, thanks for clarifying; I understand your problem better now. Are you getting these results from running the object_detection_sample_ssd sample? If so, keep in mind that it is just a sample: it's not intended for production use, it's meant as a tutorial. That said, in the sample I see this code:

        const SizeVector outputDims = outputInfo->getTensorDesc().getDims();

        const int maxProposalCount = outputDims[2];
        const int objectSize = outputDims[3];

 

Later I see this code:

 slog::info << "Processing output blobs" << slog::endl;

        const Blob::Ptr output_blob = infer_request.GetBlob(outputName);
        const float* detection = static_cast<PrecisionTrait<Precision::FP32>::value_type*>(output_blob->buffer());

        std::vector<std::vector<int> > boxes(batchSize);
        std::vector<std::vector<int> > classes(batchSize);

        /* Each detection has image_id that denotes processed image */
        for (int curProposal = 0; curProposal < maxProposalCount; curProposal++) {
            auto image_id = static_cast<int>(detection[curProposal * objectSize + 0]);
            if (image_id < 0) {
                break;
            }

            float confidence = detection[curProposal * objectSize + 2];
            auto label = static_cast<int>(detection[curProposal * objectSize + 1]);
            auto xmin = static_cast<int>(detection[curProposal * objectSize + 3] * imageWidths[image_id]);
            auto ymin = static_cast<int>(detection[curProposal * objectSize + 4] * imageHeights[image_id]);
            auto xmax = static_cast<int>(detection[curProposal * objectSize + 5] * imageWidths[image_id]);
            auto ymax = static_cast<int>(detection[curProposal * objectSize + 6] * imageHeights[image_id]);

            std::cout << "[" << curProposal << "," << label << "] element, prob = " << confidence <<
                "    (" << xmin << "," << ymin << ")-(" << xmax << "," << ymax << ")" << " batch id : " << image_id;

            if (confidence > 0.5) {
                /** Drawing only objects with >50% probability **/
                classes[image_id].push_back(label);
                boxes[image_id].push_back(xmin);
                boxes[image_id].push_back(ymin);
                boxes[image_id].push_back(xmax - xmin);
                boxes[image_id].push_back(ymax - ymin);
                std::cout << " WILL BE PRINTED!";
            }
            std::cout << std::endl;
        }

 

The only threshold I see is the confidence > 0.5 check, and your detections well exceed that, so it shouldn't be the problem. The key question is what is happening here:

const int maxProposalCount = outputDims[2];

And here:

const Blob::Ptr output_blob = infer_request.GetBlob(outputName);
const float* detection = static_cast<PrecisionTrait<Precision::FP32>::value_type*>(output_blob->buffer());

Without debugging the code (stepping through it in a debugger), it would be very hard to figure out. I doubt that the Inference Engine places an artificial constraint on the number of detections, though.

Hope it helps,

Shubha
