Solved: Batching in void ParseYOLOV3Output(InferenceEngine::InferRequest::Ptr req..)

Patel · 06-29-2021

In this function from the Multi-Channel demos, the batch size is a configurable value for the model, and the inference FPS changes when the batch size is modified. But the post-processing function doesn't seem to handle batches of images. Or does it?

Can someone explain how ParseYOLOV3Output works, and whether it handles batching in this demo?

void ParseYOLOV3Output(InferenceEngine::InferRequest::Ptr req,
                       const std::string &outputName,
                       const YoloParams &yoloParams, const unsigned long resized_im_h,
                       const unsigned long resized_im_w, const unsigned long original_im_h,
                       const unsigned long original_im_w,
                       const double threshold, std::vector<DetectionObject> &objects) {
    InferenceEngine::Blob::Ptr blob = req->GetBlob(outputName);

    const int out_blob_h = static_cast<int>(blob->getTensorDesc().getDims()[2]);
    const int out_blob_w = static_cast<int>(blob->getTensorDesc().getDims()[3]);
    if (out_blob_h != out_blob_w)
        throw std::runtime_error("Invalid size of output. It should be in NCHW layout and H should be equal to W. Current H = " + std::to_string(out_blob_h) +
        ", current W = " + std::to_string(out_blob_w));

    auto num = yoloParams.num;
    auto coords = yoloParams.coords;
    auto classes = yoloParams.classes;

    auto anchors = yoloParams.anchors;

    auto side = out_blob_h;
    auto side_square = side * side;
    InferenceEngine::LockedMemory<const void> blobMapped = InferenceEngine::as<InferenceEngine::MemoryBlob>(blob)->rmap();
    const float *output_blob  = blobMapped.as<float *>();
    // --------------------------- Parsing YOLO Region output -------------------------------------
    for (int i = 0; i < side_square; ++i) {
        int row = i / side;
        int col = i % side;
        for (int n = 0; n < num; ++n) {
            int obj_index = EntryIndex(side, coords, classes, n * side * side + i, coords);
            int box_index = EntryIndex(side, coords, classes, n * side * side + i, 0);
            float scale = output_blob[obj_index];
            if (scale < threshold)
                continue;
            double x = (col + output_blob[box_index + 0 * side_square]) / side * resized_im_w;
            double y = (row + output_blob[box_index + 1 * side_square]) / side * resized_im_h;
            double height = std::exp(output_blob[box_index + 3 * side_square]) * anchors[2 * n + 1];
            double width = std::exp(output_blob[box_index + 2 * side_square]) * anchors[2 * n];
            for (int j = 0; j < classes; ++j) {
                int class_index = EntryIndex(side, coords, classes, n * side_square + i, coords + 1 + j);
                float prob = scale * output_blob[class_index];
                if (prob < threshold)
                    continue;
                DetectionObject obj(x, y, height, width, j, prob,
                        static_cast<float>(original_im_h) / static_cast<float>(resized_im_h),
                        static_cast<float>(original_im_w) / static_cast<float>(resized_im_w));
                objects.push_back(obj);
            }
        }
    }
}

Zulkifli_Intel · 07-01-2021

Hello Prince Patel,

Thank you for reaching out.

We are looking into how ParseYOLOV3Output works and will get back to you soon.

Sincerely,

Zulkifli

Zulkifli_Intel · 07-05-2021

Hello Prince Patel,

Regarding your first question, the inference does take the batch size from the blob, as shown in these lines:

InferenceEngine::Blob::Ptr blob = req->GetBlob(outputName);

const int out_blob_h = static_cast<int>(blob->getTensorDesc().getDims()[2]);

const int out_blob_w = static_cast<int>(blob->getTensorDesc().getDims()[3]);

Basically, before inference, the code prepares the input blob according to the configuration you provide, such as the NCHW layout.

For more information about batch size and input preparation, you can refer to this documentation on the integration step.

The ParseYOLOV3Output function post-processes the inference output to produce the detected objects.

Sincerely,

Zulkifli

Patel · 07-05-2021

Hello Zulkifli,

Thanks for the reply.

const int out_blob_n = static_cast<int>(blob->getTensorDesc().getDims()[0]);

Here, out_blob_n is the batch size.

Can you explain the section of code below, and how we can grab the objects for each batch index?

const float *output_blob  = blobMapped.as<float *>();
    // --------------------------- Parsing YOLO Region output -------------------------------------
    for (int i = 0; i < side_square; ++i) {
        int row = i / side;
        int col = i % side;
        for (int n = 0; n < num; ++n) {
            int obj_index = EntryIndex(side, coords, classes, n * side * side + i, coords);
            int box_index = EntryIndex(side, coords, classes, n * side * side + i, 0);
            float scale = output_blob[obj_index];
            if (scale < threshold)
                continue;
            double x = (col + output_blob[box_index + 0 * side_square]) / side * resized_im_w;
            double y = (row + output_blob[box_index + 1 * side_square]) / side * resized_im_h;
            double height = std::exp(output_blob[box_index + 3 * side_square]) * anchors[2 * n + 1];
            double width = std::exp(output_blob[box_index + 2 * side_square]) * anchors[2 * n];
            for (int j = 0; j < classes; ++j) {
                int class_index = EntryIndex(side, coords, classes, n * side_square + i, coords + 1 + j);
                float prob = scale * output_blob[class_index];
                if (prob < threshold)
                    continue;
                DetectionObject obj(x, y, height, width, j, prob,
                        static_cast<float>(original_im_h) / static_cast<float>(resized_im_h),
                        static_cast<float>(original_im_w) / static_cast<float>(resized_im_w));
                objects.push_back(obj);
            }
        }
    }
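
One possible way to get objects per batch index, sketched under the assumption that the output blob is a dense NCHW float buffer whose dims[0] is the batch size: each image's predictions then occupy a contiguous run of C * H * W floats, so the loop above could run unchanged on a pointer offset to the start of batch item b. BatchSlice is a hypothetical helper for illustration, not part of the demo or of the Inference Engine API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the demo): returns a pointer to the first
// float of batch item b in a dense NCHW buffer described by dims = {N, C, H, W}.
// Assumption: the buffer has no padding between batch items.
inline const float* BatchSlice(const float* data,
                               const std::vector<std::size_t>& dims,
                               std::size_t b) {
    assert(dims.size() == 4);  // expects NCHW
    assert(b < dims[0]);       // batch index must be in range
    const std::size_t per_image = dims[1] * dims[2] * dims[3];  // floats per image
    return data + b * per_image;
}
```

With something like this, the parsing loop could be wrapped in an outer loop over b from 0 to dims[0] - 1, reading from BatchSlice(output_blob, dims, b) and pushing detections into a separate vector for each b.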

Zulkifli_Intel · 07-08-2021

Hello Prince Patel,

This code was written by the demo developers. Basically, that portion of the code builds a grid based on the blob size and then examines each cell of the grid. For every box whose confidence exceeds the threshold, a DetectionObject is created that holds the box coordinates, the class id, and the confidence predicted by the YOLOv3 model. Boxes detected with high confidence are then drawn and displayed. This process continues until all the cells in the grid have been analyzed.

For more detail on how the box on each grid is selected, you can refer to this paper.
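
The per-cell arithmetic in the posted loop can be isolated as a small sketch. It assumes a square grid of side cells and one anchor pair per box, as in the code above. Box and DecodeCell are hypothetical names introduced for illustration; only the formulas are taken from the thread.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for one decoded box (centre x/y, width, height),
// expressed in pixels of the resized network input.
struct Box {
    double x;
    double y;
    double width;
    double height;
};

// Hypothetical helper mirroring the arithmetic of the posted loop:
// i is the flat cell index, tx/ty/tw/th are the raw network outputs for the box,
// and anchor_w/anchor_h correspond to anchors[2 * n] and anchors[2 * n + 1].
inline Box DecodeCell(int i, int side,
                      float tx, float ty, float tw, float th,
                      float anchor_w, float anchor_h,
                      double resized_w, double resized_h) {
    assert(side > 0);
    const int row = i / side;  // grid row of the cell
    const int col = i % side;  // grid column of the cell
    Box b;
    b.x = (col + static_cast<double>(tx)) / side * resized_w;
    b.y = (row + static_cast<double>(ty)) / side * resized_h;
    b.width = std::exp(static_cast<double>(tw)) * anchor_w;
    b.height = std::exp(static_cast<double>(th)) * anchor_h;
    return b;
}
```

The centre offsets are added to the cell position and scaled to the resized image, while width and height come from the exponentiated outputs multiplied by the anchor dimensions.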

Sincerely,

Zulkifli

Zulkifli_Intel · 07-21-2021

Hello Prince Patel,

This thread will no longer be monitored since we have provided the information. If you need any additional information from Intel, please submit a new question.

Sincerely,

Zulkifli

Patel · 07-22-2021

There is a way to do batching in OpenVINO: ParseYOLOV3Output has to be changed along the lines of the following Darknet code.

https://github.com/pjreddie/darknet/blob/master/src/yolo_layer.c#L125

https://github.com/pjreddie/darknet/blob/master/src/yolo_layer.c#L316
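
Darknet's entry_index takes a batch argument and adds batch * l.outputs to the offset it computes. A minimal sketch of the same idea for the demo's index computation follows. It assumes a dense NCHW output of shape [N, num * (coords + 1 + classes), side, side]. EntryIndexBatched and its extra num and batch parameters are hypothetical and not part of the OpenVINO demo.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical batch-aware variant of the demo's EntryIndex helper.
// location = n * side * side + cell, where n is the anchor index;
// entry selects the channel inside one anchor (0..coords-1 box, coords objectness,
// coords + 1 + j class j).
inline std::size_t EntryIndexBatched(int side, int coords, int classes, int num,
                                     int batch, int location, int entry) {
    assert(side > 0 && num > 0 && batch >= 0 && location >= 0 && entry >= 0);
    const std::size_t side_square = static_cast<std::size_t>(side) * side;
    const std::size_t n = location / side_square;    // anchor index
    const std::size_t loc = location % side_square;  // cell index inside the grid
    const std::size_t per_anchor = side_square * (coords + classes + 1);
    const std::size_t per_image = per_anchor * num;  // floats per batch item
    return batch * per_image + n * per_anchor + entry * side_square + loc;
}
```

In the parsing loop, each EntryIndex call would then receive the current batch index, with an outer loop over dims[0] collecting the objects of every image separately. For a 13 x 13 grid with 3 anchors and 80 classes, one image spans 3 * 85 * 169 = 43095 floats.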

Zulkifli_Intel · 07-30-2021

Hello Prince Patel,

Thank you for sharing your solution.

Sincerely,

Zulkifli
