Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Trouble Understanding what a Blob is/how to extract info from a blob

dominicfmazza
Beginner

Hello,

I am attempting to use the OpenVINO toolkit to run inference on a custom-trained YoloV4 model. I am struggling to understand what a Blob is and how to extract detection data from one.

I trained the model using Darknet, then converted it to a TensorFlow saved model and subsequently to an IR using this repo: https://github.com/TNTWEN/OpenVINO-YOLOV4.

From there, all I have done so far is try to implement a basic working example following the tutorial here: https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_Integrate_with_customer_application_new_API.html

While debugging, I found that the inference output was 3 separate blobs of sizes 13x13x255, 26x26x255, and 52x52x255 respectively, filled with floats. Attached are the IR and the .cpp file that initializes the engine and runs the inference. I am struggling to understand why the outputs are so large, and I think I must have missed something, as I have no idea how to turn the output data into the detections I am looking for.

 

 

#include <ros/ros.h>
#include <cv_bridge/cv_bridge.h>
#include "inference_engine.hpp"
#include "image_transport/image_transport.h"
#include "sensor_msgs/image_encodings.h"
#include "sensor_msgs/Image.h"
#include "std_msgs/String.h"
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <string>

const std::string modelBin = "/home/mazzadom/vehicle/src/perception/camera/yolov4-openvino/frozen_darknet_yolov4_model.bin";
const std::string modelXml = "/home/mazzadom/vehicle/src/perception/camera/yolov4-openvino/frozen_darknet_yolov4_model.xml";
class YoloV4OpenVINO
{
  ros::NodeHandle mNh;
  image_transport::ImageTransport it;
  image_transport::Subscriber sub;
  ros::Publisher test_pub = mNh.advertise<std_msgs::String>("test_pub", 1000);
  InferenceEngine::Core core;
  InferenceEngine::CNNNetwork network;
  InferenceEngine::ExecutableNetwork executable_network;
  InferenceEngine::InputsDataMap input_info;
  InferenceEngine::OutputsDataMap output_info;
  InferenceEngine::ExecutableNetwork exec_network;


  public:
    template <typename T>
    void matU8ToBlob(const cv::Mat& orig_image, InferenceEngine::Blob::Ptr& blob, float scaleFactor = 1.0, int batchIndex = 0) {
        InferenceEngine::SizeVector blobSize = blob->getTensorDesc().getDims();
        const size_t width = blobSize[3];
        const size_t height = blobSize[2];
        const size_t channels = blobSize[1];
        T* blob_data = blob->buffer().as<T*>();

        cv::Mat resized_image(orig_image);
        if (width != orig_image.size().width || height!= orig_image.size().height) {
            cv::resize(orig_image, resized_image, cv::Size(width, height));
        }

        int batchOffset = batchIndex * width * height * channels;

        for (size_t c = 0; c < channels; c++) {
            for (size_t  h = 0; h < height; h++) {
                for (size_t w = 0; w < width; w++) {
                    blob_data[batchOffset + c * width * height + h * width + w] =
                        resized_image.at<cv::Vec3b>(h, w)[c] * scaleFactor;
                }
            }
        }
    }

    void imageCallback(const sensor_msgs::ImageConstPtr& msg)
    {
      cv_bridge::CvImagePtr mImage;
      try 
      {
        mImage = cv_bridge::toCvCopy(msg, sensor_msgs::image_encodings::RGB8);
        
      }
      catch (cv_bridge::Exception& e)
      {
        ROS_ERROR("cv_bridge exception: %s", e.what());
        return;
      }
      printf("Line 67");
      auto infer_request = exec_network.CreateInferRequest();
      const cv::Mat const_image = mImage->image;
      /** Iterate over all input blobs **/
      for (auto & item : input_info) {
          auto input_name = item.first;
          /** Get input blob **/
          InferenceEngine::Blob::Ptr input = infer_request.GetBlob(input_name);
          /** Fill input tensor with planes. First b channel, then g and r channels **/
          matU8ToBlob<uint8_t>(const_image, input, 1.0, 0);
      }
      printf("Line 78");
      infer_request.StartAsync();
      infer_request.Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
      printf("Line 81");
      for (auto &item : output_info) {
        auto output_name = item.first;
        auto output = infer_request.GetBlob(output_name);

        InferenceEngine::SizeVector blobSize = output->getTensorDesc().getDims();
        printf("\n %d %d %d \n", blobSize[3], blobSize[2], blobSize[1]);
        auto const memLocker = output->cbuffer(); // use const memory locker
        // output_buffer is valid as long as the lifetime of memLocker
        const float *output_buffer = memLocker.as<const float *>();
        /** output_buffer[] - accessing output blob data **/
        printf("%d", sizeof(output_buffer));
      }
    }

    YoloV4OpenVINO() : it(mNh)
    {
      ROS_INFO("HELLO");
      image_transport::TransportHints th("compressed");
      sub = it.subscribe("/top_sd_cam/image_raw", 1, &YoloV4OpenVINO::imageCallback, this, th);
      network = core.ReadNetwork(modelXml);
      input_info = network.getInputsInfo();
      output_info = network.getOutputsInfo();
      printf("Line 102");
      for (auto &item : input_info) {
        auto input_data = item.second;
        input_data->setPrecision(InferenceEngine::Precision::U8);
        input_data->setLayout(InferenceEngine::Layout::NHWC);
        input_data->getPreProcess().setResizeAlgorithm(InferenceEngine::RESIZE_BILINEAR);
        input_data->getPreProcess().setColorFormat(InferenceEngine::ColorFormat::RGB);
      }
      printf("Line 110");
      for (auto &item : output_info) {
        auto output_data = item.second;
        output_data->setPrecision(InferenceEngine::Precision::FP32);
        output_data->setLayout(InferenceEngine::Layout::ANY);
      }

      printf("Line 117");
      exec_network = core.LoadNetwork(network, "CPU");
      printf("Line 119");
      ROS_INFO("HELLO2");
    }   
};


int main(int argc, char** argv) {
  ros::init(argc,argv,"yolov4_openvino");
  YoloV4OpenVINO yv4;
  ros::spin();
  return 0;
}

 

 

5 Replies
Zulkifli_Intel
Moderator

Hello Dominic Mazza,

 

Thank you for contacting us,

 

A Blob is a bundle of data. In other machine learning frameworks, this is generally called a tensor or a multi-dimensional array. Please visit the Blob Documentation for more information.
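
As a minimal sketch (reusing the infer_request and output_name variables from your posted callback; the dims in the comment are just an example), this is how the contents of one output blob can be inspected:

// A Blob bundles a TensorDesc (precision, dims, layout) with the raw data buffer.
InferenceEngine::Blob::Ptr out = infer_request.GetBlob(output_name);

const InferenceEngine::SizeVector &dims = out->getTensorDesc().getDims();  // e.g. {1, 255, 13, 13}
size_t count = out->size();  // number of elements, not bytes

// Lock the memory before reading; the mapping stays valid while `holder` is alive.
auto mem = InferenceEngine::as<InferenceEngine::MemoryBlob>(out);
auto holder = mem->rmap();
const float *data = holder.as<const float *>();  // FP32, as configured with setPrecision()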

Try adding SetBlob() to your .cpp file, as it can perform resizing for an input blob that has been configured as resizable; we did not find any usage of SetBlob() in your current code.

Information Sharing

  • The SetBlob() method compares the precision and layout of an input blob with the ones defined in step 3 and throws an exception if they do not match. It also compares the size of the input blob with the input size of the read network. However, if the input was configured as resizable, you can set an input blob of any size (for example, any ROI blob); input resize will be invoked automatically using the resize algorithm configured in step 3. Similarly, the color format of an input blob is allowed to differ from the color format of the read network; color format conversion will be invoked automatically using the color format configured in step 3 (see the sketch below the article link).

 Article: https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_Integrate_with_customer_application_new_API.html
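
As a rough sketch of that flow (it would slot into your imageCallback in place of the GetBlob() + matU8ToBlob() copy; the only assumption is that there is exactly one input, so its name is taken from input_info):

// Wrap the cv::Mat data in a U8 blob without copying. Note that the dims are
// listed in NCHW order even though the layout is NHWC, per the IE convention.
cv::Mat frame = mImage->image;  // any size; the plugin resizes and converts color itself
InferenceEngine::TensorDesc desc(
    InferenceEngine::Precision::U8,
    {1, static_cast<size_t>(frame.channels()),
        static_cast<size_t>(frame.rows),
        static_cast<size_t>(frame.cols)},
    InferenceEngine::Layout::NHWC);
auto wrapped = InferenceEngine::make_shared_blob<uint8_t>(desc, frame.data);

auto infer_request = exec_network.CreateInferRequest();
infer_request.SetBlob(input_info.begin()->first, wrapped);  // single-input assumption
infer_request.Infer();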

 

Can you share your model and the output size of each blob that you used, so that we can investigate this issue further?

 

Regards,

Zulkifli

 

dominicfmazza
Beginner

Thank you so much for your response. Here is a .zip with both the .xml and .bin files from the Model Optimizer: https://gofile.io/d/vFsU61 (the files are too large to upload directly). Additionally, the network outputs three blobs for inference with sizes:

13x13x255

26x26x255

52x52x255

The underlying network is a YOLOv4 trained on 10 classes. I am not sure how you want me to implement SetBlob() in my code, so could you elaborate on that?

Zulkifli_Intel
Moderator

 

Hello Dominic

 

Pertaining to your concern that the blob output is large: it is due to the size of the .bin file of your model, which contributes to the .blob file size. As we can see, the .bin file for the model is 250 MB, and obtaining a large .blob file from such a model is expected, as it contains the weight and bias binary data.

For comparison, with an Intel pre-trained model (SqueezeNet), the sizes of the .bin and .blob files are likewise more or less the same.

Regards,

Zulkifli

 

 

Zulkifli_Intel
Moderator

Hello Dominic,


This thread will no longer be monitored since we have provided information on the blob. If you need any additional information from Intel, please submit a new question.


Regards,

Zulkifli


Zulkifli_Intel
Moderator

Hello Dominic Mazza,

 

We sincerely apologize for the oversight on our part regarding your query on how to implement SetBlob().


SetBlob() is used to supply the input/output data for an inference request. It can be added if you want to set additional input data, as it can perform resizing for an input that has been configured as resizable, as mentioned earlier in this post. Please take a look at this C++ sample code for reference on how to implement SetBlob().
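
For the output side, a rough sketch (using the output_info map and infer_request from your posted code; this is only one way to use SetBlob() for outputs) would be to pre-allocate a blob and register it, so the plugin writes the results directly into your buffer:

// Pre-allocate an FP32 output blob matching the request's own output description,
// register it with SetBlob(), and read it back after inference completes.
auto out_item = output_info.begin();  // one of the three YOLO outputs
auto outDesc = infer_request.GetBlob(out_item->first)->getTensorDesc();
auto outBlob = InferenceEngine::make_shared_blob<float>(outDesc);
outBlob->allocate();
infer_request.SetBlob(out_item->first, outBlob);
infer_request.Infer();
// outBlob->rmap() now exposes the results for this output.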


Regards,

Zulkifli

