Set precision of the input/output buffer of a DNN algorithm running on the Intel NCS2

alexis-nicole · ‎01-19-2022

I recently got an Intel Neural Compute Stick 2 (NCS2) and I'm developing a simple app that is able to detect one specific object in an input image.

After browsing for a while, I have found an example on ncappzoo that demonstrates how to detect faces, gender and age (source here). For my application, I'm focusing on the face detection part, and my goal is to modify this example in order to detect another object, for which I have the network already trained and tested. Since I cannot share my code, I'll focus on the gender_age sample.

Details:

I am a little bit confused about how to figure out the precision of the input buffer and the output buffer of the network used in the gender_age example.

The gender_age example uses U8 as precision for the input buffer:

auto faceInputData = faceInput->buffer().as<PrecisionTrait<Precision::U8>::value_type*>();

however, if you open the *.xml file of the face detection network (face-detection-retail-0004.xml):

<net name="cnn_fd_004_sq_light_ssd" version="10">
  <layers>
    <layer id="0" name="data" type="Parameter" version="opset1">
      <data element_type="f16" shape="1, 3, 300, 300"/>
      <output>
        <port id="0" names="data" precision="FP16">
          <dim>1</dim>
          <dim>3</dim>
          <dim>300</dim>
          <dim>300</dim>
        </port>
      </output>
    </layer>
    <layer id="1" name="127" type="Const" version="opset1">
      <data element_type="f16" offset="0" shape="1, 3, 1, 1" size="6"/>
      ...
      ...

you can see that the precision for the input layer is set to FP16. Here I'm assuming that the first layer is the input layer. Is this right?

A similar consideration applies to the output buffer. The sample code sets the precision to FP32:

faceOutputInfo->setPrecision(Precision::FP32);

but the *.xml defines the output layer as follows:

...
...
<output>
  <port id="3" names="detection_out" precision="FP16">
    <dim>1</dim>
    <dim>1</dim>
    <dim>200</dim>
    <dim>7</dim>
  </port>
</output>
...
...

where the precision is set to FP16.

My questions are:

Am I looking at the right place to figure out the precision of the input and output buffer?
If yes, why don't the parameters inside the *.xml file match the ones in the gender_age example?
Is there any other way to figure out the precision of the input and output buffer without digging into the *.xml file?

Peh_Intel · ‎01-20-2022

Hi Alexis-Nicole,

Thanks for reaching out to us.

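Yes, the layer named "data" is the input layer of your model, and the .xml file declares it as FP16. However, the precision in the .xml file is not necessarily the precision of the buffers your application reads and writes: for FP16 networks, the input is expected in FP32 by default. You can check the actual precision with the getPrecision() method.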

For example:

std::cout << faceInputInfo->getPrecision() << std::endl;

std::cout << faceOutputInfo->getPrecision() << std::endl;

Generally, OpenVINO™ samples and demos manually set the input precision to U8 and the output precision to FP32 because these are the precisions most widely supported across Inference Engine devices. For details, refer to Supported Input Precision and Supported Output Precision.
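
To make the buffer side of this concrete, here is a minimal standalone sketch (plain C++, no OpenVINO headers) of what the application does once the input is U8 and the output is FP32. It is not taken from the gender_age sample. The helper names fillPlanarU8 and parseDetections and the Detection struct are made up for illustration. It assumes the image arrives as interleaved HWC bytes (as OpenCV stores them), and that each of the 200 output rows holds 7 floats in the order [image_id, label, confidence, x_min, y_min, x_max, y_max], with a negative image_id marking the end of the valid detections, which is how the SSD-style output of this model is usually described.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy an interleaved HWC image into a planar NCHW U8 buffer (batch size 1).
// No scaling or float conversion happens here: with a U8 input blob the
// application only writes raw bytes.
void fillPlanarU8(const std::vector<std::uint8_t>& hwc,
                  std::size_t height, std::size_t width, std::size_t channels,
                  std::uint8_t* blobData) {
    assert(hwc.size() == height * width * channels);
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t h = 0; h < height; ++h)
            for (std::size_t w = 0; w < width; ++w)
                blobData[c * height * width + h * width + w] =
                    hwc[(h * width + w) * channels + c];
}

// Hypothetical container for one row of the detection output.
struct Detection {
    int label;
    float confidence;
    float xmin, ymin, xmax, ymax;  // assumed normalized to [0, 1]
};

// Read an FP32 output buffer of shape [1, 1, maxProposals, 7].
// Assumed row layout: [image_id, label, conf, x_min, y_min, x_max, y_max].
std::vector<Detection> parseDetections(const float* out,
                                       std::size_t maxProposals,
                                       float threshold) {
    std::vector<Detection> result;
    for (std::size_t i = 0; i < maxProposals; ++i) {
        const float* row = out + i * 7;
        if (row[0] < 0.0f) break;          // negative image_id: no more detections
        if (row[2] < threshold) continue;  // skip low-confidence rows
        result.push_back({static_cast<int>(row[1]), row[2],
                          row[3], row[4], row[5], row[6]});
    }
    return result;
}
```

In the sample, blobData would correspond to the U8 pointer obtained from faceInput->buffer(), and out to the output blob's buffer read as float* after setPrecision(Precision::FP32). If the output precision were left at a different value, the float* cast in this sketch would no longer be valid.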

Regards,

Peh

Peh_Intel · ‎01-27-2022

Hi Alexis-Nicole,

This thread will no longer be monitored since we have provided answers. If you need any additional information from Intel, please submit a new question.

Regards,

Peh