Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Only 50% CPU usage during the async inference



I'm trying to run inference on a video using a custom Python app based on the action recognition sample, together with a custom model. However, CPU usage is always below 50% on each core when the CPU device is selected, so I suspect the performance is not what it should be. The report produced by the app shows these values:

  • Data total: 10.06ms (+/-: 1.54) 99.41fps
  • Data own: 10.04ms (+/-: 1.55) 99.59fps
  • Data-Model total: 0.68ms (+/-: 0.12) 1471.27fps
  • Data-Model own: 0.67ms (+/-: 0.12) 1487.45fps
  • Model total: 0.63ms (+/-: 0.24) 1581.32fps
  • Model own: 0.24ms (+/-: 0.23) 4160.17fps
  • Render total: 21.93ms (+/-: 1.89) 45.60fps
  • Render own: 21.92ms (+/-: 1.89) 45.62fps

My specs are:

  • CPU: Xeon(R) CPU E3-1225 v3 @ 3.20GHz
  • RAM: 32GB
  • OS: Ubuntu 18.04 
  • OpenVINO version: 2020.2.210

If you have any idea what is happening, or any suggestions on what could be optimized, please let me know. I've attached the app and the model used.

Thanks for reading!

3 Replies

Hi Adrian,

How are you measuring the CPU usage?

For a better understanding of CPU usage, you may use the Intel VTune Profiler.
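For example, a hotspots collection from the command line might look like the sketch below (the setup-script path, result-directory name, and app name are placeholders; check your VTune installation for the exact locations):

```shell
# Source the VTune environment (path may differ per installation/version)
source /opt/intel/vtune_profiler/vtune-vars.sh

# Collect a hotspots profile of the Python app
vtune -collect hotspots -result-dir r_hotspots -- python3 my_app.py

# Summarize the results, including CPU utilization
vtune -report summary -result-dir r_hotspots
```

The summary report shows effective CPU utilization alongside the hottest functions, which is more informative than a single usage percentage from a system monitor.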


Ram Prasad


Hi Ram Prasad,

Thank you for your quick response!

I've followed your advice and used VTune to measure my CPU usage, along with the export command. Nonetheless, the performance I've got is not good in terms of CPU usage (24.2%) and Memory Bound (100%: 63.5% Cache Bound and 15.1% DRAM Bound). This time, I've used another Python script based on the benchmark app available in the toolkit (attached below). I've also attached the VTune profile report to this post.

The purpose of this testing is to measure the inference-time improvements of this toolkit compared to TensorFlow. As far as I know, the tool provided by TensorFlow for serving inference is called TensorFlow Serving. Therefore, I would like to know how you measure performance with that tool (or whatever other approach you use), so that I can quantify the improvement OpenVINO provides over it.


PS: I'm using the same model that I attached before.

Thanks in advance.

Regards, Adrian.



Hi Adrian,

When running async inference with the OpenVINO toolkit, the targeted performance metric is actual throughput (as opposed to latency in sync mode), i.e. the number of inferences delivered per unit of time (e.g. FPS). There are no benchmarking results published for a metric such as CPU usage percentage.
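The distinction between the two metrics can be sketched with plain timing code (a minimal, framework-independent illustration; `dummy_infer` is a stand-in for any inference call, not an OpenVINO API):

```python
import time

def dummy_infer():
    # Stand-in for a single inference call (placeholder for illustration).
    time.sleep(0.001)

def measure(infer_fn, num_requests=50):
    """Return (throughput in FPS, average latency in ms) for a sync loop."""
    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        infer_fn()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput_fps = num_requests / elapsed
    avg_latency_ms = 1000 * sum(latencies) / len(latencies)
    return throughput_fps, avg_latency_ms

fps, lat = measure(dummy_infer)
print(f"throughput: {fps:.1f} FPS, average latency: {lat:.2f} ms")
```

In a synchronous loop like this, throughput is roughly 1/latency; the point of async mode is that multiple requests are in flight at once, so throughput can be much higher than 1/latency even though each individual request takes the same time.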

If you are targeting throughput, we recommend trying the Benchmark C++ Tool or Benchmark Python Tool in async mode with your custom model. The CPU plugin optimizes the number of parallel and queued inference requests based on the number of CPU cores. However, you can tune various parameters (e.g. -nstreams) to find the best configuration for your case - please see more details in Performance Topics.
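For example, an async benchmark run that sweeps stream counts might look like the sketch below (the model path, stream values, and run duration are placeholders; the script name and flags follow the 2020.x Benchmark Python Tool):

```shell
# Async throughput run with automatic stream selection on CPU
python3 benchmark_app.py -m model.xml -d CPU -api async -nstreams CPU_THROUGHPUT_AUTO

# Try explicit stream counts to find the best throughput for your model
for n in 1 2 4 8; do
    python3 benchmark_app.py -m model.xml -d CPU -api async -nstreams "$n" -t 30
done
```

More streams usually raise CPU utilization and throughput at the cost of per-request latency, so the best value depends on which metric matters for your application.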

You can check OpenVINO inference results for various performance metrics (throughput, value, efficiency, total benefit) across different DL models and devices here.

Also, you can find an example comparing OpenVINO throughput performance with third-party products in this article.

Best regards, Max.
