Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Inference time higher on HDDL than CPU!

ravi31
Novice

* I trained an SSD MobileNet V1 FPN model using TensorFlow 1.15.2 via the TensorFlow Object Detection API.

* After that, I ran inference on my own system (8 GB RAM, 7th-gen i5, 2.6 GHz x 4, Intel HD Graphics 620) with device mode = 'CPU'. This was done inside EIS 2.3.4 with OpenVINO 2020.4.

* On that device, it took 1.6025 seconds on average to process each frame.

=========================================================================

* Then I ran the same model on an Intel(R) Xeon(R) W-2295 CPU @ 3.00 GHz with 128 GB RAM.

It was taking:

1) 0.11212 seconds per frame with mode = 'CPU'.

2) 1.35568 seconds per frame with mode = 'HDDL'. I have set up a Movidius VPU and the HDDL daemon is running.

 

This behaviour is quite surprising: I'm running inference on a single camera stream, and HDDL still takes more time per frame than the CPU.
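For context, here is a minimal sketch of the kind of per-frame measurement I mean, assuming the OpenVINO 2020.4 Python API. The file names, input handling and single-image timing are placeholders for illustration, not my actual EIS pipeline.

```python
# Minimal synchronous-timing sketch (OpenVINO 2020.4 Python API assumed).
import time

import cv2
import numpy as np
from openvino.inference_engine import IECore

MODEL_XML = "frozen_inference_graph.xml"   # placeholder IR file names
MODEL_BIN = "frozen_inference_graph.bin"
DEVICE = "CPU"                             # switch to "HDDL" for the VPU run

ie = IECore()
net = ie.read_network(model=MODEL_XML, weights=MODEL_BIN)
exec_net = ie.load_network(network=net, device_name=DEVICE)

input_blob = next(iter(net.input_info))
n, c, h, w = net.input_info[input_blob].input_data.shape

frame = cv2.imread("sample_frame.jpg")                    # one camera frame (placeholder)
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1))     # HWC -> CHW
blob = blob.reshape((n, c, h, w)).astype(np.float32)

start = time.perf_counter()
exec_net.infer(inputs={input_blob: blob})                 # single synchronous request
print("{}: {:.5f} s per frame".format(DEVICE, time.perf_counter() - start))
```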

 

ravi31
Novice

I forgot to mention that I had converted the model to the OpenVINO IR format with OpenVINO 2020.4 before running inference.
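The conversion was roughly along the usual Model Optimizer lines for TF 1.x Object Detection API exports, as in the sketch below. This is not my exact command: the file names and the transformations config are placeholders and depend on the local OpenVINO 2020.4 install.

```python
# Hedged sketch only: Model Optimizer is itself a Python script, shown here
# launched via subprocess. Pick the JSON matching your TF OD API version.
import subprocess

subprocess.run(
    [
        "python3", "mo_tf.py",                                     # Model Optimizer for TF models
        "--input_model", "frozen_inference_graph.pb",              # TF OD API export (placeholder)
        "--tensorflow_object_detection_api_pipeline_config", "pipeline.config",
        "--transformations_config", "ssd_support_api_v1.15.json",  # for TF 1.15 exports
        "--reverse_input_channels",                                # model trained on RGB, OpenCV gives BGR
        "--data_type", "FP16",                                     # Myriad/HDDL plugins run FP16
    ],
    check=True,
)
```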

Please find attached the logs and screenshots that show the same problem.

Peh_Intel
Moderator

Hi Ravi,


Thanks for sharing your inference results with us.


The OpenVINO™ toolkit includes the Benchmark Tool, which allows users to estimate a model’s inference performance on supported devices in synchronous and asynchronous modes. We also publish Benchmark Results for selected models as a reference, comparing performance across different devices.


Based on the comparison graphs in the Benchmark Results, the Intel® Vision Accelerator Design with 8 Intel® Movidius™ VPUs (MUSTANG-V100-MX8) consistently has higher throughput (FPS) than the Intel® Core™ i5-8500 (6 cores, 6 threads) and the Intel® Xeon® W-1290P (10 cores, 20 threads), but lower throughput than the Intel® Xeon® Silver 4216R (16 cores, 32 threads) and the Intel® Xeon® Gold 5218T (16 cores, 32 threads).


From the information you shared, your devices are a 7th-generation i5 processor (4 cores, 4 threads), an Intel(R) Xeon(R) W-2295 (18 cores, 36 threads) and a Movidius VPU. Could you also share the details of your Intel® Vision Accelerator Design (HDDL plugin)? Since you mentioned ‘a’ Movidius VPU, does your Intel® Vision Accelerator Design have 1, 2, 4 or 8 Intel® Movidius™ VPUs? Please correct me if I have misinterpreted your device info.


Please also try running inference with the Benchmark Tool by specifying only the model and the device; the rest of the parameters (number of streams, threads and infer requests) are then determined automatically based on the selected device. Please take note that the Benchmark Tool only accepts an image or a folder of images as the input.
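For example, the Python version can be invoked as benchmark_app.py -m <model>.xml -d HDDL -i <image or folder>. HDDL is a throughput-oriented device, so the gap you see with a single synchronous request is expected to narrow once several asynchronous infer requests are in flight, which is what the Benchmark Tool does in its default (asynchronous) mode. Below is a minimal sketch of that idea using the OpenVINO 2020.4 Python API; the model files, dummy input and request count are placeholders, not the Benchmark Tool itself.

```python
# Minimal asynchronous-throughput sketch (OpenVINO 2020.4 Python API assumed).
import time

import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="frozen_inference_graph.xml",
                      weights="frozen_inference_graph.bin")
input_blob = next(iter(net.input_info))
n, c, h, w = net.input_info[input_blob].input_data.shape

# Several infer requests keep the HDDL VPUs busy instead of idling between frames.
exec_net = ie.load_network(network=net, device_name="HDDL", num_requests=4)
num_requests = len(exec_net.requests)
dummy = np.zeros((n, c, h, w), dtype=np.float32)   # stand-in for real camera frames

NUM_FRAMES = 100
start = time.perf_counter()
for i in range(NUM_FRAMES):
    slot = i % num_requests
    if i >= num_requests:
        exec_net.requests[slot].wait(-1)           # free the slot before reusing it
    exec_net.start_async(request_id=slot, inputs={input_blob: dummy})
for req in exec_net.requests:                      # drain the requests still in flight
    req.wait(-1)
elapsed = time.perf_counter() - start
print("HDDL async throughput: {:.2f} FPS".format(NUM_FRAMES / elapsed))
```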



Regards,

Peh 


Peh_Intel
Moderator

Hi Ravi,


Thank you for your question. This thread will no longer be monitored since we have provided answers. If you need any additional information from Intel, please submit a new question. 



Regards,

Peh

