Hi.
I've tested some networks on the Intel NCS 2 (Myriad X VPU) and am a bit disappointed with the performance results. My setup is Ubuntu 20.04 and OpenVINO 2021.3.
Could you tell me whether these performance numbers are expected or not?
- Default squeezenet1.1 network test.
Command:
benchmark_app -d MYRIAD -i car.png -m squeezenet1.1.xml -pc -niter 1000 -nireq 4
Results:
Count: 1000 iterations
Duration: 3542.36 ms
Latency: 14.07 ms
Throughput: 282.30 FPS
- person-detection-0202 network test.
Command:
benchmark_app -d MYRIAD -i 2person.png -m person-detection-0202.xml -pc -niter 1000 -nireq 4
Results:
Count: 1000 iterations
Duration: 54655.46 ms
Latency: 218.30 ms
Throughput: 18.30 FPS
- person-vehicle-bike-detection-crossroad-yolov3-1020 network test.
Command:
benchmark_app -d MYRIAD -i 2person.png -m person-vehicle-bike-detection-crossroad-yolov3-1020.xml -pc -niter 1000 -nireq 4
Results:
Count: 1000 iterations
Duration: 230249.25 ms
Latency: 805.38 ms
Throughput: 4.34 FPS
The complexity of the person-detection-0202 network is 3.143 GFLOPs.
The complexity of person-vehicle-bike-detection-crossroad-yolov3-1020 is 65.984 GFLOPs.
How can the performance of these two networks differ by only 4.2 times when their complexity differs by more than 20 times?
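The two ratios in the question can be checked with a few lines of Python. This is only a sketch using the figures quoted above. It assumes the listed complexity is the operation count for a single inference, so that complexity times FPS gives a rough sustained operation rate. The helper name `effective_gflops_per_s` is made up for this illustration.

```python
# Figures copied from the benchmark results and complexities quoted above.
models = {
    "person-detection-0202": {"gflops": 3.143, "fps": 18.30},
    "person-vehicle-bike-detection-crossroad-yolov3-1020": {"gflops": 65.984, "fps": 4.34},
}

small = models["person-detection-0202"]
large = models["person-vehicle-bike-detection-crossroad-yolov3-1020"]

# How much faster the small network runs, and how much heavier the large one is.
fps_ratio = small["fps"] / large["fps"]
complexity_ratio = large["gflops"] / small["gflops"]


def effective_gflops_per_s(model):
    """Rough sustained rate, ASSUMING 'gflops' is the cost of one inference."""
    return model["gflops"] * model["fps"]


for name, model in models.items():
    print(f"{name}: {effective_gflops_per_s(model):.1f} GFLOP/s sustained (estimate)")

print(f"FPS ratio: {fps_ratio:.2f}x, complexity ratio: {complexity_ratio:.2f}x")
```

Under that assumption, the larger network sustains roughly five times more operations per second than the smaller one, which is another way of saying that the FPS gap is much smaller than the complexity gap.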
Hello daemonserj,
These results are expected. I get similar performance for these networks as well.
A higher number of FLOPs generally means a lower FPS. However, FPS and FLOPs cannot be compared directly. The throughput (FPS) reported by the Benchmark Tool is measured as the number of inferences delivered within a latency threshold, and the latency is measured from the synchronous execution of inference requests.
Besides, many other factors can affect the FPS obtained from the Benchmark Tool, e.g. running in synchronous or asynchronous mode, the number of streams, and the number of inference requests.
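To illustrate how the number of inference requests ties into the reported figures, here is a small Python sketch based only on the results posted in the question. The reported throughput there matches iterations divided by total duration. The second estimate, `nireq / latency`, is just a rough sanity check that assumes all 4 requests are kept in flight the whole time; it is not taken from the Benchmark Tool's documentation. Both function names are invented for this example.

```python
# (model, iterations, duration_ms, latency_ms, reported_fps) from the question.
runs = [
    ("squeezenet1.1", 1000, 3542.36, 14.07, 282.30),
    ("person-detection-0202", 1000, 54655.46, 218.30, 18.30),
    ("person-vehicle-bike-detection-crossroad-yolov3-1020", 1000, 230249.25, 805.38, 4.34),
]
NIREQ = 4  # value passed as -nireq in the commands above


def throughput_fps(iterations, duration_ms):
    """Frames per second as iterations over total wall-clock time."""
    return iterations / (duration_ms / 1000.0)


def parallel_estimate_fps(nireq, latency_ms):
    """Rough estimate, ASSUMING nireq requests are always in flight."""
    return nireq / (latency_ms / 1000.0)


for name, iterations, duration_ms, latency_ms, reported in runs:
    measured = throughput_fps(iterations, duration_ms)
    estimate = parallel_estimate_fps(NIREQ, latency_ms)
    print(f"{name}: reported {reported:.2f}, "
          f"count/duration {measured:.2f}, nireq/latency {estimate:.2f} FPS")
```

The gap between the two estimates for a given model gives a hint of how well the requests actually overlap on the device.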
Furthermore, the Intel® NCS2 is built on the Intel® Movidius™ Myriad™ X VPU, which features 16 programmable SHAVE cores and a dedicated neural compute engine. This hardware also boosts the performance of the networks.
Regards,
Peh
Hi daemonserj,
This thread will no longer be monitored since we have provided the answers. If you need any additional information from Intel, please submit a new question.
Regards,
Peh
