

chen__bruce

Beginner


04-18-2019
02:24 AM

167 Views

FLOPS of Myriad X VPU

Hi Sirs,

The Myriad X VPU product brief says: "Myriad X VPU is capable of delivering a total performance of over **4 trillion operations per second** (TOPS)".

On another page, it also says: "Over **1 trillion operations per second** of DNN inferencing performance".

I also tested OpenVINO pre-trained models and got the performance results below.

face-detection-retail-0004 (complexity 1.067 GFLOPs) takes 20 ms for one image. 1.067/0.02 ≈ 53 => **53 GFLOPS**

human-pose-estimation-0001 (complexity 15.435 GFLOPs) takes 200 ms for one image. 15.435/0.2 ≈ 77 => **77 GFLOPS**
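The arithmetic above can be written as a small helper. This is only a sketch of the calculation used in this post (model complexity divided by per-image latency); the function name `effective_gflops` is made up for illustration, and the inputs are the complexity and latency figures quoted above.

```python
def effective_gflops(complexity_gflops: float, latency_s: float) -> float:
    """Achieved throughput in GFLOPS.

    complexity_gflops: operations needed for one inference, in GFLOPs (a count).
    latency_s: measured wall time for one inference, in seconds.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return complexity_gflops / latency_s


# Figures quoted in the post: complexity from the model description,
# latency from the poster's own timing.
face = effective_gflops(1.067, 0.020)
pose = effective_gflops(15.435, 0.200)

print(f"face-detection-retail-0004: {face:.1f} GFLOPS")
print(f"human-pose-estimation-0001: {pose:.1f} GFLOPS")
```

Note that the result is only as accurate as the latency: if the 20 ms and 200 ms include image loading or host-to-device transfer, the computed rate understates what the device itself achieves.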

Why does the product brief say the performance is 4 TOPS and 1 TOPS, while the results of real tests are no more than 100 GFLOPS?

Could you also show me the correct method for evaluating the performance (FLOPS)?

Thank you,

Bruce

2 Replies


Shubha_R_Intel

Employee


04-19-2019
08:12 AM


Dear Bruce,

- 4 TOPS is the total compute capacity of all ALUs on the SoC. That includes fixed-function blocks, such as stereo depth, that are not applicable to neural networks at all. So 1 TOPS is closer to the real compute power, but it is still a coarsely rounded number.
- A single inference task has access to only half of the device. To get maximum utilization, you should run >2 concurrent inference tasks on the same device.
- Given your performance numbers, your measurement almost certainly includes all data transfer overhead.
- The GFLOPs numbers for network complexity that you quote do not include all the computations required to perform inference. They cover essentially only the convolutions. The other operations need less compute but are memory bound, so they make a non-trivial contribution to inference wall time.
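As a rough illustration of the concurrency point, the sketch below times a blocking inference call with one, two, and four concurrent callers and reports inferences per second. It is not OpenVINO code: `fake_infer` is a hypothetical stand-in that just sleeps, and in a real test `infer` would wrap whatever blocking inference call your application makes, with input data prepared beforehand so that loading and preprocessing are not timed.

```python
import threading
import time
from typing import Callable


def measure_throughput(infer: Callable[[], None],
                       num_workers: int,
                       iters_per_worker: int) -> float:
    """Return inferences per second with num_workers concurrent callers."""

    def worker() -> None:
        for _ in range(iters_per_worker):
            infer()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return num_workers * iters_per_worker / elapsed


def fake_infer() -> None:
    # Hypothetical stand-in for a blocking device call (assumption, not a
    # real API). time.sleep releases the GIL, so callers overlap.
    time.sleep(0.005)


if __name__ == "__main__":
    for workers in (1, 2, 4):
        fps = measure_throughput(fake_infer, workers, 20)
        print(f"{workers} concurrent task(s): {fps:.1f} inferences/s")
```

Multiplying the measured inferences per second by the model complexity in GFLOPs gives an aggregate GFLOPS figure that reflects device utilization better than a single-request latency does.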

All in all, Bruce, if you improve your measurement accuracy, you should get numbers 2 to 2.5 times higher. I agree that some of the Myriad X VPU documentation assumes a lot and does not give you the detail that I have just given you.

Thanks for using OpenVINO!
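To put the 2 to 2.5 times estimate in context, here is a back-of-the-envelope sketch that applies it to the two measurements from the original post and compares the result with the "over 1 TOPS" figure. It assumes, purely for illustration, that the 1 TOPS figure can be read as 1000 G operations per second and compared directly with GFLOPS, which the thread does not confirm; the names below are made up.

```python
# Assumption: "over 1 TOPS" is treated as 1000 G operations per second.
NOMINAL_DNN_GOPS = 1000.0

# Measured rates from the original post: complexity (GFLOPs) / latency (s).
measured = {
    "face-detection-retail-0004": 1.067 / 0.020,
    "human-pose-estimation-0001": 15.435 / 0.200,
}


def projected_range(gflops: float, low: float = 2.0, high: float = 2.5):
    """Scale a measured rate by the 2x to 2.5x improvement suggested above."""
    return gflops * low, gflops * high


for name, rate in measured.items():
    lo, hi = projected_range(rate)
    print(f"{name}: {lo:.0f} to {hi:.0f} GFLOPS "
          f"({lo / NOMINAL_DNN_GOPS:.0%} to {hi / NOMINAL_DNN_GOPS:.0%} of nominal)")
```

Even after this correction the projected rates stay well below the nominal figure, which is consistent with the points about rounded peak numbers and memory-bound operations.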

Shubha


Blanck__Simon

Beginner


03-22-2020
05:53 AM


Hello Bruce,

Could you please tell me how you measured the FLOPS? I would like to do the same.

Kind regards,

Simon
