Hello,
I benchmarked the execution speed of ShuffleNet V2 on the Movidius 1 stick and obtained only about 9 fps, which is slower than MobileNet v2 at 23 fps. I looked at the execution times of the individual layers and saw that the reshape operation takes about 34% of the compute time. The reshape is needed for the shuffle operation, which is implemented like this:
def concat_shuffle_split(x, y):
    with tf.name_scope('concat_shuffle_split'):
        shape = tf.shape(x)
        batch_size = shape[0]
        height, width = shape[1], shape[2]
        depth = x.shape[3].value
        z = tf.concat([x, y], axis=3)
        z = tf.reshape(z, [batch_size, height, width, 2, depth])
        z = tf.transpose(z, [0, 1, 2, 4, 3])
        z = tf.reshape(z, [batch_size, height, width, 2*depth])
        x, y = tf.split(z, num_or_size_splits=2, axis=3)
        return x, y
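To make the effect of the reshape/transpose/reshape sequence concrete, here is a small pure-Python sketch of the channel order it produces. The helper name `shuffle_order` and the labels `x0`, `y0`, ... are made up for illustration; the sketch only models the channel axis, not the actual tensors.

```python
def shuffle_order(depth, groups=2):
    """Channel order after concat -> reshape(groups, depth) -> transpose -> flatten."""
    # Output position j * groups + g reads concatenated channel g * depth + j.
    return [g * depth + j for j in range(depth) for g in range(groups)]


depth = 3
# Channels of the concatenated tensor: first all of x, then all of y.
labels = [f"x{i}" for i in range(depth)] + [f"y{i}" for i in range(depth)]

shuffled = [labels[i] for i in shuffle_order(depth)]
new_x, new_y = shuffled[:depth], shuffled[depth:]

print(shuffled)  # ['x0', 'y0', 'x1', 'y1', 'x2', 'y2']
print(new_x)     # ['x0', 'y0', 'x1']
print(new_y)     # ['y1', 'x2', 'y2']
```

The shuffle is a fixed permutation of the channel axis, interleaving the channels of `x` and `y`; the batch, height and width axes are untouched.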
Is it normal for the reshape operation to take that long? Does it need to move memory blocks around, and is that why it is slow?
Thanks in advance,
Emiel Deprost
Dearest Deprost, Emiel,
The NCS1 stick certainly performs worse than the NCS2. In addition, you are using 5D tensors, which cause a lot of relayout and reordering. Here is a detailed explanation:
- Your model is a TF model, so it is defined for NHWC tensors, but the Inference Engine (IE) IR uses NCHW order, so the Model Optimizer (MO) transforms all tensors and operations to NCHW.
- But then there is a 5D tensor, which is not NHWC. Before processing it, MO has to insert a reorder layer to get back to NHWC, translate the 5D part of the graph as is, and add a layer to convert to NCHW again after the 5D piece is done.
- Then the model goes to IE. And, "surprise", since this is version 1 of the NCS stick, it actually computes in NHWC layout. So the Myriad plugin does the reverse graph transformation, from the NCHW IR to its internal NHWC representation.
- But then the plugin likewise faces the 5D tensor, which is neither NHWC nor NCHW. It adds an additional relayout layer to get to the IR's NCHW before the reshape, and a matching relayout after the 5D part is done.
- This produces a long chain of reshapes and relayouts, which the plugin tries to optimize by removing redundant data movements.
There is a chance that Reshape will perform better if you avoid 5D data.
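One possible way to avoid the 5D intermediate is to merge the height and width axes before the shuffle, so that every tensor stays 4D. The NumPy sketch below only checks that this gives the same result as the 5D version; the function names `shuffle_5d` and `shuffle_4d` are illustrative, and whether the 4D form actually removes the extra relayout layers on the Myriad plugin is an untested assumption.

```python
import numpy as np


def shuffle_5d(z, depth):
    """Channel shuffle as in the original post, via a 5D intermediate."""
    n, h, w, _ = z.shape
    z = z.reshape(n, h, w, 2, depth)
    z = z.transpose(0, 1, 2, 4, 3)
    return z.reshape(n, h, w, 2 * depth)


def shuffle_4d(z, depth):
    """Same shuffle, but height and width are merged so tensors stay 4D."""
    n, h, w, _ = z.shape
    z = z.reshape(n, h * w, 2, depth)
    z = z.transpose(0, 1, 3, 2)
    return z.reshape(n, h, w, 2 * depth)


# Small NHWC test tensor: batch 1, 4x5 spatial, 2 * 3 channels.
z = np.arange(1 * 4 * 5 * 6).reshape(1, 4, 5, 6)
same = np.array_equal(shuffle_5d(z, 3), shuffle_4d(z, 3))
print(same)  # True
```

The same sequence maps directly onto `tf.reshape` and `tf.transpose` in the original function; only the target shapes and the permutation change.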
Hope it helps!
Thanks for using OpenVINO,
Shubha
Hi Emiel,
Can you confirm that you are using OpenVINO to run this benchmark?
Best,
Severine