Deprost__Emiel

Long execution time of reshape operation


Hello,

I benchmarked the execution speed of ShuffleNet V2 on the Movidius 1 stick and obtained only about 9 fps, which is slower than MobileNet v2 at 23 fps. Looking at the execution times of the individual layers, I saw that the reshape operation takes about 34% of the compute time. The reshape is needed for the shuffle operation, which is implemented like this:

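import tensorflow as tf

def concat_shuffle_split(x, y):
    # x and y are NHWC tensors of identical shape.
    with tf.name_scope('concat_shuffle_split'):
        shape = tf.shape(x)
        batch_size = shape[0]
        height, width = shape[1], shape[2]
        depth = x.shape[3].value  # static channel count (TF 1.x API)

        # Concatenate along channels, then interleave the channels of x and y.
        z = tf.concat([x, y], axis=3)
        z = tf.reshape(z, [batch_size, height, width, 2, depth])  # 5D tensor
        z = tf.transpose(z, [0, 1, 2, 4, 3])
        z = tf.reshape(z, [batch_size, height, width, 2*depth])
        # Split back into two tensors with `depth` channels each.
        x, y = tf.split(z, num_or_size_splits=2, axis=3)
        return x, y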
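
To make the effect of the shuffle concrete, here is a minimal pure-Python sketch of the channel permutation that the reshape, transpose, reshape sequence produces. The helper name `shuffled_channel_order` is hypothetical and used only for illustration. It looks at the channel axis alone and ignores batch, height and width.

```python
def shuffled_channel_order(depth, groups=2):
    """Return, for each output channel, the input channel it is read from.

    Mirrors reshape [..., groups, depth] -> transpose -> reshape
    [..., groups * depth], restricted to the channel axis.
    """
    # Output position d * groups + g takes input channel g * depth + d.
    return [g * depth + d for d in range(depth) for g in range(groups)]


# After the concat, channels 0..2 come from x and channels 3..5 from y.
order = shuffled_channel_order(depth=3)
print(order)                 # [0, 3, 1, 4, 2, 5]
print(order[:3], order[3:])  # the two halves returned by the split
```

So the channels of x and y end up interleaved before the split, which is all the 5D detour is used for.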

 

Is it normal for the reshape operation to take that long? Does it need to move blocks of memory around, and is that why it is slow?

 

Thanks in advance,

Emiel Deprost


Accepted Solutions
Shubha_R_Intel

Dearest Deprost, Emiel,

The NCS1 stick certainly performs worse than the NCS2. In addition, you are using 5D tensors, which cause a lot of relayout and reordering. Here is a transparent answer.

  1. Your model is a TF model, so it is defined for NHWC tensors, but the IE (Inference Engine) IR uses NCHW order, so MO (Model Optimizer) transforms all tensors and operations to NCHW.
  2. But then there is a 5D tensor, which is not NHWC. Before processing it, MO has to insert a reorder layer to get back to NHWC, translate the 5D part of the graph as is, and then add a layer to convert to NCHW again once the 5D piece is done.
  3. Then the IR goes to IE. And, “surprise”, since this is version 1 of the NCS stick, it actually computes in NHWC layout. So the Myriad plugin does the reverse graph transformation, from the NCHW IR to its internal NHWC representation.
  4. But then it similarly faces the 5D tensor, which is neither NHWC nor NCHW, so the plugin adds another relayout layer to get to the IR’s NCHW before the reshape, and a matching relayout after the 5D part is done.
  5. This produces a long chain of reshapes and relayouts, which the plugin tries to optimize by removing redundant data movements.
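
As a rough illustration of why the relayouts in the steps above cost time while a plain reshape need not, here is a small NumPy sketch. It runs on the host, uses made-up tensor sizes, and is only an analogy. It does not show what the Myriad plugin does internally.

```python
import numpy as np

# A small NHWC tensor: batch 1, height 2, width 2, 4 channels.
nhwc = np.arange(1 * 2 * 2 * 4, dtype=np.float32).reshape(1, 2, 2, 4)

# Reshaping a contiguous array only reinterprets the same buffer.
as_5d = nhwc.reshape(1, 2, 2, 2, 2)
reshape_shares_buffer = np.shares_memory(nhwc, as_5d)

# NHWC -> NCHW is a transpose. Storing the result contiguously in the
# new order requires copying every element to a new position.
nchw = np.ascontiguousarray(nhwc.transpose(0, 3, 1, 2))
relayout_shares_buffer = np.shares_memory(nhwc, nchw)

print(reshape_shares_buffer)   # True
print(relayout_shares_buffer)  # False
```

In this analogy the reshape itself is free, and the time goes into the layout conversions inserted around it.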

There is a chance that Reshape will perform better if you avoid 5D data.
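
One possible way to avoid the 5D tensor, sketched here in NumPy so it can be checked without a device, is to merge height and width into one axis before the shuffle so that the intermediate tensor stays 4D. The function names `shuffle_5d` and `shuffle_4d` are hypothetical. `tf.reshape` and `tf.transpose` take the same shapes and permutations. Whether Model Optimizer and the Myriad plugin actually handle the 4D variant faster is an assumption that would need to be measured.

```python
import numpy as np

def shuffle_5d(z, groups=2):
    # Original formulation: goes through a 5D intermediate tensor.
    n, h, w, c = z.shape
    depth = c // groups
    z = z.reshape(n, h, w, groups, depth)
    z = z.transpose(0, 1, 2, 4, 3)
    return z.reshape(n, h, w, c)

def shuffle_4d(z, groups=2):
    # Same permutation, but H and W are merged so the rank never exceeds 4.
    n, h, w, c = z.shape
    depth = c // groups
    z = z.reshape(n, h * w, groups, depth)
    z = z.transpose(0, 1, 3, 2)
    return z.reshape(n, h, w, c)

z = np.arange(2 * 3 * 5 * 8).reshape(2, 3, 5, 8)
same = np.array_equal(shuffle_5d(z), shuffle_4d(z))
print(same)  # True
```

Both functions give identical results, so the 4D version could replace the shuffle in the model without changing its output.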

Hope it helps!

Thanks for using OpenVINO,

Shubha


2 Replies
Severine_H_Intel

Hi Emiel,

Can you confirm that you used OpenVINO to run this benchmark?

Best,

Severine
