Some stats of GoogLeNet:
Detailed Per Layer Profile
#   Layer Name          MFLOPs    Bandwidth (MB/s)   time (ms)
========================================================================================
0   conv1/7x7_s2        236.028   2505.00             5.63
1   pool1/3x3_s2          1.806   1441.66             1.06
2   pool1/norm1           0.000    712.67             0.54
3   conv2/3x3_reduce     25.690    404.11             0.97
4   conv2/3x3           693.633    316.67            11.55
5   conv2/norm2           0.000    797.05             1.44
So how do I find the bottlenecks of the network from the stats above?
I also noticed the "Size Limitations" for Caffe:
Compiled Movidius “graph” file < 320 MB
Intermediate layer buffer size < 100 MB
Scratch Memory size < 112 KB
So how do I compute the intermediate layer buffer size and the scratch memory size? How can I make sure that my network complies with the "Size Limitations"?
@z_huabao In mvNCProfile's output you can see the processing time and processing bandwidth of each layer. Using this information, you can then tune your network to find a balance of speed and accuracy.
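As a rough illustration of reading the profile, here is a minimal Python sketch that ranks layers by their share of the total time. The rows are copied by hand from the six layers posted above, so the percentages are relative to those six layers only, not the whole network. The `rank_by_time` helper is a hypothetical name for this example and is not part of the NCSDK.

```python
# Rows copied from the profile posted in this thread:
# (index, layer name, MFLOPs, bandwidth in MB/s, time in ms)
layers = [
    (0, "conv1/7x7_s2", 236.028, 2505.00, 5.63),
    (1, "pool1/3x3_s2", 1.806, 1441.66, 1.06),
    (2, "pool1/norm1", 0.000, 712.67, 0.54),
    (3, "conv2/3x3_reduce", 25.690, 404.11, 0.97),
    (4, "conv2/3x3", 693.633, 316.67, 11.55),
    (5, "conv2/norm2", 0.000, 797.05, 1.44),
]


def rank_by_time(rows):
    """Return (name, ms, percent of listed time), slowest layer first."""
    total_ms = sum(row[4] for row in rows)
    ranked = sorted(rows, key=lambda row: row[4], reverse=True)
    return [(name, ms, 100.0 * ms / total_ms) for _, name, _, _, ms in ranked]


for name, ms, share in rank_by_time(layers):
    print(f"{name:<20} {ms:6.2f} ms  {share:5.1f}%")
```

In this excerpt conv2/3x3 takes the most time, so it would be the first layer to look at when trading accuracy for speed.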
Regarding the size limitations, you can compute the intermediate layer buffer size from the blob sizes that Caffe reports: Caffe stores blobs as fp32 (4 bytes per value), while we use fp16 (2 bytes per value), so divide those sizes by 2. This is very tedious, however. The NCSDK will warn you if your model exceeds these size limits, so you don't have to check by hand if that is your concern.
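For anyone who wants to do the arithmetic by hand anyway, a minimal Python sketch follows. It assumes the buffer for a layer is simply its output blob stored in fp16, and that "100 MB" means 100 MiB; neither is confirmed in this thread, and how the NCSDK actually accounts for the limit may differ. The shapes are typed in for illustration (GoogLeNet with a 1x3x224x224 input); with pycaffe they could be read from `net.blobs[name].data.shape` instead.

```python
from math import prod

FP16_BYTES = 2                   # fp16 uses 2 bytes per value
LIMIT_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB

# Illustrative output blob shapes (N, C, H, W), typed in by hand.
shapes = {
    "conv1/7x7_s2": (1, 64, 112, 112),
    "pool1/3x3_s2": (1, 64, 56, 56),
    "conv2/3x3_reduce": (1, 64, 56, 56),
    "conv2/3x3": (1, 192, 56, 56),
}


def fp16_buffer_bytes(shape):
    """Bytes needed to hold one blob of this shape in fp16."""
    return prod(shape) * FP16_BYTES


sizes = {name: fp16_buffer_bytes(shape) for name, shape in shapes.items()}
largest = max(sizes, key=sizes.get)

for name, nbytes in sizes.items():
    print(f"{name:<20} {nbytes / 1024:10.1f} KiB")
print(f"largest: {largest}, within assumed limit: {sizes[largest] < LIMIT_BYTES}")
```

The same result comes from taking the fp32 size Caffe reports for a blob and halving it.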
@z_huabao It depends on what you are trying to do. Larger MFLOPS and bandwidth numbers aren't necessarily bad but if performance is a concern, these numbers could give a hint as to where the slowdowns are.