Community Manager

softmax layer, 60 ms on 0.01 MFLOPS

Hello Movis

 

I have a very simple fully convolutional network (converting an input of shape [1, 2, 256, 256] to a pmap of [1 2 42 42]).

 

It uses 3x3 and 5x5 layers and finishes off with a Softmax layer. The softmax layer receives a map of shape [1 2 42 42].

 

The network, confirmed working on the Movidius NC, needs less than 1 GFLOP, and I profile it with:

 

mvNCProfile model.prototxt -w model.caffemodel -s 12

 

The total inference time is reported to be 132 ms, with 58 ms spent in the last layer doing a softmax of complexity 0.010584 MFLOPs!
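A quick sanity check of those numbers, assuming the FLOP count simply scales with the number of elements in the [1 2 42 42] map. The factor of 3 operations per element is inferred from the reported figure, not taken from any NCSDK documentation:

```latex
\begin{align*}
N &= 1 \cdot 2 \cdot 42 \cdot 42 = 3528 \ \text{elements} \\
\frac{10584\ \text{FLOPs}}{3528\ \text{elements}} &= 3\ \text{FLOPs per element} \\
\frac{10584\ \text{FLOPs}}{0.058\ \text{s}} &\approx 1.8 \times 10^{5}\ \text{FLOP/s} \approx 0.18\ \text{MFLOP/s} \\
\frac{58\ \text{ms}}{132\ \text{ms}} &\approx 44\,\%\ \text{of the total inference time}
\end{align*}
```

So the layer runs at well under 1 MFLOP/s, which suggests the time goes into something other than the arithmetic itself.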

 

This can't be true .. I guess I'm missing some parameter?

 

Thanks! And have a Nice one

 

/B

 

 

A minimal prototxt example which (on my computer) confirms this is:

 

(I installed the full Movidius NCSDK on a clean Ubuntu system; mvNCProfile --version reports v02.00.)

 

    name: "CNN_test_movidius_softmax"
    input: "data"
    input_shape {
      dim: 1
      dim: 2
      dim: 42
      dim: 42
    }
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "data"
      top: "prob"
    }

 

A call to mvNCProfile test_softmax.prototxt -s 12 results in:

 

    Detailed Per Layer Profile
                                          Bandwidth   time
    #    Name                    MFLOPs    (MB/s)     (ms)
    ===============================================================================
    0    input                      0.0       0.0    0.002
    1    prob                       0.0       0.1   58.205
    -------------------------------------------------------------------------------
                             Total inference time    58.21
    -------------------------------------------------------------------------------
2 Replies

Solution: I skip the Softmax layer and calculate it in the program.

 

It is NOT a solution but a hack.

 

My hack needs some magic scaling to be identical to the 'true' solution.

 

Hack:

 

     

  • Return the layer before the Softmax (just edit the .prototxt and remove the 'prob' layer)
  • Calc the softmax (THIS IS A WRONG VERSION, it does not scale right!)
  • Save approx 50ms

    for (Mat &M : probability_maps )
    {
        // max of each dimension
        double minval, maxval;
        cv::minMaxIdx(M, &minval, &maxval);
        // Ei = exp of each Pi-maxPi
        M -= maxval;
        cv::exp(M, M);
        // sum of each exp(Pi-maxPi)
        const Scalar s = cv::sum(M);
        // divide each element of Ei by sum of Ei
        const double scale = 1.0 / (s.val[0] + 1.e-6);
        M *= scale;
    }

 

 

Have a Good Day!

 

/B

 

NB: It would be nice if anyone could confirm that my use of the softmax in NCSDK is correct. I know that for two classes (i.e. binary) I could use a Sigmoid layer, but my network _will_ have more than two classes when it's up and running.
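For reference, the two-class shortcut rests on a standard identity: a softmax over two scores reduces to a sigmoid of their difference. The symbols x_0 and x_1 (the raw scores of the two channels at one pixel) are only illustrative names:

```latex
\[
\operatorname{softmax}(x)_1
  = \frac{e^{x_1}}{e^{x_0} + e^{x_1}}
  = \frac{1}{1 + e^{-(x_1 - x_0)}}
  = \sigma(x_1 - x_0),
\qquad
\operatorname{softmax}(x)_0 = 1 - \sigma(x_1 - x_0)
\]
```

With three or more classes there is no such reduction, so a real softmax across channels is needed.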


// The previous post contains some code. That code is .. like .. NOT .. correct. (I do not know on what data I verified it .. but .. SORRY if you copy-pasted it and were left bewildered ..)

 

An even more correct version of a softmax layer:

 

    for (cv::Mat &M : probability_maps )
    {
        // M -= max;
        cv::exp(M, M);
    }
    Mat Sum = probability_maps[0].clone();
    for (unsigned int i = 1; i < probability_maps.size(); i++ )
    {
        Sum += probability_maps[i];
    }
    // divide each pix with sum .. ie scale prob to 0:1
    for (cv::Mat &M : probability_maps )
    {
        cv::divide(M, Sum, M);
    }
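The version above normalises across channels at each pixel, which matches Caffe's Softmax layer with its default axis of 1. It leaves the max subtraction commented out, so large raw scores could overflow in exp. Below is a minimal sketch of the same idea with a per-pixel max subtraction added. It is an assumption-laden illustration, not the NCSDK implementation: `softmax_across_channels` is a hypothetical helper, and the maps are plain `std::vector<float>` buffers instead of `cv::Mat` so the example stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of NCSDK or OpenCV): applies a softmax across
// channels, independently for every pixel. Each entry of `maps` is one channel
// stored as a flat array of H*W raw scores.
void softmax_across_channels(std::vector<std::vector<float>> &maps)
{
    if (maps.empty()) return;
    const std::size_t pixels = maps[0].size();
    for (const auto &m : maps) assert(m.size() == pixels);

    for (std::size_t p = 0; p < pixels; ++p) {
        // Max over channels at this pixel; subtracting it avoids overflow in
        // exp without changing the result.
        float maxval = maps[0][p];
        for (const auto &m : maps) maxval = std::max(maxval, m[p]);

        // Exponentiate in place and accumulate the per-pixel sum.
        // The largest term is exp(0) = 1, so sum is always >= 1.
        float sum = 0.0f;
        for (auto &m : maps) {
            m[p] = std::exp(m[p] - maxval);
            sum += m[p];
        }

        // Normalise so the channel values at this pixel sum to 1.
        for (auto &m : maps) m[p] /= sum;
    }
}
```

Because the max is taken per pixel over the channels (not over a whole map, as in the earlier wrong version), the output is identical to the unshifted softmax wherever that one does not overflow.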

 

.. I still haven't found out WHY Movidius can't handle a softmax on, e.g., a 50x50 2-channel layer.
