<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: softmax layer, 60 ms on 0.01 MFLOPS in Intel® Distribution of OpenVINO™ Toolkit</title>
    <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665910#M3535</link>
    <description>&lt;P&gt;Solution: I skip the Softmax layer and calculate it in the program.&lt;/P&gt;&lt;P&gt;It is NOT a solution but a hack.&lt;/P&gt;&lt;P&gt;My hack needs some magic scaling to be identical to the 'true' solution.&lt;/P&gt;&lt;P&gt;Hack:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Return the layer before the Softmax (just edit the .prototxt and remove the 'prob' layer).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Calculate the softmax on the host (THIS IS A WRONG VERSION, it does not scale right!).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Save approx. 50 ms.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;CODE&gt;for (Mat &amp;amp;M : probability_maps)
    {
        // max over the whole map
        double minval, maxval;
        cv::minMaxIdx(M, &amp;amp;minval, &amp;amp;maxval);
        // Ei = exp of each Pi - maxPi
        M -= maxval;
        cv::exp(M, M);
        // sum of exp(Pi - maxPi) over the map
        const Scalar s = cv::sum(M);
        // divide each element Ei by the sum of Ei
        const double scale = 1.0 / (s.val[0] + 1.e-6);
        M *= scale;
    }
&lt;/CODE&gt;&lt;P&gt;Have a Good Day!&lt;/P&gt;&lt;P&gt;/B&lt;/P&gt;&lt;P&gt;NB: It would be nice if anyone could confirm that my use of the softmax in NCSDK is correct. I know that for two classes (i.e. binary) I could use a Sigmoid layer, but my network _will_ have more than two classes when it's up and running.&lt;/P&gt;</description>
    <pubDate>Thu, 25 Oct 2018 16:50:40 GMT</pubDate>
    <dc:creator>idata</dc:creator>
    <dc:date>2018-10-25T16:50:40Z</dc:date>
    <item>
      <title>softmax layer, 60 ms on 0.01 MFLOPS</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665909#M3534</link>
      <description>&lt;P&gt;Hello Movis&lt;/P&gt;&lt;P&gt;I have a very simple Fully Convolutional network (converting an input of shape [1, 2, 256, 256] to a pmap of [1, 2, 42, 42]).&lt;/P&gt;&lt;P&gt;It uses 3x3 and 5x5 layers and finishes off with a Softmax layer. The softmax layer receives a map of shape [1, 2, 42, 42].&lt;/P&gt;&lt;P&gt;The network, confirmed and working on the Movidius NC, has &amp;lt; 1 GFLOPs, and I profile it with&lt;/P&gt;&lt;CODE&gt;mvNCProfile model.prototxt -w model.caffemodel -s 12
&lt;/CODE&gt;&lt;P&gt;The total inference time is reported to be 132 ms, with 58 ms spent in the last layer doing a softmax of complexity 0.010584 MFLOPs!&lt;/P&gt;&lt;P&gt;This can't be true... I guess I'm missing some parameter?&lt;/P&gt;&lt;P&gt;Thanks! And have a Nice one&lt;/P&gt;&lt;P&gt;/B&lt;/P&gt;&lt;P&gt;A minimal prototxt example which (on my computer) confirms this is:&lt;/P&gt;&lt;P&gt;(I installed the full Movidius NCSDK on a clean Ubuntu system; mvNCProfile --version reports v02.00.)&lt;/P&gt;&lt;CODE&gt;    name: "CNN_test_movidius_softmax"
    input: "data"
    input_shape {
      dim: 1
      dim: 2
      dim: 42
      dim: 42
    }        
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "data"
      top: "prob"
    }
&lt;/CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A call to mvNCProfile test_softmax.prototxt -s 12 results in:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;CODE&gt;Detailed Per Layer Profile
                                                               Bandwidth   time
#   Name                                                 MFLOPs  (MB/s)    (ms)
===============================================================================
0   input                                                   0.0     0.0   0.002
1   prob                                                    0.0     0.1  58.205
-------------------------------------------------------------------------------
                                   Total inference time                   58.21
-------------------------------------------------------------------------------
&lt;/CODE&gt;</description>
      <pubDate>Tue, 23 Oct 2018 22:49:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665909#M3534</guid>
      <dc:creator>idata</dc:creator>
      <dc:date>2018-10-23T22:49:27Z</dc:date>
    </item>
    <item>
      <title>Re: softmax layer, 60 ms on 0.01 MFLOPS</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665910#M3535</link>
      <description>&lt;P&gt;Solution: I skip the Softmax layer and calculate it in the program.&lt;/P&gt;&lt;P&gt;It is NOT a solution but a hack.&lt;/P&gt;&lt;P&gt;My hack needs some magic scaling to be identical to the 'true' solution.&lt;/P&gt;&lt;P&gt;Hack:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Return the layer before the Softmax (just edit the .prototxt and remove the 'prob' layer).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Calculate the softmax on the host (THIS IS A WRONG VERSION, it does not scale right!).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Save approx. 50 ms.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;CODE&gt;for (Mat &amp;amp;M : probability_maps)
    {
        // max over the whole map
        double minval, maxval;
        cv::minMaxIdx(M, &amp;amp;minval, &amp;amp;maxval);
        // Ei = exp of each Pi - maxPi
        M -= maxval;
        cv::exp(M, M);
        // sum of exp(Pi - maxPi) over the map
        const Scalar s = cv::sum(M);
        // divide each element Ei by the sum of Ei
        const double scale = 1.0 / (s.val[0] + 1.e-6);
        M *= scale;
    }
&lt;/CODE&gt;&lt;P&gt;Have a Good Day!&lt;/P&gt;&lt;P&gt;/B&lt;/P&gt;&lt;P&gt;NB: It would be nice if anyone could confirm that my use of the softmax in NCSDK is correct. I know that for two classes (i.e. binary) I could use a Sigmoid layer, but my network _will_ have more than two classes when it's up and running.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Oct 2018 16:50:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665910#M3535</guid>
      <dc:creator>idata</dc:creator>
      <dc:date>2018-10-25T16:50:40Z</dc:date>
    </item>
    <item>
      <title>Re: softmax layer, 60 ms on 0.01 MFLOPS</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665911#M3536</link>
      <description>&lt;P&gt;// The previous post contains some code. That code is... NOT correct. (I do not know on what data I verified it... but SORRY if you copy-pasted it and ran it, bewildered.)&lt;/P&gt;&lt;P&gt;An even more correct version of a softmax layer:&lt;/P&gt;&lt;CODE&gt;// exponentiate each channel map
  for (cv::Mat &amp;amp;M : probability_maps)
    {
      // for numerical stability, the per-pixel max across channels
      // should be subtracted first: M -= max;
      cv::exp(M, M);
    }

  // per-pixel sum of the exponentials over all channels
  cv::Mat Sum = probability_maps[0].clone();
  for (unsigned int i = 1; i &amp;lt; probability_maps.size(); i++)
    {
      Sum += probability_maps[i];
    }

  // divide each pixel by the sum, i.e. scale the probabilities to [0, 1]
  for (cv::Mat &amp;amp;M : probability_maps)
    {
      cv::divide(M, Sum, M);
    }
&lt;/CODE&gt;&lt;P&gt;... I still haven't found out WHY Movidius can't handle a softmax on e.g. a 50x50, 2-channel layer.&lt;/P&gt;</description>
      <pubDate>Mon, 22 Apr 2019 17:19:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/softmax-layer-60-ms-on-0-01-MFLOPS/m-p/665911#M3536</guid>
      <dc:creator>idata</dc:creator>
      <dc:date>2019-04-22T17:19:37Z</dc:date>
    </item>
  </channel>
</rss>

