<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Intel VML slow in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801854#M3133</link>
    <description>Hello Sdgkgp,&lt;BR /&gt;&lt;BR /&gt;You seem to be using the .C function in your R application:&lt;BR /&gt;t &amp;lt;- .C("get_mkl_log", dB = out_vec, Blen = as.integer(N), dA = in_vec, Alen = as.integer(N) ) # actual call&lt;BR /&gt;&lt;BR /&gt;According to Section 5.2 of the document "Writing R Extensions", available at &lt;A href="http://cran.r-project.org/doc/manuals/R-exts.html"&gt;http://cran.r-project.org/doc/manuals/R-exts.html&lt;/A&gt;, the .C function can introduce additional argument-checking overhead: "Unless formal argument NAOK is true, all the other arguments are checked for missing values NA and for the IEEE special values NaN, Inf and -Inf, and the presence of any of these generates an error." &lt;BR /&gt;&lt;BR /&gt;You might want to try the .External or .Call functions as an alternative. Hope this helps.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Andrey&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Thu, 24 Feb 2011 10:42:34 GMT</pubDate>
    <dc:creator>Andrey_N_Intel</dc:creator>
    <dc:date>2011-02-24T10:42:34Z</dc:date>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801842#M3121</link>
      <description>Hi, &lt;BR /&gt;&lt;BR /&gt;I am a newbie, so please bear with me if I provide irrelevant details.&lt;BR /&gt;&lt;BR /&gt;I am trying to achieve the speeds reported in:&lt;BR /&gt;&lt;A href="http://software.intel.com/sites/products/documentation/hpc/mkl/vml/functions/_performanceall.html" target="_blank"&gt;http://software.intel.com/sites/products/documentation/hpc/mkl/vml/functions/_performanceall.html&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;for the log function vsLn().&lt;BR /&gt;&lt;BR /&gt;My simple C script contains just one call to vsLn().&lt;BR /&gt;&lt;BR /&gt;I compile it on Windows using:&lt;BR /&gt;&lt;BR /&gt;g++ -I"C:/PROGRA~1/R/R-212~1.1/include" -I"C:/Progra~1/Intel/ComposerXE-2011/mkl/include" -O2 -Wall -c MKLvml_main.cc -o MKLvml_main.o&lt;BR /&gt;&lt;BR /&gt;g++ -shared -s -static-libgcc -o MKLvml.dll tmp.def MKLvml_main.o C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_intel_c_dll.lib C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_sequential_dll.lib C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_core_dll.lib C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_rt.lib -LC:/PROGRA~1/R/R-212~1.1/bin/i386 -lR&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;As you can see, I am using the sequential library. I also tried the parallel one and the results are the same.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Can someone please suggest what I can do to improve the speed?&lt;BR /&gt;&lt;BR /&gt;Currently 10^8 log operations (in a loop of 10^3 iterations, each computing the log of a 10^5 long vector) take around 6 s. The expected time is less than 0.5 s.&lt;BR /&gt;&lt;BR /&gt;(The results I am getting are just a 2x improvement over the default log calculation. I am working inside R, just FYI.)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 14 Feb 2011 23:06:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801842#M3121</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-14T23:06:34Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801843#M3122</link>
      <description>Hi sdgkgp!&lt;BR /&gt;&lt;BR /&gt;Could you provide a few more details?&lt;BR /&gt;1) Your sample program would be helpful for us.&lt;BR /&gt;2) Which CPU are you running your sample on?&lt;BR /&gt;&lt;BR /&gt;Andrey</description>
      <pubDate>Tue, 15 Feb 2011 06:36:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801843#M3122</guid>
      <dc:creator>Andrey_G_Intel2</dc:creator>
      <dc:date>2011-02-15T06:36:16Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801844#M3123</link>
      <description>We also need to know the exact version of MKL you are using. Could you please let us know the Package ID?&lt;DIV&gt;You can find it in the mklsupport.txt file ( &amp;lt;COMPOSER_XE_INSTALL_DIR&amp;gt;\Documentation\ )&lt;/DIV&gt;&lt;DIV&gt;--Gennady&lt;/DIV&gt;</description>
      <pubDate>Tue, 15 Feb 2011 06:59:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801844#M3123</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2011-02-15T06:59:28Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801845#M3124</link>
      <description>Hi Andrey,&lt;BR /&gt;&lt;BR /&gt;1) My C code is as follows:&lt;BR /&gt;&lt;BR /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;BR /&gt;#include "R.h"&lt;BR /&gt;#include "Rmath.h"&lt;BR /&gt;#include "mkl_vml.h"&lt;BR /&gt;#include "mkl.h"&lt;BR /&gt;&lt;BR /&gt;extern "C" {&lt;BR /&gt;&lt;BR /&gt;// Called from R via .C(): fA is the input vector, fB receives the result&lt;BR /&gt;void get_mkl_log(float *fB, int *Blen, float *fA, int *Alen){&lt;BR /&gt;&lt;BR /&gt;vmlSetMode(VML_EP); // enhanced-performance mode&lt;BR /&gt;MKL_INT vec_len = Alen[0];&lt;BR /&gt;vsLn(vec_len, fA, fB);&lt;BR /&gt;&lt;BR /&gt;return;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;As you can see, there are some R header files, which enable R to talk to C++.&lt;BR /&gt;&lt;BR /&gt;2) I am using an Intel Core 2 Quad CPU Q9400 @ 2.66GHz&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Please let me know what more details I can provide.</description>
      <pubDate>Tue, 15 Feb 2011 16:15:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801845#M3124</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-15T16:15:36Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801846#M3125</link>
      <description>Hello Gennady,&lt;BR /&gt;&lt;BR /&gt;It is &lt;BR /&gt;&lt;BR /&gt;Package ID: w_mkl_10.3.2.154 w_ccompxe_2011.2.154 w_fcompxe_2011.2.154&lt;BR /&gt;&lt;BR /&gt;Thanks again for looking into this. Looking forward to your reply.</description>
      <pubDate>Tue, 15 Feb 2011 16:17:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801846#M3125</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-15T16:17:13Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801847#M3126</link>
      <description>&lt;P&gt;sdgkgp,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Could you provide a full example? It will help us give an exact and quick answer. We also need to know how you fill the input vector, how you are doing the performance measurements, etc.&lt;BR /&gt;&lt;BR /&gt;Andrey&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2011 16:44:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801847#M3126</guid>
      <dc:creator>Andrey_G_Intel2</dc:creator>
      <dc:date>2011-02-15T16:44:36Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801848#M3127</link>
      <description>As I mentioned, this is done inside R:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;dyn.load("C:/RPackages/MKLvml/src/MKLvml.dll")&lt;BR /&gt;N = 1e3&lt;BR /&gt;in_vec = as.single( runif(N) ) # generates random uniform numbers between 0 and 1&lt;BR /&gt;out_vec = as.single( vector("numeric",N) ) # allocates memory for out_vec&lt;BR /&gt;&lt;BR /&gt;system.time( # for performance measurement (time taken)&lt;BR /&gt;for (i in 1:1e5) &lt;BR /&gt; { &lt;BR /&gt; t &amp;lt;- .C("get_mkl_log", dB = out_vec, Blen = as.integer(N), dA = in_vec, Alen = as.integer(N) ) # actual call&lt;BR /&gt; }&lt;BR /&gt;)&lt;BR /&gt;&lt;BR /&gt;The output I get is:&lt;BR /&gt;&lt;BR /&gt;user system elapsed &lt;BR /&gt; 4.53 0.00 4.54&lt;BR /&gt;&lt;BR /&gt;which means 4.54 s were taken by the core process.&lt;BR /&gt;</description>
      <pubDate>Tue, 15 Feb 2011 16:52:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801848#M3127</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-15T16:52:46Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801849#M3128</link>
      <description>The example above shows that the computation is running at an effective rate of:&lt;BR /&gt;&lt;BR /&gt;1e8 * 3.01 / 4.54 = 0.066 GHz&lt;BR /&gt;&lt;BR /&gt;while my CPU runs at 2.66 GHz.&lt;BR /&gt;&lt;BR /&gt;( 1e8 log operations, each consuming 3.01 cycles as given in the performance docs for vsLn in EP mode )</description>
      <pubDate>Tue, 15 Feb 2011 16:57:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801849#M3128</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-15T16:57:43Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801850#M3129</link>
      <description>We will try to reproduce your situation. But I can already say that you did not measure vsLn performance alone. Your measurements also include the overhead of loading the MKL DLLs, the call to vmlSetMode, and possibly other overheads.&lt;BR /&gt;&lt;BR /&gt;Andrey&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Feb 2011 08:27:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801850#M3129</guid>
      <dc:creator>Andrey_G_Intel2</dc:creator>
      <dc:date>2011-02-16T08:27:56Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801851#M3130</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;Your link line:&lt;BR /&gt;&lt;BR /&gt;g++ -shared -s -static-libgcc -o MKLvml.dll tmp.def MKLvml_main.o C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_intel_c_dll.lib C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/&lt;STRONG&gt;mkl_sequential_dll.lib&lt;/STRONG&gt; C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/mkl_core_dll.lib C:/Progra~1/Intel/ComposerXE-2011/mkl/lib/ia32/&lt;STRONG&gt;mkl_rt.lib&lt;/STRONG&gt; -LC:/PROGRA~1/R/R-212~1.1/bin/i386 -lR&lt;BR /&gt;&lt;BR /&gt;uses the sequential library together with mkl_rt :( You'd better use a single linking model.&lt;BR /&gt;Please try the &lt;A href="http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/"&gt;MKL Link Line Advisor&lt;/A&gt;.&lt;BR /&gt;&lt;BR /&gt;Also, to eliminate the overhead of loading dynamic libraries, please use only static libraries if possible.</description>
      <pubDate>Wed, 16 Feb 2011 09:10:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801851#M3130</guid>
      <dc:creator>barragan_villanueva_</dc:creator>
      <dc:date>2011-02-16T09:10:09Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801852#M3131</link>
      <description>Thank you everyone for your answers.&lt;BR /&gt;&lt;BR /&gt;I was making a mistake in my performance evaluation.&lt;BR /&gt;&lt;BR /&gt;It turns out R has a lot of overhead when communicating data to C, and that is why it is so slow.&lt;BR /&gt;&lt;BR /&gt;When I measure the timing from inside C, the numbers match those reported in the performance docs.</description>
      <pubDate>Wed, 16 Feb 2011 22:13:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801852#M3131</guid>
      <dc:creator>sdgkgp</dc:creator>
      <dc:date>2011-02-16T22:13:01Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801853#M3132</link>
      <description>Hi sdgkgp,&lt;BR /&gt;&lt;BR /&gt;It still makes sense to understand why R's overhead when calling the third-party DLL is so large. We will experiment on our side and report back. If you also have some interesting findings on your side, we will be happy if you let us know about them.&lt;BR /&gt;&lt;BR /&gt;Many thanks for your interest,&lt;BR /&gt;Sergey</description>
      <pubDate>Thu, 17 Feb 2011 07:48:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801853#M3132</guid>
      <dc:creator>Sergey_M_Intel2</dc:creator>
      <dc:date>2011-02-17T07:48:16Z</dc:date>
    </item>
    <item>
      <title>Intel VML slow</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801854#M3133</link>
      <description>Hello Sdgkgp,&lt;BR /&gt;&lt;BR /&gt;You seem to be using the .C function in your R application:&lt;BR /&gt;t &amp;lt;- .C("get_mkl_log", dB = out_vec, Blen = as.integer(N), dA = in_vec, Alen = as.integer(N) ) # actual call&lt;BR /&gt;&lt;BR /&gt;According to Section 5.2 of the document "Writing R Extensions", available at &lt;A href="http://cran.r-project.org/doc/manuals/R-exts.html"&gt;http://cran.r-project.org/doc/manuals/R-exts.html&lt;/A&gt;, the .C function can introduce additional argument-checking overhead: "Unless formal argument NAOK is true, all the other arguments are checked for missing values NA and for the IEEE special values NaN, Inf and -Inf, and the presence of any of these generates an error." &lt;BR /&gt;&lt;BR /&gt;You might want to try the .External or .Call functions as an alternative. Hope this helps.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Andrey&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 24 Feb 2011 10:42:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-VML-slow/m-p/801854#M3133</guid>
      <dc:creator>Andrey_N_Intel</dc:creator>
      <dc:date>2011-02-24T10:42:34Z</dc:date>
    </item>
  </channel>
</rss>

