<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MKL can't get any scaling in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937108#M14187</link>
    <description>What settings are you using for OMP_NUM_THREADS and KMP_SERIAL?&lt;BR /&gt;Are you asking all the threads you created to share the same memory regions, and asking MKL to create as many additional threads as possible?</description>
    <pubDate>Tue, 04 Apr 2006 03:52:41 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2006-04-04T03:52:41Z</dc:date>
    <item>
      <title>MKL can't get any scaling</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937106#M14185</link>
      <description>#include &amp;lt;cstdlib&amp;gt;&lt;BR /&gt;#include &amp;lt;iostream&amp;gt;&lt;BR /&gt;#include &amp;lt;omp.h&amp;gt;&lt;BR /&gt;#include "mkl.h"&lt;BR /&gt;&lt;BR /&gt;using namespace std;&lt;BR /&gt;&lt;BR /&gt;int main(){&lt;BR /&gt;int len = 1500;&lt;BR /&gt;double* m1;&lt;BR /&gt;double* m2;&lt;BR /&gt;double* m3;&lt;BR /&gt;double t0, tf, tm1, time;&lt;BR /&gt;int i, procs;&lt;BR /&gt;&lt;BR /&gt;// repeat the test with 1 to 4 threads&lt;BR /&gt;for (procs = 1; procs &amp;lt; 4+1; procs++){&lt;BR /&gt;if (procs%1==0 || procs==1){&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads(procs);&lt;BR /&gt;m1 = (double*)malloc(len*len*sizeof(double));&lt;BR /&gt;m2 = (double*)malloc(len*len*sizeof(double));&lt;BR /&gt;m3 = (double*)malloc(len*len*sizeof(double));&lt;BR /&gt;&lt;BR /&gt;#pragma omp parallel for&lt;BR /&gt;for (i = 0; i &amp;lt; len*len; i++){&lt;BR /&gt;m1[i] = (i%10)-5;&lt;BR /&gt;m2[i] = (i%7)-3.5;&lt;BR /&gt;m3[i] = 0;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;// time only the matrix multiplication&lt;BR /&gt;t0 = omp_get_wtime();&lt;BR /&gt;cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, len, len, len, 1.0, m1, len, m2, len, 0.0, m3, len);&lt;BR /&gt;tf = omp_get_wtime();&lt;BR /&gt;time = tf-t0;&lt;BR /&gt;if (procs == 1) { tm1 = time; }&lt;BR /&gt;cout &amp;lt;&amp;lt; "Elapsed time: " &amp;lt;&amp;lt; time &amp;lt;&amp;lt; "\t - " &amp;lt;&amp;lt; procs &amp;lt;&amp;lt; " threads loop\t ratio:" &amp;lt;&amp;lt; time/tm1 &amp;lt;&amp;lt; endl;&lt;BR /&gt;&lt;BR /&gt;free(m1);&lt;BR /&gt;free(m2);&lt;BR /&gt;free(m3);&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;exit(0);&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;To compile:&lt;BR /&gt;/opt/intel/cc/9.0/bin/icc -openmp mklTest2.cxx -lmkl -L /opt/intel/mkl/8.0/lib/32/ -I /opt/intel/mkl/8.0/include/&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Timings I got on a 32p machine:&lt;BR /&gt;./a.out&lt;BR /&gt;Elapsed time: 1.19079 - 1 threads loop ratio:1&lt;BR /&gt;Elapsed time: 1.18762 - 2 threads loop ratio:0.997338&lt;BR /&gt;Elapsed time: 1.18804 - 3 threads loop ratio:0.997687&lt;BR /&gt;Elapsed time: 1.21605 - 4 threads loop ratio:1.02121</description>
      <pubDate>Tue, 04 Apr 2006 03:03:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937106#M14185</guid>
      <dc:creator>joan_puig</dc:creator>
      <dc:date>2006-04-04T03:03:51Z</dc:date>
    </item>
    <item>
      <title>Re: MKL can't get any scaling</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937107#M14186</link>
      <description>It looks like the code performs well on 1 processor, but it doesn't scale at all beyond that.&lt;BR /&gt;&lt;BR /&gt;Is there a switch I need to enable so that MKL will be multithreaded? If there isn't, is there something simple I am missing in my code?&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;&lt;BR /&gt;Joan</description>
      <pubDate>Tue, 04 Apr 2006 03:06:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937107#M14186</guid>
      <dc:creator>joan_puig</dc:creator>
      <dc:date>2006-04-04T03:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: MKL can't get any scaling</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937108#M14187</link>
      <description>What settings are you using for OMP_NUM_THREADS and KMP_SERIAL?&lt;BR /&gt;Are you asking all the threads you created to share the same memory regions, and asking MKL to create as many additional threads as possible?</description>
      <pubDate>Tue, 04 Apr 2006 03:52:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937108#M14187</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2006-04-04T03:52:41Z</dc:date>
    </item>
    <item>
      <title>Re: MKL can't get any scaling</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937109#M14188</link>
      <description>Hi Tim, thanks for your reply; it pointed me to what I needed to change to make it all work. Now, I think this might be an MKL bug:&lt;BR /&gt;&lt;BR /&gt;I don't set OMP_NUM_THREADS.&lt;BR /&gt;My code uses omp_set_num_threads().&lt;BR /&gt;It seems, though, that unless OMP_NUM_THREADS is set to something at the beginning of the program, MKL won't honor any later calls to omp_set_num_threads().&lt;BR /&gt;If I take out the call to the MKL function, the plain OpenMP for loop is actually parallelized well.&lt;BR /&gt;&lt;BR /&gt;[jpuig@altix jpuig]$ export -n OMP_NUM_THREADS&lt;BR /&gt;[jpuig@altix jpuig]$ ./a.out&lt;BR /&gt;Elapsed time: 1.8064 - 1 threads loop ratio:1&lt;BR /&gt;Elapsed time: 1.79981 - 2 threads loop ratio:0.996353&lt;BR /&gt;Elapsed time: 1.85461 - 3 threads loop ratio:1.02669&lt;BR /&gt;Elapsed time: 1.82016 - 4 threads loop ratio:1.00762&lt;BR /&gt;[jpuig@altix jpuig]$ export OMP_NUM_THREADS=4&lt;BR /&gt;[jpuig@altix jpuig]$ ./a.out&lt;BR /&gt;Elapsed time: 1.84285 - 1 threads loop ratio:1&lt;BR /&gt;Elapsed time: 0.929641 - 2 threads loop ratio:0.504457&lt;BR /&gt;Elapsed time: 0.62401 - 3 threads loop ratio:0.338611&lt;BR /&gt;Elapsed time: 0.476085 - 4 threads loop ratio:0.258341&lt;BR /&gt;[jpuig@altix jpuig]$&lt;P&gt;Message Edited by joan.puig@gmail.com on &lt;SPAN class="date_text"&gt;04-03-2006&lt;/SPAN&gt;&lt;SPAN class="time_text"&gt;02:58 PM&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2006 04:53:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-can-t-get-any-scaling/m-p/937109#M14188</guid>
      <dc:creator>joan_puig</dc:creator>
      <dc:date>2006-04-04T04:53:20Z</dc:date>
    </item>
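    <!--
    The "ratio" column printed by the test program is time/tm1, so smaller values mean better scaling.
    As a rough sketch of how the posted timings could be read, the snippet below turns the elapsed times
    from the OMP_NUM_THREADS=4 run into speedup (t1 / tp) and parallel efficiency (speedup / threads).
    The helper names speedup and efficiency are illustrative assumptions, not part of the original program
    or of MKL; the numbers are copied from the thread, not re-measured.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    // Speedup of a p-thread run relative to the 1-thread run: t1 / tp.
    // (Hypothetical helper, introduced only for this illustration.)
    double speedup(double t1, double tp) { return t1 / tp; }

    // Parallel efficiency: speedup divided by the thread count (1.0 is ideal).
    // (Hypothetical helper, introduced only for this illustration.)
    double efficiency(double t1, double tp, int threads) {
        return speedup(t1, tp) / threads;
    }

    int main() {
        // Elapsed times in seconds, copied from the OMP_NUM_THREADS=4 run in the thread;
        // index i holds the time measured with i + 1 threads.
        const std::vector<double> elapsed = {1.84285, 0.929641, 0.62401, 0.476085};
        assert(!elapsed.empty());

        for (std::size_t i = 0; i < elapsed.size(); ++i) {
            const int threads = static_cast<int>(i) + 1;
            std::cout << threads << " threads: speedup "
                      << speedup(elapsed[0], elapsed[i]) << ", efficiency "
                      << efficiency(elapsed[0], elapsed[i], threads) << '\n';
        }
        return 0;
    }
    ```

    Applied to the run without OMP_NUM_THREADS set, the same arithmetic gives a speedup close to 1 for every
    thread count, which matches the "no scaling" symptom described in the first post.
    -->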
  </channel>
</rss>

