<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to set affinity of threads spawned by MKL? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884467#M9921</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;The MKL User's Guide has a section with examples of setting the affinity mask by means of the operating system. The section should be named something like "Managing Performance and Memory&amp;gt;Tips and Techniques to Improve Performance&amp;gt;Managing Multi-Core Performance". Keep in mind that the affinity mask is a per-thread attribute (on Linux, at least), so it should be set &lt;EM&gt;after&lt;/EM&gt; the top-level OpenMP threads are started.&lt;BR /&gt;&lt;BR /&gt;Hope this helps&lt;BR /&gt;Thanks&lt;BR /&gt;Dima</description>
    <pubDate>Wed, 10 Jun 2009 05:55:12 GMT</pubDate>
    <dc:creator>Dmitry_B_Intel</dc:creator>
    <dc:date>2009-06-10T05:55:12Z</dc:date>
    <item>
      <title>How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884463#M9917</link>
      <description>I have a program which invokes MKL from within an OpenMP parallel region. It sets $MKL_DYNAMIC and $MKL_NUM_THREADS so that MKL will exploit nested parallelism, and calls MKL to work on different sets of data from different OpenMP threads. Is it possible to set the affinity mask of threads spawned by MKL from a specific function call?</description>
      <pubDate>Tue, 09 Jun 2009 19:23:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884463#M9917</guid>
      <dc:creator>styc</dc:creator>
      <dc:date>2009-06-09T19:23:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884464#M9918</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
You may be able to set the environment variable KMP_AFFINITY or GOMP_AFFINITY prior to the parallel region. I don't think this will be effective when MKL_DYNAMIC is set. If these are your questions, it would be good to have an answer from the library experts.&lt;BR /&gt;I'm wondering why I don't find documentation on KMP_AFFINITY=physical, which appears to be the favored setting for HyperThreading.&lt;BR /&gt;</description>
      <pubDate>Tue, 09 Jun 2009 21:00:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884464#M9918</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-06-09T21:00:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884465#M9919</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/367365"&gt;tim18&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; You may be able to set the environment variable KMP_AFFINITY or GOMP_AFFINITY prior to the parallel region. I don't think this will be effective when MKL_DYNAMIC is set. If these are your questions, it would be good to have an answer from the library experts.&lt;BR /&gt;I'm wondering why I don't find documentation on KMP_AFFINITY=physical, which appears to be the favored setting for HyperThreading.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
My program sets MKL_DYNAMIC to FALSE. I basically try to avoid KMP_AFFINITY because it doesn't seem to work on AMD machines. What I hope to see is that threads executing a call to MKL either inherit the affinity mask of the calling OpenMP thread or can have their affinity masks specified (perhaps through some sched_setaffinity magic?).&lt;BR /&gt;</description>
      <pubDate>Tue, 09 Jun 2009 22:32:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884465#M9919</guid>
      <dc:creator>styc</dc:creator>
      <dc:date>2009-06-09T22:32:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884466#M9920</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/404979"&gt;styc&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
My program sets MKL_DYNAMIC to FALSE. I basically try to avoid KMP_AFFINITY because it doesn't seem to work on AMD machines. What I hope to see is that threads executing a call to MKL either inherit the affinity mask of the calling OpenMP thread or can have their affinity masks specified (perhaps through some sched_setaffinity magic?).&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
OK, then MKL_DYNAMIC should not be interfering. When I set KMP_AFFINITY=compact,0,verbose with the 10.1 compiler on a recent AMD machine, it gives me the non-support message, but tells me it is setting affinity as if there are 8 single core CPUs. This is effectively the same as taskset -c 0-7, as far as I can see. I don't see any reasonable behavior other than for the same affinity mask to persist in the nested OpenMP. According to the doc, sched_setaffinity() would be the mechanism used for KMP_AFFINITY, so what you see by sched_getaffinity() should be what MKL is using under OMP_NESTED, subject to its own determination of how many additional threads to use.&lt;BR /&gt;I agree with your implication that failing to support affinity mask in a similar way on Intel and AMD platforms would be a serious deficiency.&lt;BR /&gt;</description>
      <pubDate>Tue, 09 Jun 2009 22:56:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884466#M9920</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-06-09T22:56:56Z</dc:date>
    </item>
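<!--
Editor's sketch: the reply above notes that sched_getaffinity() should report the mask MKL's threads actually run under. A minimal way to inspect that mask from inside any thread is sketched below, assuming Linux with glibc. The helper name describe_affinity is hypothetical and is not part of MKL or OpenMP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes cpu_set_t and the CPU_* macros in <sched.h> */
#endif
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: write the CPUs the calling thread may run on into buf
 * as a space-separated list such as "0 1 2 3". The list is truncated if buf
 * is too small. Returns the number of allowed CPUs, or -1 on failure. */
int describe_affinity(char *buf, size_t buflen)
{
    cpu_set_t mask;
    size_t used = 0;
    int cpuid, count = 0;

    assert(buf != NULL);
    /* pid 0 means "the calling thread" for sched_getaffinity */
    if (buflen == 0 || sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    buf[0] = '\0';
    for (cpuid = 0; cpuid < CPU_SETSIZE; cpuid++) {
        if (!CPU_ISSET(cpuid, &mask))
            continue;
        count++;
        if (used < buflen) {
            /* snprintf always terminates the string when the size is nonzero */
            int n = snprintf(buf + used, buflen - used, "%s%d",
                             used ? " " : "", cpuid);
            if (n > 0)
                used += (size_t)n;
        }
    }
    return count;
}
```

Calling this from each OpenMP thread immediately before and after an MKL call, and printing the result together with the OpenMP thread number, is one way to see whether the mask you set is still the one in effect.
-->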
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884467#M9921</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;The MKL User's Guide has a section with examples of setting the affinity mask by means of the operating system. The section should be named something like "Managing Performance and Memory&amp;gt;Tips and Techniques to Improve Performance&amp;gt;Managing Multi-Core Performance". Keep in mind that the affinity mask is a per-thread attribute (on Linux, at least), so it should be set &lt;EM&gt;after&lt;/EM&gt; the top-level OpenMP threads are started.&lt;BR /&gt;&lt;BR /&gt;Hope this helps&lt;BR /&gt;Thanks&lt;BR /&gt;Dima</description>
      <pubDate>Wed, 10 Jun 2009 05:55:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884467#M9921</guid>
      <dc:creator>Dmitry_B_Intel</dc:creator>
      <dc:date>2009-06-10T05:55:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884468#M9922</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/93647"&gt;Dmitry Baksheev (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; &lt;BR /&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;The MKL User's Guide has a section with examples of setting the affinity mask by means of the operating system. The section should be named something like "Managing Performance and Memory&amp;gt;Tips and Techniques to Improve Performance&amp;gt;Managing Multi-Core Performance". Keep in mind that the affinity mask is a per-thread attribute (on Linux, at least), so it should be set &lt;EM&gt;after&lt;/EM&gt; the top-level OpenMP threads are started.&lt;BR /&gt;&lt;BR /&gt;Hope this helps&lt;BR /&gt;Thanks&lt;BR /&gt;Dima&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
I tried that, but it did not quite work. I pinned an OpenMP thread to a core (the other threads were simply left waiting on a "#pragma omp barrier"), then called DGEMM from it and expected all MKL threads to be confined to that one core. But MKL did not seem to honor the affinity mask I set: the threads were spread over all cores. Of course this looks crazy. But given that, I really don't know what to do so that, on a dual-socket quad-core machine, one (physical) processor handles one DGEMM call and the other processor handles another call from inside the same parallel region.&lt;BR /&gt;</description>
      <pubDate>Fri, 12 Jun 2009 01:24:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884468#M9922</guid>
      <dc:creator>styc</dc:creator>
      <dc:date>2009-06-12T01:24:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884469#M9923</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Hi styc,&lt;BR /&gt;&lt;BR /&gt;The instructions in the MKL User's Guide seem to be incomplete. The code snippet there apparently does not identify the thread correctly: instead of getpid() one should use syscall(SYS_gettid). Another issue is that the OpenMP layer applies affinity in terms of OpenMP threads, while those are dynamically mapped to OS threads. This issue can be worked around by setting the environment variable KMP_AFFINITY=disabled (see &lt;A href="http://www.intel.com/software/products/compilers/docs/flin/main_for/mergedprojects/optaps_for/common/optaps_openmp_thread_affinity.htm"&gt;Thread Affinity Interface&lt;/A&gt;). This may have performance implications, though; I don't know.&lt;BR /&gt;&lt;BR /&gt;In summary, could you try this function for binding the current thread to CPUs?&lt;BR /&gt;&lt;BR /&gt;
&lt;P&gt;// Handle up to 32 cpus&lt;BR /&gt;void bind_me_to(unsigned cpumask)&lt;BR /&gt;{&lt;BR /&gt; cpu_set_t mask;&lt;BR /&gt; pid_t tid = syscall(SYS_gettid);&lt;BR /&gt; int cpuid;&lt;/P&gt;
&lt;P&gt; CPU_ZERO(&amp;amp;mask);&lt;BR /&gt; for (cpuid=0; cpuid &amp;lt; 32; cpuid++)&lt;BR /&gt; {&lt;BR /&gt; if (cpumask &amp;amp; (1u &amp;lt;&amp;lt; cpuid)) CPU_SET(cpuid, &amp;amp;mask);&lt;BR /&gt; }&lt;BR /&gt; sched_setaffinity(tid, sizeof(mask), &amp;amp;mask);&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;This function is assumed to be called in the following setup, if I understood you correctly (ensure the environment variables OMP_DYNAMIC=false and MKL_DYNAMIC=false are set, to allow MKL to use threads in nested parallel regions):&lt;BR /&gt;&lt;BR /&gt;#pragma omp parallel default(shared) num_threads(2)&lt;BR /&gt; {&lt;BR /&gt; int omp_tid = omp_get_thread_num();&lt;BR /&gt; omp_set_nested(1); // nested parallel regions should be enabled&lt;BR /&gt; if (omp_tid==0)&lt;BR /&gt; {&lt;BR /&gt; bind_me_to(0x0f); // four threads on one socket&lt;BR /&gt; omp_set_num_threads(4);&lt;BR /&gt; do_dgemm();&lt;BR /&gt; }&lt;BR /&gt; if (omp_tid==1)&lt;BR /&gt; {&lt;BR /&gt; bind_me_to(0xf0); // four threads on another socket&lt;BR /&gt; omp_set_num_threads(4);&lt;BR /&gt; do_fft();&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;I hope this will help&lt;BR /&gt;Thanks&lt;BR /&gt;Dima&lt;/P&gt;</description>
      <pubDate>Mon, 15 Jun 2009 07:20:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884469#M9923</guid>
      <dc:creator>Dmitry_B_Intel</dc:creator>
      <dc:date>2009-06-15T07:20:42Z</dc:date>
    </item>
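<!--
Editor's sketch: the bind_me_to() function in the post above omits its headers and discards the result of sched_setaffinity(). A self-contained variant might look like the following, assuming Linux with glibc. The int return value and the helper first_allowed_cpu are additions for illustration and are not from the original post.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes cpu_set_t, CPU_* macros, sched_setaffinity */
#endif
#include <assert.h>
#include <limits.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Bind the calling thread to the CPUs whose bits are set in cpumask.
 * Handles CPUs 0..31 only. Returns 0 on success, -1 on failure (errno is set). */
int bind_me_to(unsigned cpumask)
{
    cpu_set_t mask;
    pid_t tid = (pid_t)syscall(SYS_gettid);   /* kernel thread id, not getpid() */
    int cpuid;

    /* The shifts below need unsigned to be at least 32 bits wide. */
    assert(sizeof(unsigned) * CHAR_BIT >= 32);

    CPU_ZERO(&mask);
    for (cpuid = 0; cpuid < 32; cpuid++) {
        if (cpumask & (1u << cpuid))
            CPU_SET(cpuid, &mask);
    }
    return sched_setaffinity(tid, sizeof(mask), &mask);
}

/* Hypothetical helper: lowest CPU in 0..31 that the calling thread is
 * currently allowed to run on, or -1 if there is none or the query fails. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;
    int cpuid;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (cpuid = 0; cpuid < 32; cpuid++) {
        if (CPU_ISSET(cpuid, &mask))
            return cpuid;
    }
    return -1;
}
```

Checking the return value matters here because sched_setaffinity() fails with EINVAL when the requested mask contains no CPU the thread is permitted to use, for example under a restrictive cpuset; in that case the thread silently keeps its old mask.
-->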
    <item>
      <title>Re: How to set affinity of threads spawned by MKL?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884470#M9924</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/93647"&gt;Dmitry Baksheev (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; &lt;BR /&gt;Hi styc,&lt;BR /&gt;&lt;BR /&gt;The instructions in the MKL User's Guide seem to be incomplete. The code snippet there apparently does not identify the thread correctly: instead of getpid() one should use syscall(SYS_gettid). Another issue is that the OpenMP layer applies affinity in terms of OpenMP threads, while those are dynamically mapped to OS threads. This issue can be worked around by setting the environment variable KMP_AFFINITY=disabled (see &lt;A href="http://www.intel.com/software/products/compilers/docs/flin/main_for/mergedprojects/optaps_for/common/optaps_openmp_thread_affinity.htm"&gt;Thread Affinity Interface&lt;/A&gt;). This may have performance implications, though; I don't know.&lt;BR /&gt;&lt;BR /&gt;In summary, could you try this function for binding the current thread to CPUs?&lt;BR /&gt;&lt;BR /&gt;
&lt;P&gt;// Handle up to 32 cpus&lt;BR /&gt;void bind_me_to(unsigned cpumask)&lt;BR /&gt;{&lt;BR /&gt; cpu_set_t mask;&lt;BR /&gt; pid_t tid = syscall(SYS_gettid);&lt;BR /&gt; int cpuid;&lt;/P&gt;
&lt;P&gt;CPU_ZERO(&amp;amp;mask);&lt;BR /&gt; for (cpuid=0; cpuid &amp;lt; 32; cpuid++)&lt;BR /&gt; {&lt;BR /&gt; if (cpumask &amp;amp; (1u &amp;lt;&amp;lt; cpuid)) CPU_SET(cpuid, &amp;amp;mask);&lt;BR /&gt; }&lt;BR /&gt; sched_setaffinity(tid, sizeof(mask), &amp;amp;mask);&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;This function is assumed to be called in the following setup, if I understood you correctly (ensure the environment variables OMP_DYNAMIC=false and MKL_DYNAMIC=false are set, to allow MKL to use threads in nested parallel regions):&lt;BR /&gt;&lt;BR /&gt;#pragma omp parallel default(shared) num_threads(2)&lt;BR /&gt; {&lt;BR /&gt; int omp_tid = omp_get_thread_num();&lt;BR /&gt; omp_set_nested(1); // nested parallel regions should be enabled&lt;BR /&gt; if (omp_tid==0)&lt;BR /&gt; {&lt;BR /&gt; bind_me_to(0x0f); // four threads on one socket&lt;BR /&gt; omp_set_num_threads(4);&lt;BR /&gt; do_dgemm();&lt;BR /&gt; }&lt;BR /&gt; if (omp_tid==1)&lt;BR /&gt; {&lt;BR /&gt; bind_me_to(0xf0); // four threads on another socket&lt;BR /&gt; omp_set_num_threads(4);&lt;BR /&gt; do_fft();&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;I hope this will help&lt;BR /&gt;Thanks&lt;BR /&gt;Dima&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
It seems that the key is "KMP_AFFINITY=disabled". The program now works as I expected. Thanks for your response!&lt;BR /&gt;</description>
      <pubDate>Mon, 15 Jun 2009 17:37:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-affinity-of-threads-spawned-by-MKL/m-p/884470#M9924</guid>
      <dc:creator>styc</dc:creator>
      <dc:date>2009-06-15T17:37:39Z</dc:date>
    </item>
  </channel>
</rss>

