<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Thank you very much Ying.  in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180951#M29324</link>
    <description>&lt;P&gt;Thank you very much Ying.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 07 Jun 2018 22:21:58 GMT</pubDate>
    <dc:creator>Gheibi__Sanaz</dc:creator>
    <dc:date>2018-06-07T22:21:58Z</dc:date>
    <item>
      <title>getting MKL thread IDs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180946#M29319</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;We have a problem with MKL threads and would really appreciate your help. We are using MKL function calls in the nested parallel region below:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;omp_set_num_threads(NUM_OF_THREADS);
omp_set_nested(1);
omp_set_max_active_levels(2);

#pragma omp parallel num_threads(2)
{
    if (omp_get_thread_num() == 0) {
        // Outer thread 0: request 16 MKL threads for calls made from this thread.
        mkl_set_num_threads_local(16);

        printf("My ID is %d\n", omp_get_thread_num());
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, p, 1, pA, p, pB, n, 0, pC1, n);
    } else {
        // Outer thread 1: same setting, different input and output matrices.
        mkl_set_num_threads_local(16);

        printf("My ID is %d\n", omp_get_thread_num());
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, p, 1, pD, p, pE, n, 0, pC2, n);
    }
}
&lt;/PRE&gt;

&lt;P&gt;Using VTune Amplifier, we can verify that the expected 32 threads are created. However, the output of the print statements is as follows:&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;My ID is 0
My ID is 1
&lt;/PRE&gt;

&lt;P&gt;It seems like we cannot access "mkl" threads using "omp_get_thread_num()". Is there any similar function for accessing thread IDs of mkl threads? Or is there a way to do that? (We need such information for affinity and thread placement decisions).&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thank you very much,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Sanaz&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 May 2018 20:22:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180946#M29319</guid>
      <dc:creator>Gheibi__Sanaz</dc:creator>
      <dc:date>2018-05-08T20:22:14Z</dc:date>
    </item>
    <item>
      <title>Hi Sanaz,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180947#M29320</link>
      <description>&lt;P&gt;Hi Sanaz,&lt;/P&gt;

&lt;P&gt;As I understand it, the lines "My ID is 0" and "My ID is 1" come from the two threads created by &lt;CODE&gt;#pragma omp parallel num_threads(2)&lt;/CODE&gt;, and &lt;CODE&gt;printf("My ID is %d\n", omp_get_thread_num());&lt;/CODE&gt; reflects that.&lt;/P&gt;

&lt;P&gt;But it should be fine to spawn 2 outer OpenMP threads and have each of them spawn 16 MKL threads to run the MKL function. For example, make sure the environment variables OMP_DYNAMIC=false and MKL_DYNAMIC=false are set, to allow MKL threading in nested parallel regions.&lt;/P&gt;

&lt;P&gt;You may refer to the MKL user guide, which has some discussion of this, to the article &lt;A href="https://software.intel.com/en-us/articles/using-threaded-intel-mkl-in-multi-thread-application" target="_blank"&gt;https://software.intel.com/en-us/articles/using-threaded-intel-mkl-in-multi-thread-application&lt;/A&gt;,&lt;BR /&gt;
	and to forum discussions such as &lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/296195" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/296195&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;BR /&gt;
	Ying&lt;/P&gt;</description>
      <pubDate>Wed, 09 May 2018 01:58:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180947#M29320</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2018-05-09T01:58:45Z</dc:date>
    </item>
    <item>
      <title>Thank you very much Ying, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180948#M29321</link>
      <description>&lt;P&gt;Thank you very much Ying,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;The resources were very useful for setting the affinity of MKL threads. However, before doing the binding, we want to know which MKL threads execute each of the cblas_dgemm() calls. For example, with the KMP_AFFINITY=verbose environment variable we can observe that thread #5 is bound to proc set {15}. But that doesn't give us much insight, because we don't know what exactly thread #5 is doing (that is, which of the cblas_dgemm() calls it is executing). We would really appreciate your help with that.&lt;/P&gt;

&lt;P&gt;Best Regards,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Sanaz&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 09 May 2018 17:14:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180948#M29321</guid>
      <dc:creator>Gheibi__Sanaz</dc:creator>
      <dc:date>2018-05-09T17:14:03Z</dc:date>
    </item>
    <item>
      <title>Hi Sanaz,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180949#M29322</link>
      <description>&lt;P&gt;Hi Sanaz,&lt;BR /&gt;
	&lt;BR /&gt;
	Right, you can't know exactly which thread is executing which cblas_dgemm() call, and you can't control every single internal MKL thread in a nested OpenMP environment. But let's come back to the original problem: you expect 2 tasks, each executing on half of your physical CPU cores, to get the best performance.&lt;/P&gt;

&lt;P&gt;As the article mentions, you don't actually need to dig into every single internal MKL thread. The Linux OS and KMP_AFFINITY can do that for you.&lt;/P&gt;

&lt;P&gt;I'm not sure whether you have already done this through the environment, but your code seems to be missing one key call: &lt;STRONG&gt;mkl_set_dynamic(0);&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;After adding it, you may see the expected performance and CPU usage.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;NOTE&lt;/STRONG&gt;&lt;BR /&gt;
	If your application uses OpenMP* threading, you may need to provide additional settings:&lt;BR /&gt;
	• Set the environment variable OMP_NESTED=TRUE, or alternatively call omp_set_nested(1), to enable OpenMP nested parallelism.&lt;BR /&gt;
	• Set the environment variable MKL_DYNAMIC=FALSE, or alternatively call mkl_set_dynamic(0), to prevent Intel MKL from dynamically reducing the number of OpenMP threads in nested parallel regions.&lt;BR /&gt;
	I have attached an example for your reference.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 03:00:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180949#M29322</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2018-05-15T03:00:00Z</dc:date>
    </item>
    <item>
      <title>Attach the file</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180950#M29323</link>
      <description>&lt;P&gt;Attaching the file:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;omp_set_nested(1);
omp_set_max_active_levels(2);
mkl_set_dynamic(0);
#pragma omp parallel num_threads(2)
{
    if (omp_get_thread_num() == 0) {
        mkl_set_num_threads(32);
        printf("My ID is %d \n", omp_get_thread_num());
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, p, 1, A, p, B, n, 0, C1, n);
    } else {
&lt;/PRE&gt;

&lt;P&gt;Thanks&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 03:03:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180950#M29323</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2018-05-15T03:03:46Z</dc:date>
    </item>
    <item>
      <title>Thank you very much Ying. </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180951#M29324</link>
      <description>&lt;P&gt;Thank you very much Ying.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jun 2018 22:21:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/getting-MKL-thread-IDs/m-p/1180951#M29324</guid>
      <dc:creator>Gheibi__Sanaz</dc:creator>
      <dc:date>2018-06-07T22:21:58Z</dc:date>
    </item>
  </channel>
</rss>

