<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Concurrency Problem with Intel MKL BLAS in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830300#M5486</link>
    <description>Yes! My bad, I meant -liomp5.&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;So as I see it:&lt;/DIV&gt;&lt;DIV&gt;1. Threading might help, but not much.&lt;/DIV&gt;&lt;DIV&gt;2. There is no point in adding threads to BLAS Level 2 operations.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Is there any way at all to speed up this code?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;(Unless I move on to GPU computing?)&lt;/DIV&gt;</description>
    <pubDate>Fri, 16 Sep 2011 19:45:55 GMT</pubDate>
    <dc:creator>nunoxic</dc:creator>
    <dc:date>2011-09-16T19:45:55Z</dc:date>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830295#M5481</link>
      <description>&lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=86020"&gt;http://software.intel.com/en-us/forums/showthread.php?t=86020&lt;/A&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I didn't intend to multi-post, but there seems to be no choice since I need input from both MKL experts and VTune experts.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Sep 2011 17:29:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830295#M5481</guid>
      <dc:creator>nunoxic</dc:creator>
      <dc:date>2011-09-14T17:29:12Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830296#M5482</link>
      <description>Please use the following environment settings with libiomp5:&lt;BR /&gt;1) Set &lt;STRONG&gt;KMP_VERSION&lt;/STRONG&gt; to see which OpenMP run-time library version you are using.&lt;BR /&gt;2) Set &lt;STRONG&gt;KMP_AFFINITY=verbose,$KMP_AFFINITY&lt;/STRONG&gt; to see the affinity in use.&lt;BR /&gt;3) Try &lt;STRONG&gt;KMP_AFFINITY=granularity=fine,compact,1,0&lt;/STRONG&gt;; this is the affinity recommended in the MKL documentation when SMT (HT) is enabled.&lt;BR /&gt;4) Experiment with &lt;STRONG&gt;KMP_BLOCKTIME&lt;/STRONG&gt;, which sets the time, in milliseconds, that a thread waits after completing a parallel region before going to sleep (the default is 200 milliseconds).</description>
      <pubDate>Thu, 15 Sep 2011 10:19:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830296#M5482</guid>
      <dc:creator>barragan_villanueva_</dc:creator>
      <dc:date>2011-09-15T10:19:02Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830297#M5483</link>
      <description>Thanks for your input, but none of the above made any difference to the code.&lt;BR /&gt;I played with KMP_BLOCKTIME for an hour or more. I set it to 0, 200, inf and so on, but it led nowhere. Sometimes it sped up execution for a given input data set, but when the data changed, the gain was lost.&lt;BR /&gt;&lt;BR /&gt;What is the difference between linking using -libomp5 and -openmp?&lt;BR /&gt;In my experiments, I found -libomp5 to be much faster than -openmp.&lt;BR /&gt;&lt;BR /&gt;I tried to read up on KMP here:&lt;BR /&gt;&lt;A href="https://community.intel.com/../../sites/products/documentation/studio/composer/en-us/2009/compiler_c/optaps/common/optaps_openmp_thread_affinity.htm"&gt;http://software.intel.com/sites/products/documentation/studio/composer/en-us/2009/compiler_c/optaps/common/optaps_openmp_thread_affinity.htm&lt;/A&gt;&lt;BR /&gt;but it is going over my head. Are KMP and OMP different things, or are they the same thing?</description>
      <pubDate>Thu, 15 Sep 2011 16:32:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830297#M5483</guid>
      <dc:creator>nunoxic</dc:creator>
      <dc:date>2011-09-15T16:32:58Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830298#M5484</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1316154085297="53" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=523892" href="https://community.intel.com/en-us/profile/523892/" class="basic"&gt;nunoxic&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;What is the difference between linking using -libomp5 and -openmp&lt;BR /&gt;From my experiments, I found -libomp5 to be much much faster than -openmp.&lt;BR /&gt;&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;BR /&gt;It's strange :( With the Intel compiler and the mkl_intel_thread library, there should be no difference.&lt;BR /&gt;So, what link command are you using?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2011 06:25:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830298#M5484</guid>
      <dc:creator>barragan_villanueva_</dc:creator>
      <dc:date>2011-09-16T06:25:25Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830299#M5485</link>
      <description>-libomp5 shouldn't work; did you mean -liomp5? The latter is set by ifort -openmp, but you would need to specify the library explicitly if you were using some other command for linking.&lt;BR /&gt;The KMP environment variables are specific to Intel OpenMP, while the OMP ones follow the OpenMP standard.&lt;BR /&gt;One purpose of increasing KMP_BLOCKTIME would be to maintain KMP_AFFINITY settings across a gap of more than 0.2 seconds between OpenMP parallel regions. It's entirely possible that KMP_BLOCKTIME has little effect in normal circumstances.</description>
      <pubDate>Fri, 16 Sep 2011 06:36:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830299#M5485</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2011-09-16T06:36:20Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830300#M5486</link>
      <description>Yes! My bad, I meant -liomp5.&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;So as I see it:&lt;/DIV&gt;&lt;DIV&gt;1. Threading might help, but not much.&lt;/DIV&gt;&lt;DIV&gt;2. There is no point in adding threads to BLAS Level 2 operations.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Is there any way at all to speed up this code?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;(Unless I move on to GPU computing?)&lt;/DIV&gt;</description>
      <pubDate>Fri, 16 Sep 2011 19:45:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830300#M5486</guid>
      <dc:creator>nunoxic</dc:creator>
      <dc:date>2011-09-16T19:45:55Z</dc:date>
    </item>
    <item>
      <title>Concurrency Problem with Intel MKL BLAS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830301#M5487</link>
      <description>If your application doesn't have enough inherent parallelism to benefit from threading, a GPU is not a likely solution. It's true that BLAS level 2 operations, which would normally be vectorized, need to operate on extremely large data sets to benefit from threaded parallelism internal to those operations. Thus it is normal to apply parallelism at a higher level (each thread performing an entire, independent level 2 operation).</description>
      <pubDate>Sat, 17 Sep 2011 15:03:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Concurrency-Problem-with-Intel-MKL-BLAS/m-p/830301#M5487</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2011-09-17T15:03:43Z</dc:date>
    </item>
  </channel>
</rss>

