<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Massive Slowdown in cblas_scal for Intel 2020/2021 in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1294943#M31636</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The oneMKL 2021.3 version is now available to download. Can you please try on the latest version and let us know if the issue still persists?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;Rajesh.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 01 Jul 2021 07:35:04 GMT</pubDate>
    <dc:creator>MRajesh_intel</dc:creator>
    <dc:date>2021-07-01T07:35:04Z</dc:date>
    <item>
      <title>Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248660#M30733</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;We've tracked a severe performance hit in our codes to the function cblas_scal.&amp;nbsp; The efficiency hit shows up starting in Intel MKL 2020 and still occurs with Intel 2021.&amp;nbsp; It seems to occur when calling cblas_cscal from a threaded region and does not seem to occur when calling cblas_cscal from a non-threaded region.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Attached is a test case that we ran on our linux cluster with results for Intel 2018, 2019, 2020, and 2021. &amp;nbsp; We compiled with&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; icc -O2 -qopenmp -mkl cblas_test.cpp&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The basic timing results are (with 16 threads in the thread regions):&lt;/P&gt;
&lt;TABLE border="1"&gt;
&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;MKL version&lt;/TD&gt;&lt;TD&gt;2018.0.03&lt;/TD&gt;&lt;TD&gt;2019.0.4&lt;/TD&gt;&lt;TD&gt;2020.0.4&lt;/TD&gt;&lt;TD&gt;2021.1&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Time (s), non-threaded&lt;/TD&gt;&lt;TD&gt;20.07&lt;/TD&gt;&lt;TD&gt;23.42&lt;/TD&gt;&lt;TD&gt;20.68&lt;/TD&gt;&lt;TD&gt;20.08&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Time (s), threaded&lt;/TD&gt;&lt;TD&gt;1.67&lt;/TD&gt;&lt;TD&gt;1.89&lt;/TD&gt;&lt;TD&gt;38.54&lt;/TD&gt;&lt;TD&gt;38.30&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;Note the catastrophic slowdown in 2020/2021 when cblas_cscal is called from a threaded region: it is even slower than the non-threaded loop and more than 20 times slower than the corresponding threaded times in 2018/2019.&lt;/P&gt;
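&lt;P&gt;As a rough check of those ratios, here is a small Python sketch that recomputes them from the timings above. It is not part of the attached C++ test case; the dictionary layout and function names are made up for illustration.&lt;/P&gt;

```python
# Timings in seconds, copied from the results in this post (16 threads).
# The structure and names below are illustrative, not from the test case.
TIMINGS = {
    "2018.0.03": {"serial": 20.07, "threaded": 1.67},
    "2019.0.4": {"serial": 23.42, "threaded": 1.89},
    "2020.0.4": {"serial": 20.68, "threaded": 38.54},
    "2021.1": {"serial": 20.08, "threaded": 38.30},
}


def threaded_speedup(version):
    """Serial time divided by threaded time; below 1.0 means threading hurt."""
    entry = TIMINGS[version]
    return entry["serial"] / entry["threaded"]


def threaded_slowdown(version, baseline):
    """How many times slower the threaded run is than the baseline's threaded run."""
    return TIMINGS[version]["threaded"] / TIMINGS[baseline]["threaded"]


if __name__ == "__main__":
    for version in TIMINGS:
        print(f"MKL {version}: threaded speedup {threaded_speedup(version):.2f}x")
    for version in ("2020.0.4", "2021.1"):
        for baseline in ("2018.0.03", "2019.0.4"):
            ratio = threaded_slowdown(version, baseline)
            print(f"{version} threaded vs {baseline} threaded: {ratio:.1f}x slower")
```

&lt;P&gt;With these numbers the threaded speedup falls below 1.0 for 2020.0.4 and 2021.1, and their threaded runs come out roughly 20 to 23 times slower than the 2018/2019 threaded runs.&lt;/P&gt;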
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2021 16:40:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248660#M30733</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-01-21T16:40:55Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248707#M30734</link>
      <description>&lt;P&gt;The test case in my original post had some errors.&amp;nbsp; The single-precision complex version, cblas_cscal, was being called instead of the double-precision real version.&amp;nbsp; I've attached a corrected test case compiled with:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; icc -qopenmp -O2 -mkl&amp;nbsp; cblas_test.cpp scale.cpp&lt;/P&gt;
&lt;P&gt;The efficiency issue still remains.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also tested cblas_dcopy and I see the same type of issue.&amp;nbsp; For 2018/2019, dcopy in the threaded loops shows significant improvement over the non-threaded loops.&amp;nbsp; However, for 2020/2021 calling dcopy in the threaded loops is twice as slow as calling dcopy from the non-threaded loops.&amp;nbsp; If you replace dcopy with direct copies, then no slowdown is observed.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2021 19:36:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248707#M30734</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-01-21T19:36:19Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248821#M30736</link>
      <description>&lt;P&gt;Here are timings (seconds) on an Intel NUC with an i7-10710U processor (6 cores, 12 threads) running Windows 10-64.&lt;/P&gt;
&lt;TABLE border="1" width="100%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;2020.0.4&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;2021.1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;Non-Threaded&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;13.68&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;12.87&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;Threaded&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;14.79&lt;/TD&gt;
&lt;TD width="33.333333333333336%" height="23px"&gt;13.92&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Fri, 22 Jan 2021 02:54:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248821#M30736</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2021-01-22T02:54:38Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248947#M30739</link>
      <description>&lt;P&gt;Hi mecej4,&lt;/P&gt;
&lt;P&gt;The threaded timings are not nearly as poor as on our linux cluster. However, they indicate no improvement by threading.&amp;nbsp; Is it possible that you can run data for Intel 2018 and Intel 2019 to verify that the threading produces efficient results?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The results I see indicate that there is some (major) threading issue that crept into MKL somewhere in the 2020 release (not sure which exact update).&lt;/P&gt;
&lt;P&gt;I've also been able to verify that the same issue arises with the la_getrs function.&amp;nbsp; I'm guessing this means it probably affects many MKL functions.&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jan 2021 12:30:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248947#M30739</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-01-22T12:30:22Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248952#M30740</link>
      <description>&lt;P&gt;Here are timings if I replace the MKL cblas_dscal call with a plain loop&lt;/P&gt;
&lt;TABLE border="1"&gt;
&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;MKL version&lt;/TD&gt;&lt;TD&gt;2018.0.03&lt;/TD&gt;&lt;TD&gt;2019.0.4&lt;/TD&gt;&lt;TD&gt;2020.0.4&lt;/TD&gt;&lt;TD&gt;2021.1&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Time (s), non-threaded&lt;/TD&gt;&lt;TD&gt;9.21&lt;/TD&gt;&lt;TD&gt;8.50&lt;/TD&gt;&lt;TD&gt;8.49&lt;/TD&gt;&lt;TD&gt;8.49&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;Time (s), threaded&lt;/TD&gt;&lt;TD&gt;0.89&lt;/TD&gt;&lt;TD&gt;0.79&lt;/TD&gt;&lt;TD&gt;0.81&lt;/TD&gt;&lt;TD&gt;0.82&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;In this case no MKL routine is involved at all, and the plain OpenMP loop scales well with every version. This helps show that the threading-efficiency problem lies in MKL 2020 and 2021 and was not present in 2019 and before.&lt;/P&gt;
&lt;P&gt;For such simple OpenMP code, there is no reason that the MKL 2020 and 2021 BLAS functions shouldn't have much better threading efficiency.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jan 2021 12:41:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248952#M30740</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-01-22T12:41:20Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248956#M30741</link>
      <description>&lt;P&gt;John_Young:&lt;/P&gt;
&lt;P&gt;The only older Parallel Studio version that I have installed on this rather new Intel NUC is 2013 SP1, dated circa 2014 -- I felt that the 2017-2019 versions were not worth copying and reinstalling from a now-retired PC, but the 2013 SP1 was needed to support some other software that I use.&lt;/P&gt;
&lt;P&gt;I changed one line in your program:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;std::cout &amp;lt;&amp;lt; std::string(buf) &amp;lt;&amp;lt; "\n";&lt;/LI-CODE&gt;
&lt;P&gt;to&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;std::cout &amp;lt;&amp;lt; buf &amp;lt;&amp;lt; "\n";&lt;/LI-CODE&gt;
&lt;P&gt;in order to please the older Intel C compiler, but that should not affect the timings.&lt;/P&gt;
&lt;P&gt;Here are the results:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;S:\LANG\MKL&amp;gt;cblas_test
Intel(R) Math Kernel Library Version 11.1.4 Product Build 20140806 for Intel(R) 64 architecture applications
OpenMP procs/maxThreads = 12 / 12
TIME for Non-Threaded: 10.326
TIME for Threaded: 2.42484
DONE&lt;/LI-CODE&gt;
&lt;P&gt;These results strongly reinforce your findings on Linux.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jan 2021 12:57:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248956#M30741</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2021-01-22T12:57:27Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_dscal/cblas_dcopy for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248963#M30742</link>
      <description>&lt;P&gt;Thanks for checking an earlier version.&lt;/P&gt;
&lt;P&gt;Could this issue be escalated to the development team?&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jan 2021 13:48:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1248963#M30742</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-01-22T13:48:36Z</dc:date>
    </item>
    <item>
      <title>Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1249633#M30755</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for reporting this issue. We are forwarding this query to the MKL experts. They will get back to you.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 25 Jan 2021 11:52:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1249633#M30755</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2021-01-25T11:52:56Z</dc:date>
    </item>
    <item>
      <title>Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1284250#M31364</link>
      <description>&lt;P&gt;I tested on both Windows and Linux systems.  I was able to reproduce the issue on the Windows system, but not on the Linux system:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;TABLE class="ql-table-blob" border="0" style="width: 270pt;" width="359"&gt;
 &lt;COLGROUP&gt;&lt;COL width="185" style="mso-width-source:userset;mso-width-alt:6469;width:139pt" /&gt;
 &lt;COL width="85" style="mso-width-source:userset;mso-width-alt:2978;width:64pt" /&gt;
 &lt;COL width="89" style="mso-width-source:userset;mso-width-alt:3118;width:67pt" /&gt;
 &lt;/COLGROUP&gt;&lt;TBODY&gt;&lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" class="xl63" width="185" style="height:14.5pt;width:139pt"&gt;MKL
  Version&lt;/TD&gt;
  &lt;TD width="85" style="width:64pt"&gt;2020.0.4&lt;/TD&gt;
  &lt;TD width="89" style="width:67pt"&gt;2021.2.0&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" class="xl63" style="height:14.5pt"&gt;Windows 10&amp;nbsp;&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" style="height:14.5pt"&gt;Non-Threaded&lt;/TD&gt;
  &lt;TD align="right"&gt;17.65&lt;/TD&gt;
  &lt;TD align="right"&gt;16.2&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" style="height:14.5pt"&gt;Threaded&lt;/TD&gt;
  &lt;TD align="right"&gt;19.97&lt;/TD&gt;
  &lt;TD align="right"&gt;18.61&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" style="height:14.5pt"&gt;&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" class="xl63" style="height:14.5pt"&gt;Linux Ubuntu 18.4 LTS&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
  &lt;TD&gt;&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" style="height:14.5pt"&gt;Non-Threaded&lt;/TD&gt;
  &lt;TD align="right"&gt;14.52&lt;/TD&gt;
  &lt;TD align="right"&gt;14.62&lt;/TD&gt;
 &lt;/TR&gt;
 &lt;TR height="19" style="height:14.5pt"&gt;
  &lt;TD height="19" style="height:14.5pt"&gt;Threaded&lt;/TD&gt;
  &lt;TD align="right"&gt;13.79&lt;/TD&gt;
  &lt;TD align="right"&gt;13.52&lt;/TD&gt;
 &lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;I was not able to get access to a cluster to test.  I only tested on single-processor systems.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 24 May 2021 23:55:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1284250#M31364</guid>
      <dc:creator>Khang_N_Intel</dc:creator>
      <dc:date>2021-05-24T23:55:45Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286204#M31418</link>
      <description>&lt;P&gt;Hi Khang,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I apologize that I could not reply sooner. Thank you for looking at this problem.&amp;nbsp; This is still a major issue for our codes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Without knowing how many threads you were using, I cannot say 100%, but I would say that both your Windows and Linux systems exhibit the issue.&amp;nbsp; For example, if you were using two threads on Linux, then your threaded timing should have dropped to around 8 seconds (if 4 threads, then you should see timings around 4 seconds).&amp;nbsp; Even though you saw a small speedup on Linux instead of a slowdown, the parallel efficiency is terrible.&amp;nbsp;&lt;/P&gt;
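&lt;P&gt;To make that estimate concrete, here is a minimal Python sketch of the arithmetic. The thread counts of 2 and 4 are assumptions, since the actual count was not reported, and the function names are only illustrative.&lt;/P&gt;

```python
def ideal_threaded_time(serial_seconds, num_threads):
    """Threaded time expected under perfect scaling."""
    return serial_seconds / num_threads


def parallel_efficiency(serial_seconds, threaded_seconds, num_threads):
    """Observed speedup divided by thread count (1.0 means perfect scaling)."""
    speedup = serial_seconds / threaded_seconds
    return speedup / num_threads


if __name__ == "__main__":
    # Linux timings for MKL 2021.2.0 from the table in the previous post.
    serial, threaded = 14.62, 13.52
    # The real thread count was not reported; 2 and 4 are assumptions.
    for num_threads in (2, 4):
        ideal = ideal_threaded_time(serial, num_threads)
        eff = parallel_efficiency(serial, threaded, num_threads)
        print(f"{num_threads} threads: ideal {ideal:.1f} s, "
              f"measured {threaded:.2f} s, efficiency {eff:.0%}")
```

&lt;P&gt;Under those assumed thread counts, the measured Linux timings correspond to a parallel efficiency of about 54% with 2 threads and about 27% with 4 threads.&lt;/P&gt;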
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are able, please try to run the same simulation using Intel MKL 2019.&amp;nbsp; I think you would see the threaded timings drop significantly.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you,&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;</description>
      <pubDate>Tue, 01 Jun 2021 16:37:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286204#M31418</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-06-01T16:37:47Z</dc:date>
    </item>
    <item>
      <title>Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286227#M31419</link>
      <description>&lt;P&gt;Hi John,&lt;/P&gt;&lt;P&gt;The issue will be addressed in the upcoming release of oneMKL, 2021.3.&lt;/P&gt;&lt;P&gt;Khang&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 01 Jun 2021 18:02:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286227#M31419</guid>
      <dc:creator>Khang_N_Intel</dc:creator>
      <dc:date>2021-06-01T18:02:31Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286228#M31420</link>
      <description>&lt;P&gt;Khang,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That is good news. Thanks for letting us know.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Jun 2021 18:03:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1286228#M31420</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-06-01T18:03:53Z</dc:date>
    </item>
    <item>
      <title>Re: Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1294943#M31636</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The oneMKL 2021.3 version is now available to download. Can you please try on the latest version and let us know if the issue still persists?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;Rajesh.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Jul 2021 07:35:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1294943#M31636</guid>
      <dc:creator>MRajesh_intel</dc:creator>
      <dc:date>2021-07-01T07:35:04Z</dc:date>
    </item>
    <item>
      <title>Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1296643#M31694</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Can you please provide an update regarding the issue?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Rajesh.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 07 Jul 2021 12:38:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1296643#M31694</guid>
      <dc:creator>MRajesh_intel</dc:creator>
      <dc:date>2021-07-07T12:38:31Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1296645#M31695</link>
      <description>&lt;P&gt;Hi Rajesh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We have put in a request, but we are still waiting on our system administrators to install the latest Intel libraries. &amp;nbsp;&amp;nbsp; As soon as the libraries are installed, I'll verify the issue is fixed.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;Best,&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;</description>
      <pubDate>Wed, 07 Jul 2021 12:43:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1296645#M31695</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-07-07T12:43:28Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297809#M31736</link>
      <description>&lt;P&gt;Rajesh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The issue seems to be fixed in 2021.3.&amp;nbsp; Thanks for your help.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Best,&lt;/P&gt;
&lt;P&gt;John&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jul 2021 12:48:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297809#M31736</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-07-12T12:48:05Z</dc:date>
    </item>
    <item>
      <title>Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297814#M31738</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for the confirmation!&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;As this issue has been resolved, we will no longer respond to this thread. If you require any additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Have a good day.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Rajesh&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Jul 2021 13:10:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297814#M31738</guid>
      <dc:creator>MRajesh_intel</dc:creator>
      <dc:date>2021-07-12T13:10:17Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Massive Slowdown in cblas_scal for Intel 2020/2021</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297833#M31742</link>
      <description>&lt;P&gt;Here are the final timings (in seconds) with the fixed code:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;cblas_dscal&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;MKL Version : 2018.0.3&amp;nbsp; 2019.0.4&amp;nbsp; 2020.0.4&amp;nbsp; 2021.1&amp;nbsp;&amp;nbsp; 2021.3&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Non-Threaded: 19.59&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; 22.33&amp;nbsp; &amp;nbsp;&amp;nbsp; 17.84&amp;nbsp;&amp;nbsp;&amp;nbsp; 18.38 &amp;nbsp; 17.75&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Threaded&amp;nbsp;&amp;nbsp;&amp;nbsp; :&amp;nbsp; 1.60 &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp; 2.09 &amp;nbsp; &amp;nbsp; 37.56 &amp;nbsp;&amp;nbsp; 37.97&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.52&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;cblas_dcopy&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;MKL Version : 2018.0.3&amp;nbsp; 2019.0.4&amp;nbsp; 2020.0.4&amp;nbsp; 2021.1&amp;nbsp;&amp;nbsp; 2021.3&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Non-Threaded: 20.55 &amp;nbsp; &amp;nbsp;&amp;nbsp; 24.77&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 23.26 &amp;nbsp; &amp;nbsp; 22.82&amp;nbsp;&amp;nbsp;&amp;nbsp; 22.75&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Threaded&amp;nbsp;&amp;nbsp;&amp;nbsp; :&amp;nbsp; 1.81&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; 2.27&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 42.54&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 42.22&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.91&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jul 2021 14:05:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Massive-Slowdown-in-cblas-scal-for-Intel-2020-2021/m-p/1297833#M31742</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2021-07-12T14:05:27Z</dc:date>
    </item>
  </channel>
</rss>

