<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance drop when upgraded from MKL 2017 to MKL 2025 in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705504#M37272</link>
    <description>&lt;P&gt;I've had a similar experience myself. In my tests, I saved 100 matrices of size 500x500 and measured how long it took to invert them (calling dgetrf followed by dgetri). The older version (2015) consistently completed in about 1 second, while the newer version (2025) was slower and less consistent, with times varying between 1 and 3 seconds.&lt;/P&gt;</description>
    <pubDate>Fri, 25 Jul 2025 13:15:33 GMT</pubDate>
    <dc:creator>Yang76</dc:creator>
    <dc:date>2025-07-25T13:15:33Z</dc:date>
    <item>
      <title>Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1700888#M37226</link>
      <description>&lt;P&gt;I recently upgraded a multithreaded computational library to use MKL 2025.0 instead of version 2017.3.210. Unfortunately, I see a notable performance drop in overall computations. We use a lot of BLAS and VML functions on large float arrays.&lt;/P&gt;&lt;P&gt;We run on Windows Server on both AVX-512 and AVX2 Intel Xeon processors. The processors have 2 NUMA nodes. We usually create as many threads as there are cores (we tried both on one NUMA node and on the 2 NUMA nodes by modifying the thread affinity). Please note that we use sequential MKL by linking with the sequential DLL.&lt;/P&gt;&lt;P&gt;I cannot find the change in the MKL releases that might cause this behavior change. Could you please advise?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Jul 2025 15:33:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1700888#M37226</guid>
      <dc:creator>hatdal</dc:creator>
      <dc:date>2025-07-01T15:33:30Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1700906#M37227</link>
      <description>&lt;P&gt;Could you please share sample code that reproduces the performance drop?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Jul 2025 17:19:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1700906#M37227</guid>
      <dc:creator>Fengrui</dc:creator>
      <dc:date>2025-07-01T17:19:16Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1701063#M37229</link>
      <description>&lt;P&gt;Hi Fengrui,&lt;/P&gt;&lt;P&gt;Our library is a huge C++ library that performs many market-finance computations (prices, indicators, and so on) across many steps. Honestly, I have not yet found exactly where the performance drop occurs, even though I tried using Intel tools; it seems to be spread across all computation steps. However, I noticed that performance improves when I decrease the number of threads. Of course, I do not call mkl_set_num_threads or any other MKL service routine, since I use the sequential MKL, and the MKL traces show that the instruction sets (AVX, etc.) are properly detected. The only difference I noticed between the 2017 and 2025 versions is that, for VML functions, MKL 2017 loads mkl_vml_def.dll whereas MKL 2025 loads mkl_vml_avx512.dll (I am on an AVX-512 processor). One last thing: we do a lot of 128-byte-aligned memory allocation and deallocation using mkl_malloc. All ideas for investigation are welcome. Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 02 Jul 2025 08:31:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1701063#M37229</guid>
      <dc:creator>hatdal</dc:creator>
      <dc:date>2025-07-02T08:31:01Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1701138#M37234</link>
      <description>&lt;P&gt;I recommend turning on the verbose mode of oneMKL, that is, running with the environment variable MKL_VERBOSE=1, to see whether there is a noticeable performance drop in the BLAS functions. VML functions don't support it, though. For those, you may need to write test code that uses real-case data.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Jul 2025 18:04:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1701138#M37234</guid>
      <dc:creator>Fengrui</dc:creator>
      <dc:date>2025-07-02T18:04:59Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1704820#M37266</link>
      <description>&lt;P&gt;This looks related:&amp;nbsp;&lt;A href="https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-in-BLAS-dot-product-function-in-MKL-2025/m-p/1704817" target="_blank"&gt;https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-in-BLAS-dot-product-function-in-MKL-2025/m-p/1704817&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 11:29:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1704820#M37266</guid>
      <dc:creator>mahalex</dc:creator>
      <dc:date>2025-07-22T11:29:24Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705504#M37272</link>
      <description>&lt;P&gt;I've had a similar experience myself. In my tests, I saved 100 matrices of size 500x500 and measured how long it took to invert them (calling dgetrf followed by dgetri). The older version (2015) consistently completed in about 1 second, while the newer version (2025) was slower and less consistent, with times varying between 1 and 3 seconds.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 13:15:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705504#M37272</guid>
      <dc:creator>Yang76</dc:creator>
      <dc:date>2025-07-25T13:15:33Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705515#M37273</link>
      <description>&lt;P&gt;Even more concerning is the total processor time, which was just 2 seconds in the old version but almost always exceeds 10 seconds in the new version.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 14:17:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705515#M37273</guid>
      <dc:creator>Yang76</dc:creator>
      <dc:date>2025-07-25T14:17:08Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705530#M37274</link>
      <description>&lt;P&gt;I found that, if I disable OMP multithreading by setting the environment variable OMP_NUM_THREADS to 1, the new version becomes consistent and performs slightly better than the old version.&lt;/P&gt;&lt;P&gt;Now, the question is: how can I determine when to enable or disable OMP multithreading?&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 15:35:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705530#M37274</guid>
      <dc:creator>Yang76</dc:creator>
      <dc:date>2025-07-25T15:35:05Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705552#M37276</link>
      <description>&lt;P&gt;You need to run any benchmarks twice and only use the timing of the second run. There is significant overhead in setting up OpenMP threading the first time.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 17:57:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705552#M37276</guid>
      <dc:creator>AndrewC2</dc:creator>
      <dc:date>2025-07-25T17:57:39Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705567#M37277</link>
      <description>&lt;P&gt;This is exactly what I did. All the results I reported are from the second run, after inverting some random matrices.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After some experiments and with some help from ChatGPT, I think I have a better understanding of the situation. There are two main factors at play here:&lt;/P&gt;&lt;P&gt;1. Matrix size. A 500 x 500 matrix is actually quite small to fully exploit parallelization.&lt;/P&gt;&lt;P&gt;2. Default number of threads.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am on a virtual machine with 4 sockets/8 virtual processors. The new version uses 8 threads by default (as confirmed by MKL verbose mode), whereas the old version seems to use only 2 threads (inferred from the ratio of processor time to clock time). Using fewer threads actually helps when the matrices are small.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I set the number of threads explicitly, or run the same experiment on a physical computer with 1 socket/10 cores, the new version performs consistently and is slightly better.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 19:45:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705567#M37277</guid>
      <dc:creator>Yang76</dc:creator>
      <dc:date>2025-07-25T19:45:53Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705584#M37278</link>
      <description>&lt;P&gt;OK, well, that makes sense. VMs are tricky beasts to run benchmarks on.&lt;BR /&gt;So the summary is that you did not find any performance regressions with the new version of MKL, which agrees with my experience as well.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jul 2025 22:11:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705584#M37278</guid>
      <dc:creator>AndrewC2</dc:creator>
      <dc:date>2025-07-25T22:11:22Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705895#M37279</link>
      <description>&lt;P&gt;Hi all, this topic is about sequential MKL, meaning my app does its own multithreading and I do not use MKL's OpenMP threading. I manage the threads myself.&lt;/P&gt;&lt;P&gt;In this case, disabling OMP multithreading probably has no effect. But maybe I’m wrong.&lt;/P&gt;&lt;P&gt;I used Intel VTune to compare against the old MKL's performance and find where the regressions are, but they seem to be spread across all calculations.&lt;/P&gt;&lt;P&gt;There may also be an issue with mkl_malloc, since it locks.&lt;/P&gt;&lt;P&gt;One other thing: I have hyperthreading enabled. Should that have such a huge impact on performance?&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/250759"&gt;@Fengrui&lt;/a&gt;&amp;nbsp;do you have any advice?&lt;/P&gt;&lt;P&gt;As a reminder, the only change I made to my computational library is moving from MKL 2017 to MKL 2025.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jul 2025 09:08:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1705895#M37279</guid>
      <dc:creator>hatdal</dc:creator>
      <dc:date>2025-07-28T09:08:32Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1707362#M37283</link>
      <description>&lt;P&gt;I believe MKL is multithreaded by default. For MKL 2025, you can set the MKL_VERBOSE environment variable to 1 and see the detailed info in the console output (this feature is not available in MKL 2017).&lt;/P&gt;</description>
      <pubDate>Mon, 04 Aug 2025 17:02:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1707362#M37283</guid>
      <dc:creator>Yang76</dc:creator>
      <dc:date>2025-08-04T17:02:20Z</dc:date>
    </item>
    <item>
      <title>Re: Performance drop when upgraded from MKL 2017 to MKL 2025</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1707504#M37284</link>
      <description>&lt;P&gt;I link my C++ software with sequential MKL, which means MKL is not threaded.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Aug 2025 08:54:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-drop-when-upgraded-from-MKL-2017-to-MKL-2025/m-p/1707504#M37284</guid>
      <dc:creator>hatdal</dc:creator>
      <dc:date>2025-08-05T08:54:52Z</dc:date>
    </item>
  </channel>
</rss>

