<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1543234#M35431</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Thanks for posting in Intel Communities.&lt;/P&gt;&lt;P&gt;Could you please share the performance statistics you compared across the hardware you mentioned? We would also like a sample reproducer so that we can check the behavior on our end. Thank you.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Jilani&lt;/P&gt;</description>
    <pubDate>Mon, 13 Nov 2023 10:35:42 GMT</pubDate>
    <dc:creator>JilaniS_Intel</dc:creator>
    <dc:date>2023-11-13T10:35:42Z</dc:date>
    <item>
      <title>MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1542630#M35425</link>
      <description>&lt;P&gt;I have a simple program that performs DCT-II and real-to-complex FFT transforms on video inputs.&lt;/P&gt;&lt;P&gt;It is currently linked single-threaded: I found that it ran slower with OpenMP FFT multi-threading than without, and the severe restrictions on multi-threading for FFT added unforeseen complications. It was also impossible to multi-thread the DCT-II transforms (via the FFTW3 wrapper).&lt;/P&gt;&lt;P&gt;The transform lengths correspond to typical video widths and heights, for example 1920 and 1080.&lt;/P&gt;&lt;P&gt;Currently the program takes twice as long on modern AWS Xeon instances with AVX-512 as on my MacBook M2 Pro in Rosetta emulation mode with SSE4.2, and it runs 20% slower than on my venerable 12-year-old Intel i5-2500k processor with AVX.&lt;/P&gt;&lt;P&gt;Does anyone have advice to offer on this situation?&lt;/P&gt;&lt;P&gt;It must be partly caused by the clock penalty in these multi-core Xeon processors, and I guess the next step would be to add multi-threading in my own program, in the layer above the transforms.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Nov 2023 08:56:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1542630#M35425</guid>
      <dc:creator>klillevold</dc:creator>
      <dc:date>2023-11-10T08:56:06Z</dc:date>
    </item>
    <item>
      <title>Re: MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1543234#M35431</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Thanks for posting in Intel Communities.&lt;/P&gt;&lt;P&gt;Could you please share the performance statistics you compared across the hardware you mentioned? We would also like a sample reproducer so that we can check the behavior on our end. Thank you.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Jilani&lt;/P&gt;</description>
      <pubDate>Mon, 13 Nov 2023 10:35:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1543234#M35431</guid>
      <dc:creator>JilaniS_Intel</dc:creator>
      <dc:date>2023-11-13T10:35:42Z</dc:date>
    </item>
    <item>
      <title>Re: MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1544216#M35448</link>
      <description>&lt;P&gt;Thank you for your reply. I am working on multi-threading my application in the layer above the Intel MKL functions.&lt;/P&gt;&lt;P&gt;After further consideration, I think the performance numbers are as expected, although they were surprising at first. The MKL functions run very fast and appear to be extremely well optimized.&lt;/P&gt;&lt;P&gt;My old but still going strong i5-2500k runs at 4.5 GHz, while the AWS instances run at a much lower clock rate, so single-threaded performance suffers a significant penalty there.&lt;/P&gt;&lt;P&gt;I have one question:&lt;/P&gt;&lt;P&gt;When MKL reports "Intel(R) architecture processors" for AMD processors, which instruction set is being used? The message is the same on AWS (AMD Epyc) and on my personal AMD Ryzen 5.&lt;/P&gt;&lt;P&gt;The application will be running on c7i.8xlarge (Intel Xeon). I will measure more carefully once the multi-threading is complete.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 16:40:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1544216#M35448</guid>
      <dc:creator>klillevold</dc:creator>
      <dc:date>2023-11-15T16:40:17Z</dc:date>
    </item>
    <item>
      <title>Re: MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1544920#M35451</link>
      <description>&lt;P&gt;I have finished multi-threading my app. It now spawns threads in the layer above the FFT (via MKL) and the DCT-II (via the FFTW3 wrapper), which enables it to work for non-power-of-2 transform sizes as well as float precision. I am seeing great threading performance up to around 8 threads, with marginal gains up to 16.&lt;/P&gt;&lt;P&gt;Overall, the performance is great on Xeon AWS instances (as well as on my personal computers, both older Intel and newer AMD processors). I am still curious which instruction set MKL decides to use under the hood for AMD ("Intel(R) architecture processors"), but it doesn't really matter: performance is great either way.&lt;/P&gt;&lt;P&gt;This thread can be considered resolved.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Nov 2023 11:36:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1544920#M35451</guid>
      <dc:creator>klillevold</dc:creator>
      <dc:date>2023-11-17T11:36:41Z</dc:date>
    </item>
    <item>
      <title>Re: MKL's FFT runs slower on modern AWS Xeon instance than 12 year old i5-2500k</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1547054#M35490</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We're glad to hear that the issue is resolved. If you have any further questions or concerns in the future, please start a new thread. We will be happy to help. Thank you.&lt;/P&gt;&lt;P&gt;Have a great day.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Jilani&lt;/P&gt;</description>
      <pubDate>Fri, 24 Nov 2023 06:04:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-s-FFT-runs-slower-on-modern-AWS-Xeon-instance-than-12-year/m-p/1547054#M35490</guid>
      <dc:creator>JilaniS_Intel</dc:creator>
      <dc:date>2023-11-24T06:04:53Z</dc:date>
    </item>
  </channel>
</rss>

