<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to configure intel MKL FFT for best performance? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1500235#M34716</link>
    <description>&lt;P&gt;Hi Harsh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for helping us improve our products! We’ve submitted the feature request to the dev team. They will consider it based on multiple factors including, but not limited to, the priority and criticality of the feature. Once it is included in an upcoming release, it will be documented in the &lt;A href="https://www.intel.com/content/www/us/en/developer/articles/release-notes/onemkl-release-notes.html" target="_self"&gt;release notes&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks and Regards,&lt;/P&gt;
&lt;P&gt;Praneeth Achanta&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 29 Jun 2023 05:21:35 GMT</pubDate>
    <dc:creator>PraneethA_Intel</dc:creator>
    <dc:date>2023-06-29T05:21:35Z</dc:date>
    <item>
      <title>How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1480305#M34517</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to perform a 2D FFT on an Intel i5-6500 CPU @ 3.20&amp;nbsp;GHz.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have written a program to benchmark FFTW against Intel MKL. Surprisingly, FFTW outperformed the Intel MKL FFT by a factor of about 2.3; that is, the Intel MKL time is about 2.3 times the FFTW time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here are the results:&lt;/P&gt;
&lt;PRE&gt;2023-04-26T11:01:31+05:30&lt;BR /&gt;Running FFTWCompare.exe&lt;BR /&gt;Run on (4 X 3192 MHz CPU s)&lt;BR /&gt;CPU Caches:&lt;BR /&gt;L1 Data 32 KiB (x4)&lt;BR /&gt;L1 Instruction 32 KiB (x4)&lt;BR /&gt;L2 Unified 256 KiB (x4)&lt;BR /&gt;L3 Unified 6144 KiB (x1)&lt;BR /&gt;------------------------------------------------------&lt;BR /&gt;Benchmark Time CPU Iterations&lt;BR /&gt;------------------------------------------------------&lt;BR /&gt;BM_IntelFFT 915017 ns 920348 ns 747&lt;BR /&gt;BM_FFTW 390155 ns 392369 ns 1792&lt;/PRE&gt;
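&lt;P&gt;For reference, the two wall-clock figures above can be turned into a ratio with a small helper. This is only an illustrative sketch: the function name speedup is hypothetical and is not part of Google Benchmark, FFTW, or Intel MKL.&lt;/P&gt;
<![CDATA[
```cpp
#include <cassert>
#include <cmath>

// Ratio of two per-iteration timings in the same unit.
// Values above 1 mean the candidate is faster than the baseline.
// (Hypothetical helper, for illustration only.)
double speedup(double baseline_ns, double candidate_ns) {
    return baseline_ns / candidate_ns;
}
```
]]>
&lt;P&gt;With the timings reported above, speedup(915017.0, 390155.0) comes to roughly 2.35, so FFTW is a little more than twice as fast in this run.&lt;/P&gt;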
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The CMake File:&lt;/P&gt;
&lt;PRE&gt;&lt;SPAN&gt;cmake_minimum_required&lt;/SPAN&gt;(&lt;SPAN&gt;VERSION &lt;/SPAN&gt;3.25)&lt;BR /&gt;&lt;SPAN&gt;project&lt;/SPAN&gt;(Benchmarking)&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(CMAKE_CXX_STANDARD 23)&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(CMAKE_BUILD_TYPE RELEASE)&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;find_package&lt;/SPAN&gt;(benchmark &lt;SPAN&gt;REQUIRED&lt;/SPAN&gt;)&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(CMAKE_C_FLAGS "-Ofast -march=native")&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(CMAKE_CXX_FLAGS "-Ofast -march=native")&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(FFTW C:/Users/IIAP-IPC/Documents/fftw-3)&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(FFTWLINK &lt;SPAN&gt;${&lt;/SPAN&gt;FFTW&lt;SPAN&gt;}&lt;/SPAN&gt;/libfftw3l-3.lib &lt;SPAN&gt;${&lt;/SPAN&gt;FFTW&lt;SPAN&gt;}&lt;/SPAN&gt;/libfftw3f-3.lib &lt;SPAN&gt;${&lt;/SPAN&gt;FFTW&lt;SPAN&gt;}&lt;/SPAN&gt;/libfftw3-3.lib)&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(INTEL_FFTWLINK C:/PROGRA~2/Intel/oneAPI/mkl/2023.1.0/lib/intel64)&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(INTEL_FFTW_INCLUDE C:/PROGRA~2/Intel/oneAPI/mkl/2023.1.0/include)&lt;BR /&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;(INTEL_FFTWLINKLIB &lt;SPAN&gt;${&lt;/SPAN&gt;INTEL_FFTWLINK&lt;SPAN&gt;}&lt;/SPAN&gt;/mkl_intel_lp64.lib &lt;SPAN&gt;${&lt;/SPAN&gt;INTEL_FFTWLINK&lt;SPAN&gt;}&lt;/SPAN&gt;/mkl_core.lib &lt;SPAN&gt;${&lt;/SPAN&gt;INTEL_FFTWLINK&lt;SPAN&gt;}&lt;/SPAN&gt;/mkl_sequential.lib)&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;add_executable&lt;/SPAN&gt;(FFTWCompare main.cpp)&lt;BR /&gt;&lt;SPAN&gt;target_include_directories&lt;/SPAN&gt;(FFTWCompare &lt;SPAN&gt;PRIVATE ${&lt;/SPAN&gt;INTEL_FFTW_INCLUDE&lt;SPAN&gt;} ${&lt;/SPAN&gt;FFTW&lt;SPAN&gt;}&lt;/SPAN&gt;)&lt;BR /&gt;&lt;SPAN&gt;target_link_libraries&lt;/SPAN&gt;(FFTWCompare &lt;SPAN&gt;PRIVATE ${&lt;/SPAN&gt;INTEL_FFTWLINKLIB&lt;SPAN&gt;} ${&lt;/SPAN&gt;FFTWLINK&lt;SPAN&gt;} &lt;/SPAN&gt;benchmark::benchmark)&lt;BR /&gt;&lt;BR /&gt;&lt;/PRE&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;The C++ Code:&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;PRE&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;iostream&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;fstream&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;mkl.h&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;complex&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;chrono&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;fftw3.h&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#include &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;benchmark/benchmark.h&amp;gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#define &lt;/SPAN&gt;&lt;SPAN&gt;NN &lt;/SPAN&gt;&lt;SPAN&gt;256&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;#define &lt;/SPAN&gt;&lt;SPAN&gt;NPIXFFT NN &lt;/SPAN&gt;* (&lt;SPAN&gt;1 &lt;/SPAN&gt;+ (&lt;SPAN&gt;NN &lt;/SPAN&gt;/ &lt;SPAN&gt;2&lt;/SPAN&gt;))&lt;BR /&gt;&lt;SPAN&gt;using namespace &lt;/SPAN&gt;&lt;SPAN&gt;std&lt;/SPAN&gt;;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;typedef struct &lt;/SPAN&gt;{&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;&lt;SPAN&gt;re&lt;/SPAN&gt;;&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;&lt;SPAN&gt;im&lt;/SPAN&gt;;&lt;BR /&gt;} &lt;SPAN&gt;mkl_double_complex&lt;/SPAN&gt;;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;fftw_plan &lt;/SPAN&gt;PlanForward;&lt;BR /&gt;&lt;SPAN&gt;fftw_plan &lt;/SPAN&gt;PlanInverse;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;int &lt;/SPAN&gt;&lt;SPAN&gt;getIntelFFTWPlans&lt;/SPAN&gt;(&lt;SPAN&gt;DFTI_DESCRIPTOR_HANDLE &lt;/SPAN&gt;*descHandle);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;static void &lt;/SPAN&gt;&lt;SPAN&gt;BM_FFTW&lt;/SPAN&gt;(&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::&lt;SPAN&gt;State&lt;/SPAN&gt;&amp;amp; state) {&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*forIN;&lt;BR /&gt;forIN = &lt;SPAN&gt;new double&lt;/SPAN&gt;[&lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;];&lt;BR /&gt;&lt;SPAN&gt;fftw_complex &lt;/SPAN&gt;*forOUT;&lt;BR /&gt;forOUT 
= (&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;*) fftw_malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;) * &lt;SPAN&gt;NPIXFFT&lt;/SPAN&gt;);&lt;BR /&gt;PlanForward = fftw_plan_dft_r2c_2d(&lt;SPAN&gt;NN&lt;/SPAN&gt;, &lt;SPAN&gt;NN&lt;/SPAN&gt;, forIN, forOUT, &lt;SPAN&gt;FFTW_MEASURE&lt;/SPAN&gt;);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;// Inverse plan&lt;BR /&gt;&lt;/SPAN&gt; &lt;SPAN&gt;fftw_complex &lt;/SPAN&gt;*invIN;&lt;BR /&gt;invIN = (&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;*) fftw_malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;) * &lt;SPAN&gt;NPIXFFT&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*invOUT;&lt;BR /&gt;invOUT = &lt;SPAN&gt;new double&lt;/SPAN&gt;[&lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;];&lt;BR /&gt;PlanInverse = fftw_plan_dft_c2r_2d(&lt;SPAN&gt;NN&lt;/SPAN&gt;, &lt;SPAN&gt;NN&lt;/SPAN&gt;, invIN, invOUT, &lt;SPAN&gt;FFTW_MEASURE&lt;/SPAN&gt;);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*image = (&lt;SPAN&gt;double&lt;/SPAN&gt;*)malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;double&lt;/SPAN&gt;) * &lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*recoveredImage = (&lt;SPAN&gt;double&lt;/SPAN&gt;*)malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;double&lt;/SPAN&gt;) * &lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;fftw_complex &lt;/SPAN&gt;*imageFT = (&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;*) &lt;SPAN&gt;mkl_malloc&lt;/SPAN&gt;(&lt;BR /&gt;&lt;SPAN&gt;NPIXFFT &lt;/SPAN&gt;* &lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;),&lt;SPAN&gt;64&lt;/SPAN&gt;);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;for &lt;/SPAN&gt;(&lt;SPAN&gt;unsigned int &lt;/SPAN&gt;i=&lt;SPAN&gt;0&lt;/SPAN&gt;; i&amp;lt;&lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;; i++){&lt;BR /&gt;image[i] = i * i + i * &lt;SPAN&gt;2 &lt;/SPAN&gt;+ &lt;SPAN&gt;1&lt;/SPAN&gt;;&lt;BR 
/&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;for &lt;/SPAN&gt;(&lt;SPAN&gt;auto &lt;/SPAN&gt;_ : state) {&lt;BR /&gt;&lt;SPAN&gt;// This code gets timed&lt;BR /&gt;&lt;/SPAN&gt; fftw_execute_dft_r2c( PlanForward, image, imageFT);&lt;BR /&gt;fftw_execute_dft_c2r( PlanInverse, imageFT, recoveredImage);&lt;BR /&gt;&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::DoNotOptimize(recoveredImage);&lt;BR /&gt;&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::ClobberMemory();&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;static void &lt;/SPAN&gt;&lt;SPAN&gt;BM_IntelFFT&lt;/SPAN&gt;(&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::&lt;SPAN&gt;State&lt;/SPAN&gt;&amp;amp; state) {&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;DFTI_DESCRIPTOR_HANDLE &lt;/SPAN&gt;descHandle;&lt;BR /&gt;getIntelFFTWPlans(&amp;amp;descHandle);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*image = (&lt;SPAN&gt;double&lt;/SPAN&gt;*)malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;double&lt;/SPAN&gt;) * &lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;double &lt;/SPAN&gt;*recoveredImage = (&lt;SPAN&gt;double&lt;/SPAN&gt;*)malloc(&lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;double&lt;/SPAN&gt;) * &lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;fftw_complex &lt;/SPAN&gt;*imageFT = (&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;*) &lt;SPAN&gt;mkl_malloc&lt;/SPAN&gt;(&lt;BR /&gt;&lt;SPAN&gt;NPIXFFT &lt;/SPAN&gt;* &lt;SPAN&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN&gt;fftw_complex&lt;/SPAN&gt;),&lt;SPAN&gt;64&lt;/SPAN&gt;);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;for &lt;/SPAN&gt;(&lt;SPAN&gt;unsigned int &lt;/SPAN&gt;i=&lt;SPAN&gt;0&lt;/SPAN&gt;; i&amp;lt;&lt;SPAN&gt;NN &lt;/SPAN&gt;* &lt;SPAN&gt;NN&lt;/SPAN&gt;; i++){&lt;BR /&gt;image[i] = i * i + i * &lt;SPAN&gt;2 &lt;/SPAN&gt;+ &lt;SPAN&gt;1&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;for &lt;/SPAN&gt;(&lt;SPAN&gt;auto &lt;/SPAN&gt;_ : state) {&lt;BR /&gt;&lt;SPAN&gt;// This code gets timed&lt;BR /&gt;&lt;/SPAN&gt; 
DftiComputeForward(descHandle, image, imageFT);&lt;BR /&gt;DftiComputeBackward(descHandle, imageFT, recoveredImage);&lt;BR /&gt;&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::DoNotOptimize(recoveredImage);&lt;BR /&gt;&lt;SPAN&gt;benchmark&lt;/SPAN&gt;::ClobberMemory();&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;&lt;SPAN&gt;// Register the function as a benchmark&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;BENCHMARK&lt;/SPAN&gt;(BM_IntelFFT);&lt;BR /&gt;&lt;SPAN&gt;BENCHMARK&lt;/SPAN&gt;(BM_FFTW);&lt;BR /&gt;&lt;SPAN&gt;// Run the benchmark&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;BENCHMARK_MAIN&lt;/SPAN&gt;();&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;int &lt;/SPAN&gt;&lt;SPAN&gt;getIntelFFTWPlans&lt;/SPAN&gt;(&lt;SPAN&gt;DFTI_DESCRIPTOR_HANDLE &lt;/SPAN&gt;*descHandle){&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;MKL_LONG &lt;/SPAN&gt;lengths[&lt;SPAN&gt;2&lt;/SPAN&gt;];&lt;BR /&gt;lengths[&lt;SPAN&gt;0&lt;/SPAN&gt;] = &lt;SPAN&gt;NN&lt;/SPAN&gt;;&lt;BR /&gt;lengths[&lt;SPAN&gt;1&lt;/SPAN&gt;] = &lt;SPAN&gt;NN&lt;/SPAN&gt;;&lt;BR /&gt;&lt;SPAN&gt;MKL_LONG &lt;/SPAN&gt;status = &lt;SPAN&gt;DftiCreateDescriptor&lt;/SPAN&gt;(descHandle, &lt;SPAN&gt;DFTI_DOUBLE&lt;/SPAN&gt;, &lt;SPAN&gt;DFTI_REAL&lt;/SPAN&gt;, &lt;SPAN&gt;2&lt;/SPAN&gt;, lengths);&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiCreateDescriptor failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;1&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_PLACEMENT&lt;/SPAN&gt;, &lt;SPAN&gt;DFTI_NOT_INPLACE&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_PLACEMENT failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status 
&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;2&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_THREAD_LIMIT&lt;/SPAN&gt;, &lt;SPAN&gt;1&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_THREAD_LIMIT failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;3&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_CONJUGATE_EVEN_STORAGE&lt;/SPAN&gt;, &lt;SPAN&gt;DFTI_COMPLEX_COMPLEX&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_CONJUGATE_EVEN_STORAGE failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;4&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_PACKED_FORMAT&lt;/SPAN&gt;, &lt;SPAN&gt;DFTI_CCE_FORMAT&lt;/SPAN&gt;);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_PACKED_FORMAT failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;5&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;MKL_LONG &lt;/SPAN&gt;strides[&lt;SPAN&gt;3&lt;/SPAN&gt;];&lt;BR /&gt;strides[&lt;SPAN&gt;0&lt;/SPAN&gt;] = &lt;SPAN&gt;0&lt;/SPAN&gt;;&lt;BR /&gt;strides[&lt;SPAN&gt;1&lt;/SPAN&gt;] = &lt;SPAN&gt;1&lt;/SPAN&gt;;&lt;BR 
/&gt;strides[&lt;SPAN&gt;2&lt;/SPAN&gt;] = &lt;SPAN&gt;NN&lt;/SPAN&gt;;&lt;BR /&gt;&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_INPUT_STRIDES&lt;/SPAN&gt;, strides);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_INPUT_STRIDES failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;6&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;status = DftiSetValue(*descHandle, &lt;SPAN&gt;DFTI_OUTPUT_STRIDES&lt;/SPAN&gt;, strides);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiSetValue DFTI_OUTPUT_STRIDES failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;7&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;MKL_LONG &lt;/SPAN&gt;format;&lt;BR /&gt;status = DftiGetValue(*descHandle, &lt;SPAN&gt;DFTI_PACKED_FORMAT&lt;/SPAN&gt;, &amp;amp;format);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiGetValue DFTI_PACKED_FORMAT failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;8&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;SPAN&gt;// cout &amp;lt;&amp;lt; "DftiGetValue DFTI_PACKED_FORMAT : " &amp;lt;&amp;lt; format &amp;lt;&amp;lt; endl;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt; status = DftiCommitDescriptor(*descHandle);&lt;BR /&gt;&lt;SPAN&gt;if &lt;/SPAN&gt;(status != &lt;SPAN&gt;0&lt;/SPAN&gt;) {&lt;BR /&gt;cout 
&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;&lt;SPAN&gt;"DftiCommitDescriptor failed : " &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;status &lt;SPAN&gt;&amp;lt;&amp;lt; &lt;/SPAN&gt;endl;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;-&lt;SPAN&gt;9&lt;/SPAN&gt;;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;return &lt;/SPAN&gt;status;&lt;BR /&gt;}&lt;/PRE&gt;
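&lt;DIV&gt;A note on checking the results of either benchmark: FFTW computes unnormalized transforms, and Intel MKL does the same when the forward and backward scale factors are left at their default of 1.0 (assumed here). A forward transform followed by a backward transform therefore returns the input multiplied by NN * NN. The sketch below shows one possible way to rescale and compare the buffers; the helper names rescale_roundtrip and max_abs_diff are hypothetical and operate on plain std::vector data rather than the buffers used above.&lt;/DIV&gt;
<![CDATA[
```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Divide an unnormalized forward+backward result by the number of points,
// so it can be compared against the original image.
// (Hypothetical helper, for illustration only.)
void rescale_roundtrip(std::vector<double>& data, std::size_t rows, std::size_t cols) {
    const double scale = 1.0 / static_cast<double>(rows * cols);
    for (double& v : data) v *= scale;
}

// Largest absolute difference between two equally sized buffers.
double max_abs_diff(const std::vector<double>& a, const std::vector<double>& b) {
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::fmax(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```
]]>
&lt;DIV&gt;If the difference stays large after rescaling, the descriptor or plan configuration (for example the strides) is a likely place to look before comparing timings.&lt;/DIV&gt;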
&lt;/DIV&gt;</description>
      <pubDate>Wed, 26 Apr 2023 05:34:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1480305#M34517</guid>
      <dc:creator>HarshM</dc:creator>
      <dc:date>2023-04-26T05:34:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1482234#M34548</link>
      <description>&lt;P style="text-align: justify;"&gt;Hi Harsh,&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;Thanks for posting in Intel communities.&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;We tried running your code with the given CMakeLists file after installing Google Benchmark and fftw3, but we are encountering errors (please see the attached log file).&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;Could you please let us know what we are missing here?&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;Thanks and Regards,&lt;/P&gt;
&lt;P style="text-align: justify;"&gt;Praneeth Achanta&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 02 May 2023 10:48:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1482234#M34548</guid>
      <dc:creator>PraneethA_Intel</dc:creator>
      <dc:date>2023-05-02T10:48:58Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1482239#M34549</link>
      <description>The error says that if you built the RELEASE version of Google Benchmark, you must also set CMAKE_BUILD_TYPE to RELEASE. If you want to compile in Debug mode, build Google Benchmark in Debug mode as well: just replace release with debug in the CMake commands when building it.</description>
      <pubDate>Tue, 02 May 2023 11:25:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1482239#M34549</guid>
      <dc:creator>HarshM</dc:creator>
      <dc:date>2023-05-02T11:25:44Z</dc:date>
    </item>
    <item>
      <title>Re:How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1484478#M34565</link>
      <description>&lt;P&gt;Hi Harsh,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for sharing the information. We were able to observe a similar issue at our end as well. We are looking into your issue internally and will get back to you soon with an update.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks and Regards,&lt;/P&gt;&lt;P&gt;Praneeth Achanta&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 09 May 2023 12:08:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1484478#M34565</guid>
      <dc:creator>PraneethA_Intel</dc:creator>
      <dc:date>2023-05-09T12:08:29Z</dc:date>
    </item>
    <item>
      <title>Re: Re:How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1484480#M34566</link>
      <description>I am glad to know you are able to run this example.&lt;BR /&gt;&lt;BR /&gt;I am looking forward to your response.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Harsh</description>
      <pubDate>Tue, 09 May 2023 12:16:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1484480#M34566</guid>
      <dc:creator>HarshM</dc:creator>
      <dc:date>2023-05-09T12:16:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1486943#M34592</link>
      <description>&lt;P&gt;After updating the strides as in the following example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unable-to-perform-2D-FFT-with-NX-NY-128/m-p/1478771/emcs_t/S2h8ZW1haWx8dG9waWNfc3Vic2NyaXB0aW9ufExHUERWVEs1OFZPREY3fDE0Nzg3NzF8U1VCU0NSSVBUSU9OU3xoSw#M7988" target="_blank"&gt;https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unable-to-perform-2D-FFT-with-NX-NY-128/m-p/1478771/emcs_t/S2h8ZW1haWx8dG9waWNfc3Vic2NyaXB0aW9ufExHUERWVEs1OFZPREY3fDE0Nzg3NzF8U1VCU0NSSVBUSU9OU3xoSw#M7988&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;// Input Strides&lt;BR /&gt;strides[0] = 0;&lt;BR /&gt;strides[1] = 1;&lt;BR /&gt;strides[2] = NN;&lt;BR /&gt;&lt;BR /&gt;//Output Strides&lt;BR /&gt;strides[0] = 0;&lt;BR /&gt;strides[1] = 1;&lt;BR /&gt;strides[2] = 1 + (NN / 2);&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The benchmark improved slightly: Intel MKL FFT now takes 0.6 ms, while FFTW takes 0.34 ms as usual.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;FFTW is still nearly 2 times faster than Intel MKL.&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2023 09:04:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1486943#M34592</guid>
      <dc:creator>HarshM</dc:creator>
      <dc:date>2023-05-17T09:04:52Z</dc:date>
    </item>
    <item>
      <title>Re:How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1487727#M34595</link>
      <description>&lt;P&gt;Hi Harsh,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for sharing the information. As informed earlier, we are looking into your issue internally and will get back to you soon with an update. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks and Regards,&lt;/P&gt;&lt;P&gt;Praneeth Achanta&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 19 May 2023 07:38:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1487727#M34595</guid>
      <dc:creator>PraneethA_Intel</dc:creator>
      <dc:date>2023-05-19T07:38:45Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1487812#M34597</link>
      <description>&lt;P&gt;Also, along with the solution, I would like to request a basic example program that performs a forward and backward FFT of a 2D image stored in row-major order, similar to the one I have given.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The major issue with Intel MKL, I believe, is that there is a lot of inertia in getting started. Although there is plenty of information and documentation, few examples are provided for common use cases; instead, four-line snippets are everywhere. The examples in the examples folder are also too convoluted for someone who is not very familiar with programming. In FFTW, for example, we just create a plan and execute it, with no need to set strides or other seemingly useless options; it's straightforward.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sorry if my comment is a bit rude, but I think an API that is easy to grasp always charms programmers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Harsh&lt;/P&gt;</description>
      <pubDate>Fri, 19 May 2023 12:53:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1487812#M34597</guid>
      <dc:creator>HarshM</dc:creator>
      <dc:date>2023-05-19T12:53:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1490385#M34608</link>
      <description>&lt;P&gt;Hi Harsh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We found the example "basic_dp_complex_dft_2d.c" in the Intel MKL examples folder "C:\Program Files (x86)\Intel\oneAPI\mkl\2023.1.0\examples\examples_core_c.zip\c\dft\source". Could you please let us know if this example meets your requirement?&lt;/P&gt;
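&lt;P&gt;For the real-input, row-major case raised earlier in this thread, the stride bookkeeping can be sketched without any library calls. The layout below follows what we understand to be the convention in the real 2D C examples shipped with oneMKL (real data strides {0, cols, 1}, conjugate-even complex data strides {0, cols/2 + 1, 1}); please treat that as an assumption and check it against the installed examples. The names Strides, real_strides, cce_strides, cce_elements and linear_index are hypothetical, and long stands in for MKL_LONG.&lt;/P&gt;
<![CDATA[
```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stride triple in the {offset, stride_dim0, stride_dim1} form used for
// DFTI_INPUT_STRIDES / DFTI_OUTPUT_STRIDES on a 2D descriptor.
// (long is a stand-in for MKL_LONG; all names here are hypothetical.)
using Strides = std::array<long, 3>;

// Real-domain strides for an image with `cols` columns stored row-major.
Strides real_strides(long cols) { return Strides{{0, cols, 1}}; }

// Complex-domain strides with conjugate-even storage:
// each row keeps only cols/2 + 1 complex coefficients.
Strides cce_strides(long cols) { return Strides{{0, cols / 2 + 1, 1}}; }

// Number of complex elements the spectrum buffer must hold.
long cce_elements(long rows, long cols) { return rows * (cols / 2 + 1); }

// Linear index of element (r, c) under a given stride triple.
long linear_index(const Strides& s, long r, long c) {
    return s[0] + r * s[1] + c * s[2];
}
```
]]>
&lt;P&gt;For NN = 256 this gives 256 * 129 = 33024 complex elements, which matches the NPIXFFT buffer size in the posted code. As far as we understand the documentation, an out-of-place real transform also needs the input and output strides swapped, and the descriptor committed again, before the backward transform; a second descriptor for the backward direction is one way to avoid doing that inside the timed loop.&lt;/P&gt;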
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;gt;&amp;gt;The examples given in the examples folder also are too convoluted to understand who is not so familiar with programming&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Thank you for the input; we have passed on your feedback regarding sample quality to the dev team.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks and Regards,&lt;/P&gt;
&lt;P&gt;Praneeth Achanta&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 26 May 2023 15:31:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1490385#M34608</guid>
      <dc:creator>PraneethA_Intel</dc:creator>
      <dc:date>2023-05-26T15:31:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure intel MKL FFT for best performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1500235#M34716</link>
      <description>&lt;P&gt;Hi Harsh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for helping us improve our products! We’ve submitted the feature request to the dev team. They will consider it based on multiple factors including, but not limited to, the priority and criticality of the feature. Once it is included in an upcoming release, it will be documented in the &lt;A href="https://www.intel.com/content/www/us/en/developer/articles/release-notes/onemkl-release-notes.html" target="_self"&gt;release notes&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks and Regards,&lt;/P&gt;
&lt;P&gt;Praneeth Achanta&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jun 2023 05:21:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-configure-intel-MKL-FFT-for-best-performance/m-p/1500235#M34716</guid>
      <dc:creator>PraneethA_Intel</dc:creator>
      <dc:date>2023-06-29T05:21:35Z</dc:date>
    </item>
  </channel>
</rss>

