<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic 1D mkl FFT Multithread Use in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774718#M913</link>
    <description>&lt;P&gt;Hi Victor,&lt;BR /&gt;&lt;BR /&gt;The timings I cited exclude set-up time, so I guess your PC is a bit faster than mine.&lt;BR /&gt;&lt;BR /&gt;I did link with the libraries as suggested. My compile and link lines are shown below.&lt;BR /&gt;&lt;BR /&gt;(Is mkl_dfti.f90 for multi-threaded use?)&lt;BR /&gt;&lt;BR /&gt;ifort -c modules.f mkl_dfti.f90&lt;BR /&gt;ifort -extend_source -nowarn -align -Qzero -QxSSE2 -Qsave -Qopenmp -MT -Qmkl -c *.f&lt;BR /&gt;ifort -MT *.obj mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib /Qopenmp&lt;BR /&gt;&lt;BR /&gt;Would it be possible to show me how you linked your code?&lt;/P&gt;&lt;P&gt;My code is written in Fortran.&lt;BR /&gt;&lt;BR /&gt;I wonder if there is an issue in the way I am calling the MKL FFTs.&lt;BR /&gt;&lt;BR /&gt;Is there a specific way of setting up the MKL calls?&lt;/P&gt;&lt;P&gt;I call and time the forward and backward FFTs with:&lt;/P&gt;&lt;P&gt;etime_in1 = etime(rtm)&lt;/P&gt;&lt;P&gt;call zzfft(-1,n,1.d0,mctime,mctime,Table,Wsave,ISYS)&lt;/P&gt;&lt;P&gt;call zzfft(1,n,1.d0,mctime,mctime,Table,Wsave,ISYS)&lt;BR /&gt;&lt;BR /&gt;etime_out1 = etime(rtm)&lt;/P&gt;&lt;P&gt;I set up the 1D FFT with the following:&lt;/P&gt;&lt;P&gt;if(ndir.eq.0)then&lt;BR /&gt;status = DftiFreeDescriptor(Desc_Handle)&lt;BR /&gt;status = dfticreatedescriptor(desc_handle, 36, 32, 1, n)&lt;BR /&gt;status = dfticommitdescriptor(desc_handle)&lt;BR /&gt;end if&lt;/P&gt;&lt;P&gt;c the forward FFT is called with Exy, which is complex*16&lt;BR /&gt;&lt;BR /&gt;if(ndir.eq.-1)then&lt;BR /&gt;status = DftiComputeForward(Desc_Handle,Exy)&lt;BR /&gt;end if&lt;/P&gt;&lt;P&gt;if(ndir.eq.1)then&lt;BR /&gt;status = DftiComputeBackward(Desc_Handle,Exy)&lt;BR /&gt;end if&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Wed, 30 Mar 2011 05:37:34 GMT</pubDate>
    <dc:creator>dfishman</dc:creator>
    <dc:date>2011-03-30T05:37:34Z</dc:date>
    <item>
      <title>1D mkl FFT Multithread Use</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774716#M911</link>
      <description>&lt;P&gt;Hi, March 28, 2011&lt;BR /&gt;&lt;BR /&gt;I want to use the 1D MKL FFT (w_mkl_10.3.2.154, w_ccompxe_2011.2.154) in a multi-threaded application, but I noticed that the FFT does not run multithreaded.&lt;BR /&gt;&lt;BR /&gt;For example, I am running timing tests with a 2^20-point FFT, and a forward or backward transform takes about 28 milliseconds.&lt;BR /&gt;&lt;BR /&gt;I get this timing with 1 CPU or with 8 CPUs.&lt;BR /&gt;&lt;BR /&gt;Does anyone have experience with 1D FFTs who could share their FFT code with me? Perhaps I am not calling the primitives correctly.&lt;BR /&gt;&lt;BR /&gt;My calling sequence is shown below, where n = 2^20 and Exy is the complex double-precision array.&lt;/P&gt;&lt;P&gt;type(DFTI_descriptor), pointer :: desc_handle&lt;/P&gt;&lt;P&gt;integer :: status&lt;/P&gt;&lt;P&gt;complex*16 Exy(N_Bitpnt),Exy2(N_Bitpnt)&lt;/P&gt;&lt;P&gt;status = DftiFreeDescriptor(Desc_Handle)&lt;BR /&gt;status = dfticreatedescriptor(desc_handle, 36, 32, 1, n)&lt;BR /&gt;status = dfticommitdescriptor(desc_handle)&lt;BR /&gt;status = DftiComputeForward(Desc_Handle,Exy)&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2011 15:44:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774716#M911</guid>
      <dc:creator>dfishman</dc:creator>
      <dc:date>2011-03-28T15:44:18Z</dc:date>
    </item>
    <item>
      <title>1D mkl FFT Multithread Use</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774717#M912</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;Did you link your test with the Intel threading layer together with the OpenMP library?&lt;BR /&gt;Please check your link line with &lt;A href="http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/"&gt;http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;For an out-of-place double-precision 2^20 transform, I see the following performance on my machine using MKL 10.3.3 (intel64):&lt;BR /&gt;&lt;BR /&gt;- with env MKL_NUM_THREADS=1 (or OMP_NUM_THREADS=1)&lt;BR /&gt; Problem: 1048576, setup: 13.15 ms, time: 21.97 ms, ``gflops'': 4.7736&lt;BR /&gt;&lt;BR /&gt;- with env MKL_NUM_THREADS=8 (or OMP_NUM_THREADS=8)&lt;BR /&gt; Problem: 1048576, setup: 46.12 ms, time: 11.44 ms, ``gflops'': 9.1675&lt;BR /&gt;&lt;BR /&gt;For an in-place double-precision 2^20 transform, I see the following performance:&lt;BR /&gt;&lt;BR /&gt;- with env MKL_NUM_THREADS=1 (or OMP_NUM_THREADS=1)&lt;BR /&gt; Problem: i1048576, setup: 10.60 ms, time: 20.71 ms, ``gflops'': 5.0619&lt;BR /&gt;&lt;BR /&gt;- with env MKL_NUM_THREADS=8 (or OMP_NUM_THREADS=8)&lt;BR /&gt; Problem: i1048576, setup: 347.00 us, time: 6.64 ms, ``gflops'': 15.788</description>
      <pubDate>Tue, 29 Mar 2011 05:50:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774717#M912</guid>
      <dc:creator>barragan_villanueva_</dc:creator>
      <dc:date>2011-03-29T05:50:44Z</dc:date>
    </item>
    <item>
      <title>1D mkl FFT Multithread Use</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774718#M913</link>
      <description>&lt;P&gt;Hi Victor,&lt;BR /&gt;&lt;BR /&gt;The timings I cited exclude set-up time, so I guess your PC is a bit faster than mine.&lt;BR /&gt;&lt;BR /&gt;I did link with the libraries as suggested. My compile and link lines are shown below.&lt;BR /&gt;&lt;BR /&gt;(Is mkl_dfti.f90 for multi-threaded use?)&lt;BR /&gt;&lt;BR /&gt;ifort -c modules.f mkl_dfti.f90&lt;BR /&gt;ifort -extend_source -nowarn -align -Qzero -QxSSE2 -Qsave -Qopenmp -MT -Qmkl -c *.f&lt;BR /&gt;ifort -MT *.obj mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib /Qopenmp&lt;BR /&gt;&lt;BR /&gt;Would it be possible to show me how you linked your code?&lt;/P&gt;&lt;P&gt;My code is written in Fortran.&lt;BR /&gt;&lt;BR /&gt;I wonder if there is an issue in the way I am calling the MKL FFTs.&lt;BR /&gt;&lt;BR /&gt;Is there a specific way of setting up the MKL calls?&lt;/P&gt;&lt;P&gt;I call and time the forward and backward FFTs with:&lt;/P&gt;&lt;P&gt;etime_in1 = etime(rtm)&lt;/P&gt;&lt;P&gt;call zzfft(-1,n,1.d0,mctime,mctime,Table,Wsave,ISYS)&lt;/P&gt;&lt;P&gt;call zzfft(1,n,1.d0,mctime,mctime,Table,Wsave,ISYS)&lt;BR /&gt;&lt;BR /&gt;etime_out1 = etime(rtm)&lt;/P&gt;&lt;P&gt;I set up the 1D FFT with the following:&lt;/P&gt;&lt;P&gt;if(ndir.eq.0)then&lt;BR /&gt;status = DftiFreeDescriptor(Desc_Handle)&lt;BR /&gt;status = dfticreatedescriptor(desc_handle, 36, 32, 1, n)&lt;BR /&gt;status = dfticommitdescriptor(desc_handle)&lt;BR /&gt;end if&lt;/P&gt;&lt;P&gt;c the forward FFT is called with Exy, which is complex*16&lt;BR /&gt;&lt;BR /&gt;if(ndir.eq.-1)then&lt;BR /&gt;status = DftiComputeForward(Desc_Handle,Exy)&lt;BR /&gt;end if&lt;/P&gt;&lt;P&gt;if(ndir.eq.1)then&lt;BR /&gt;status = DftiComputeBackward(Desc_Handle,Exy)&lt;BR /&gt;end if&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 30 Mar 2011 05:37:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774718#M913</guid>
      <dc:creator>dfishman</dc:creator>
      <dc:date>2011-03-30T05:37:34Z</dc:date>
    </item>
    <item>
      <title>1D mkl FFT Multithread Use</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774719#M914</link>
      <description>Hi Victor,&lt;BR /&gt;&lt;BR /&gt;Please note that I am running ia32 on a Windows XP x64 OS.</description>
      <pubDate>Thu, 31 Mar 2011 04:48:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774719#M914</guid>
      <dc:creator>dfishman</dc:creator>
      <dc:date>2011-03-31T04:48:23Z</dc:date>
    </item>
    <item>
      <title>1D mkl FFT Multithread Use</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774720#M915</link>
      <description>Hi Victor,&lt;BR /&gt;&lt;BR /&gt;I think I found the problem. I was not setting MKL_NUM_THREADS to the number of CPUs when initializing the FFT; i.e., I was always setting MKL_NUM_THREADS=1 for the initialization step, even though I was setting MKL_NUM_THREADS &amp;gt; 1 for the actual forward or backward FFT operation.&lt;BR /&gt;&lt;BR /&gt;However, the FFT speed-up is only 2x for a 2^18 FFT and only 33% for 2^20, while you are showing a 300% improvement for 2^20.&lt;BR /&gt;&lt;BR /&gt;Do you know why this might happen? Is it CPU or cache dependent?&lt;BR /&gt;&lt;BR /&gt;I am using an ia32 machine with two quad-core 5590 3.3 GHz CPUs. The L2 cache in my machine is 12 MB.&lt;BR /&gt;&lt;BR /&gt;Thanks.</description>
      <pubDate>Wed, 06 Apr 2011 23:51:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/1D-mkl-FFT-Multithread-Use/m-p/774720#M915</guid>
      <dc:creator>dfishman</dc:creator>
      <dc:date>2011-04-06T23:51:33Z</dc:date>
    </item>
  </channel>
</rss>

