<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic complex FFT vs real FFT, which one is faster? in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783237#M1724</link>
    <description>Real FFTs should be faster. A real FFT can assume that all imaginary input (or output) values are 0, which enables some extra optimizations.</description>
    <pubDate>Thu, 08 Jul 2010 22:40:22 GMT</pubDate>
    <dc:creator>piet_de_weer</dc:creator>
    <dc:date>2010-07-08T22:40:22Z</dc:date>
    <item>
      <title>complex FFT vs real FFT, which one is faster?</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783236#M1723</link>
      <description>&lt;BR /&gt;Hello everyone!&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;These days I have been playing around with the IPP FFT functions.&lt;BR /&gt;&lt;BR /&gt;In particular, I am trying to compare the performance of a complex FFT and a real FFT (1D, single precision).&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;B&gt;I have already read that complex FFTs are faster than real FFTs, for instance here:&lt;/B&gt;&lt;BR /&gt;&lt;BR /&gt;
1. Hello, according to our expert, the amount of data processed by the functions ippsFFTFwd_CToC_32fc and ippsFFTFwd_RToPerm is the same, but the latter needs more calculations. That is why it runs slower for the same buffer size. Regards, Vladimir&lt;BR /&gt;&lt;BR /&gt;
2. Also, according to this speed benchmark, &lt;A href="http://www.fftw.org/speed/CoreDuo-3.0GHz-icc/" target="_blank"&gt;http://www.fftw.org/speed/CoreDuo-3.0GHz-icc/&lt;/A&gt;, the complex FFT seems to be 30% faster than the real FFT.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;B&gt;However, when I test both on my Core 2 Duo 1.8 GHz, the real FFT is faster...&lt;BR /&gt;&lt;/B&gt;&lt;BR /&gt;-- IPP FFT 1D speed test (fft size: 256) --&lt;BR /&gt;&lt;BR /&gt;fft_1D_real: 1620.1 clocks per FFT &amp;gt; FASTER !&lt;BR /&gt;fft_1D_complex: 2261.6 clocks per FFT&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Did I misunderstand something?&lt;BR /&gt;&lt;BR /&gt;Thanks for your help! ++Josh&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Here is my test code:&lt;BR /&gt;
&lt;PRE&gt;[cpp]#include &amp;lt;cstdio&amp;gt;&lt;BR /&gt;#include &amp;lt;cstdlib&amp;gt;&lt;BR /&gt;#include &amp;lt;cmath&amp;gt;&lt;BR /&gt;#include &amp;lt;conio.h&amp;gt;&lt;BR /&gt;&lt;BR /&gt;#include "ipp.h"&lt;BR /&gt;&lt;BR /&gt;#define FFT_SIZE      256&lt;BR /&gt;#define FFT_ORDER      8       // log2(FFT_SIZE)&lt;BR /&gt;#define LOOP_MAX      200000&lt;BR /&gt;&lt;BR /&gt;using namespace std;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;// REAL FFT&lt;BR /&gt;&lt;BR /&gt;IppStatus fft_ipp_real()&lt;BR /&gt;{&lt;BR /&gt;    int i, loop;&lt;BR /&gt;    IppStatus status;&lt;BR /&gt;    IppsFFTSpec_R_32f *mySpecReal;&lt;BR /&gt;&lt;BR /&gt;    Ipp32f *input  = ippsMalloc_32f(FFT_SIZE);&lt;BR /&gt;    // CCS output holds FFT_SIZE/2 + 1 complex values, i.e. FFT_SIZE + 2 floats&lt;BR /&gt;    Ipp32f *output = ippsMalloc_32f(FFT_SIZE + 2);&lt;BR /&gt;&lt;BR /&gt;    for(i = 0; i &amp;lt; FFT_SIZE; i++)&lt;BR /&gt;    {&lt;BR /&gt;        input[i] = (float)cos(0.8 * i);&lt;BR /&gt;    }&lt;BR /&gt;&lt;BR /&gt;    status = ippsFFTInitAlloc_R_32f(&amp;amp;mySpecReal, FFT_ORDER, IPP_FFT_NODIV_BY_ANY, ippAlgHintFast);&lt;BR /&gt;&lt;BR /&gt;    // Optional external work buffer for the transform&lt;BR /&gt;    int bufferSize = 0;&lt;BR /&gt;    status = ippsFFTGetBufSize_R_32f(mySpecReal, &amp;amp;bufferSize);&lt;BR /&gt;    Ipp8u *myBuffer = (bufferSize &amp;gt; 0 ? ippsMalloc_8u(bufferSize) : NULL);&lt;BR /&gt;&lt;BR /&gt;    for(loop = 0; loop &amp;lt; LOOP_MAX; loop++)&lt;BR /&gt;    {&lt;BR /&gt;        status = ippsFFTFwd_RToCCS_32f(input, output, mySpecReal, myBuffer);&lt;BR /&gt;    }&lt;BR /&gt;&lt;BR /&gt;    ippsFFTFree_R_32f(mySpecReal);&lt;BR /&gt;    ippsFree(myBuffer);&lt;BR /&gt;    ippsFree(input);&lt;BR /&gt;    ippsFree(output);&lt;BR /&gt;&lt;BR /&gt;    return status;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;// COMPLEX FFT&lt;BR /&gt;&lt;BR /&gt;IppStatus fft_ipp_complex()&lt;BR /&gt;{&lt;BR /&gt;    int i, loop;&lt;BR /&gt;    IppStatus status;&lt;BR /&gt;    IppsFFTSpec_C_32fc *mySpecComplex;&lt;BR /&gt;&lt;BR /&gt;    Ipp32fc *input  = ippsMalloc_32fc(FFT_SIZE);&lt;BR /&gt;    Ipp32fc *output = ippsMalloc_32fc(FFT_SIZE + 2);&lt;BR /&gt;&lt;BR /&gt;    // Same real signal as above, with the imaginary part set to zero&lt;BR /&gt;    for(i = 0; i &amp;lt; FFT_SIZE; i++)&lt;BR /&gt;    {&lt;BR /&gt;       input[i].re = (float)cos(0.8 * i);&lt;BR /&gt;       input[i].im = 0;&lt;BR /&gt;    }&lt;BR /&gt;&lt;BR /&gt;    status = ippsFFTInitAlloc_C_32fc(&amp;amp;mySpecComplex, FFT_ORDER, IPP_FFT_NODIV_BY_ANY, ippAlgHintFast);&lt;BR /&gt;&lt;BR /&gt;    int bufferSize = 0;&lt;BR /&gt;    status = ippsFFTGetBufSize_C_32fc(mySpecComplex, &amp;amp;bufferSize);&lt;BR /&gt;    Ipp8u *myBuffer = (bufferSize &amp;gt; 0 ? ippsMalloc_8u(bufferSize) : NULL);&lt;BR /&gt;&lt;BR /&gt;    for(loop = 0; loop &amp;lt; LOOP_MAX; loop++)&lt;BR /&gt;    {&lt;BR /&gt;       status = ippsFFTFwd_CToC_32fc(input, output, mySpecComplex, myBuffer);&lt;BR /&gt;    }&lt;BR /&gt;&lt;BR /&gt;    ippsFFTFree_C_32fc(mySpecComplex);&lt;BR /&gt;    ippsFree(myBuffer);&lt;BR /&gt;    ippsFree(input);&lt;BR /&gt;    ippsFree(output);&lt;BR /&gt;&lt;BR /&gt;    return status;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;// THE MAIN&lt;BR /&gt;&lt;BR /&gt;int main(int argc, char **argv)&lt;BR /&gt;{&lt;BR /&gt;  Ipp64u startClocks;&lt;BR /&gt;&lt;BR /&gt;  ippStaticInit();&lt;BR /&gt;  printf("\n-- IPP FFT 1D speed test (fft size: %d) --\n\n", FFT_SIZE);&lt;BR /&gt;&lt;BR /&gt;  // Each measurement also includes setup and teardown, amortized over LOOP_MAX transforms&lt;BR /&gt;  startClocks = ippGetCpuClocks();&lt;BR /&gt;  fft_ipp_real();&lt;BR /&gt;  printf("fft_1D_real: %.1f clocks per FFT\n", (float)(ippGetCpuClocks() - startClocks) / (float)LOOP_MAX);&lt;BR /&gt;&lt;BR /&gt;  startClocks = ippGetCpuClocks();&lt;BR /&gt;  fft_ipp_complex();&lt;BR /&gt;  printf("fft_1D_complex: %.1f clocks per FFT", (float)(ippGetCpuClocks() - startClocks) / (float)LOOP_MAX);&lt;BR /&gt;&lt;BR /&gt;  printf("\n\nPress any key to exit...\n");&lt;BR /&gt;  getch();&lt;BR /&gt;&lt;BR /&gt;  return 0;&lt;BR /&gt;}&lt;BR /&gt;[/cpp]&lt;/PRE&gt;</description>
      <pubDate>Thu, 08 Jul 2010 16:31:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783236#M1723</guid>
      <dc:creator>josh83</dc:creator>
      <dc:date>2010-07-08T16:31:30Z</dc:date>
    </item>
    <item>
      <title>complex FFT vs real FFT, which one is faster?</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783237#M1724</link>
      <description>Real FFTs should be faster. A real FFT can assume that all imaginary input (or output) values are 0, which enables some extra optimizations.</description>
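      <content:encoded><![CDATA[
One way to see where those optimizations come from: for real input, the transform is conjugate-symmetric, X[n-k] = conj(X[k]), so only n/2 + 1 complex outputs carry information. That is consistent with the FFT_SIZE + 2 floats the question allocates for the CCS output. The sketch below checks the symmetry with a naive O(n^2) DFT. It only illustrates the property and makes no claim about how IPP computes its transforms; the names naive_dft, max_symmetry_error and test_signal are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(n^2) DFT of a real signal. Illustration only: a real FFT library
// uses a fast algorithm, not this double loop.
std::vector<std::complex<double>> naive_dft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(n > 0);
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> X(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        X[k] = sum;
    }
    return X;
}

// Largest |X[n-k] - conj(X[k])| over k = 1..n-1.
// For real input this stays at rounding-error level.
double max_symmetry_error(const std::vector<std::complex<double>>& X) {
    const std::size_t n = X.size();
    double worst = 0.0;
    for (std::size_t k = 1; k < n; ++k) {
        worst = std::max(worst, std::abs(X[n - k] - std::conj(X[k])));
    }
    return worst;
}

// Same test signal as in the question: cos(0.8 * i).
std::vector<double> test_signal(std::size_t n) {
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = std::cos(0.8 * static_cast<double>(i));
    }
    return x;
}
```

Calling max_symmetry_error(naive_dft(test_signal(16))) should return a value at rounding-error level, which is why a real transform can skip computing (and storing) roughly half of the outputs.
      ]]></content:encoded>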
      <pubDate>Thu, 08 Jul 2010 22:40:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783237#M1724</guid>
      <dc:creator>piet_de_weer</dc:creator>
      <dc:date>2010-07-08T22:40:22Z</dc:date>
    </item>
    <item>
      <title>complex FFT vs real FFT, which one is faster?</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783238#M1725</link>
      <description>&lt;P&gt;Real FFT functions are faster than complex FFT functions for the same transform size (their execution time is lower) because they perform about half as many calculations.&lt;/P&gt;&lt;P&gt;The performance figures for FFT functions at &lt;A href="http://www.fftw.org"&gt;www.fftw.org&lt;/A&gt; are calculated with the following formulas:&lt;/P&gt;&lt;P&gt;5*n*log2(n)/time - for complex FFT&lt;/P&gt;&lt;P&gt;5*n*log2(n)/time/2 - for real FFT&lt;/P&gt;&lt;P&gt;Therefore, if the execution time of a real FFT function is more than half (0.5x) of the execution time of the complex FFT function, its reported performance is lower, even though the real FFT itself finishes sooner.&lt;/P&gt;</description>
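      <content:encoded><![CDATA[
As a worked example, the two formulas can be applied to the timings from the question (1620.1 and 2261.6 clocks for n = 256). This is a rough sketch under two assumptions that are not confirmed in the thread: that the clock counter ticks at the nominal 1.8 GHz, and that time is expressed in microseconds so the result reads as "mflops". The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// FFTW-style figure for a complex transform of n points.
// time_us is the time of one transform in microseconds (assumed unit).
double mflops_complex(double n, double time_us) {
    assert(n > 0.0 && time_us > 0.0);
    return 5.0 * n * std::log2(n) / time_us;
}

// Same formula divided by 2, as used for real transforms.
double mflops_real(double n, double time_us) {
    return mflops_complex(n, time_us) / 2.0;
}

// Convert a clock count to microseconds; clock_mhz is the assumed
// counter frequency in MHz (1800.0 for a nominal 1.8 GHz CPU).
double clocks_to_us(double clocks, double clock_mhz) {
    assert(clock_mhz > 0.0);
    return clocks / clock_mhz;
}
```

With these helpers, mflops_real(256, clocks_to_us(1620.1, 1800.0)) comes out lower than mflops_complex(256, clocks_to_us(2261.6, 1800.0)), because the real transform took about 0.72x the time of the complex one rather than 0.5x or less. The real FFT is still the faster call; only the normalized figure is lower.
      ]]></content:encoded>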
      <pubDate>Fri, 16 Jul 2010 09:01:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/complex-FTT-vs-real-FFT-which-one-is-faster/m-p/783238#M1725</guid>
      <dc:creator>igorastakhov</dc:creator>
      <dc:date>2010-07-16T09:01:54Z</dc:date>
    </item>
  </channel>
</rss>

