<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic IPP FFT performance not improved with multiple threads in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-FFT-performance-no-improved-with-multiple-threads/m-p/816760#M4347</link>
    <description>&lt;P&gt;I have a problem with the FFT function ippsFFTFwd_CToC_32fc (IPP version 7.0). The FFT length is 2^19. According to ThreadedFunctionsList.txt, "ippsFFTFwd_CToC_32fc" is threaded.&lt;/P&gt;&lt;P&gt;I run it on a 12-core machine (2 x L5640, 6 cores each) through Parallel Studio and Visual Studio 2010 under 64-bit Windows Server 2008.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;I see that only one core is working&lt;/STRONG&gt;, even though I did everything described in the documentation.&lt;/P&gt;&lt;P&gt;By contrast, the direct FIR function is parallelized very well.&lt;/P&gt;&lt;P&gt;Can you help me with the FFT?&lt;/P&gt;</description>
    <pubDate>Thu, 26 May 2011 15:33:22 GMT</pubDate>
    <dc:creator>arkgr</dc:creator>
    <dc:date>2011-05-26T15:33:22Z</dc:date>
    <item>
      <title>IPP FFT performance not improved with multiple threads</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-FFT-performance-no-improved-with-multiple-threads/m-p/816760#M4347</link>
      <description>&lt;P&gt;I have a problem with the FFT function ippsFFTFwd_CToC_32fc (IPP version 7.0). The FFT length is 2^19. According to ThreadedFunctionsList.txt, "ippsFFTFwd_CToC_32fc" is threaded.&lt;/P&gt;&lt;P&gt;I run it on a 12-core machine (2 x L5640, 6 cores each) through Parallel Studio and Visual Studio 2010 under 64-bit Windows Server 2008.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;I see that only one core is working&lt;/STRONG&gt;, even though I did everything described in the documentation.&lt;/P&gt;&lt;P&gt;By contrast, the direct FIR function is parallelized very well.&lt;/P&gt;&lt;P&gt;Can you help me with the FFT?&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2011 15:33:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-FFT-performance-no-improved-with-multiple-threads/m-p/816760#M4347</guid>
      <dc:creator>arkgr</dc:creator>
      <dc:date>2011-05-26T15:33:22Z</dc:date>
    </item>
    <item>
      <title>IPP FFT performance not improved with multiple threads</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-FFT-performance-no-improved-with-multiple-threads/m-p/816761#M4348</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;This looks like a problem we discussed in the forum before. Here are some comments on performance from the function expert:&lt;/P&gt;&lt;P&gt;1) For rather small FFT orders (&amp;lt; ~19, depending on the platform's cache size), the FFT function uses a memory buffer roughly equal to the vector length. For such orders there is therefore no performance difference between the in-place and out-of-place cases: the FFT is calculated in the buffer and the result is then copied to the destination, so for in-cache cases it doesn't matter whether it is copied to the src or the dst vector. For rather large orders (&amp;gt; 19), the in-place version is faster because internally the FFT uses a smaller buffer (less than the input vector length). I think that the HDD case should not be discussed here.&lt;/P&gt;&lt;P&gt;2) The FFT is threaded only for cases that fit into the shared L2 cache, only on Core2 CPUs, and only on 2 threads. For small orders the OMP overhead is greater than the benefit; for large (out-of-cache) orders, memory effects play a negative role. So the customer's investigation is right: there is no threading for order 19 and above.&lt;/P&gt;&lt;P&gt;Thanks,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Fri, 27 May 2011 08:00:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-FFT-performance-no-improved-with-multiple-threads/m-p/816761#M4348</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2011-05-27T08:00:47Z</dc:date>
    </item>
  </channel>
</rss>

