<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using Intel IPP with TBB in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779360#M1399</link>
    <description>&lt;DIV&gt;Hi chao:&lt;/DIV&gt;&lt;DIV&gt;  Thank you for your answer. You are right!&lt;/DIV&gt;&lt;DIV&gt;Another question: for floating-point functions, I did not see this type of FIR: complex input data with real filter coefficients. This is very common in many applications.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Best Regards,&lt;/DIV&gt;&lt;DIV&gt;Sun Cao&lt;/DIV&gt;</description>
    <pubDate>Fri, 27 Jul 2012 08:08:33 GMT</pubDate>
    <dc:creator>caosun</dc:creator>
    <dc:date>2012-07-27T08:08:33Z</dc:date>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779353#M1392</link>
      <description>I am trying to use TBB and IPP together to gain speed performance.&lt;BR /&gt;I use TBB to do the filtering with the IPP function "ippsFIR_32fc"; each thread works on a portion of the data. But the results are quite strange: I see a lot of glitches (very large values) in the output data.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The code is as follows:&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;parallel_for(tbb::blocked_range&amp;lt;size_t&amp;gt; (0, inV.Length, inV.Length/1.5), tbb_parallel_fir_task((Ipp32fc *)inV.Data, filterCoefCP, filterVP-&amp;gt;Length, (Ipp32fc *)outVP-&amp;gt;Data, m_stateP));&lt;/P&gt;&lt;P&gt;void operator() (const blocked_range&amp;lt;size_t&amp;gt;&amp;amp; r) const&lt;BR /&gt;{&lt;/P&gt;&lt;P&gt;Int begin = r.begin();&lt;/P&gt;&lt;P&gt;Int end = r.end();&lt;/P&gt;&lt;P&gt;Int nIters = end - begin;&lt;/P&gt;&lt;P&gt;ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);&lt;/P&gt;&lt;P&gt;}&lt;BR /&gt;&lt;BR /&gt;If I replace the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multithreaded copy works fine.&lt;/P&gt;&lt;P&gt;Another question: for floating-point functions, I did not see this type of FIR: complex input data with real filter coefficients. I do see complex input data with complex filter coefficients, or real input data with real filter coefficients.&lt;/P&gt;&lt;P&gt;Note: I have already used the function 'ippSetNumThreads(1)' to set the number of IPP internal OpenMP threads to 1.&lt;BR /&gt;Could you please help me?&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jul 2012 09:57:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779353#M1392</guid>
      <dc:creator>caosun</dc:creator>
      <dc:date>2012-07-24T09:57:04Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779354#M1393</link>
      <description>I haven't used the ippsFIR routines before, but I think you may need to consider boundary conditions. Can you really break the sample space up evenly, or do you need to include some overlap due to the filtering?&lt;BR /&gt;&lt;BR /&gt;Also, if you're using IPP with TBB, I'd recommend linking with the unthreaded static libs instead of the DLLs. I don't think there's a way to completely disable OpenMP when linking with the DLLs, and this is why you need to call ippSetNumThreads(1).&lt;BR /&gt;&lt;BR /&gt;Peter&lt;BR /&gt;</description>
      <pubDate>Tue, 24 Jul 2012 11:41:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779354#M1393</guid>
      <dc:creator>pvonkaenel</dc:creator>
      <dc:date>2012-07-24T11:41:50Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779355#M1394</link>
      <description>I did an experiment:&lt;DIV&gt;  1. I set the grain size so that only two TBB threads are used; each TBB thread processes half of the data.&lt;/DIV&gt;&lt;DIV&gt;  2. If only the first half of the data (first TBB thread) is filtered and the second TBB thread does nothing, the result is fine (the first half of the output is correct; the second half is not calculated).&lt;/DIV&gt;&lt;DIV&gt;  3. If only the second half of the data is filtered, the second half of the output is correct.&lt;/DIV&gt;&lt;DIV&gt;  4. If the two TBB threads work together, the results are completely wrong, with a lot of glitches; some values are as large as 1e34 (the correct values should be less than 1).&lt;/DIV&gt;</description>
      <pubDate>Wed, 25 Jul 2012 01:12:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779355#M1394</guid>
      <dc:creator>caosun</dc:creator>
      <dc:date>2012-07-25T01:12:00Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779356#M1395</link>
      <description>&lt;P&gt;I don't see anything wrong with the IPP call. It looks correct, and you are using a version of the API that can be multi-threaded (any of the ippsFIR APIs with SrcDst parameters cannot be multi-threaded). But what is that tile size in the parallel_for? inV.Length/1.5? Try taking the default tile size and report back with the results.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jul 2012 16:07:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779356#M1395</guid>
      <dc:creator>Bob_Davies</dc:creator>
      <dc:date>2012-07-25T16:07:00Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779357#M1396</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1343263481734="60" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=570861" href="https://community.intel.com/en-us/profile/570861/" class="basic"&gt;caosun&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;I am trying to use TBB and IPP together to gain speed performance.&lt;BR /&gt;...&lt;BR /&gt;Could you please help me?&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;BR /&gt;Hi, I could look at the problem. Here are two questions:&lt;BR /&gt;&lt;BR /&gt; Could you post a small test case?&lt;BR /&gt; What is your TBB version?&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jul 2012 00:44:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779357#M1396</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-07-26T00:44:21Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779358#M1397</link>
      <description>&lt;P&gt;Hello, &lt;/P&gt;&lt;P&gt;You can read about "state structures" in IPP here, under "state structures that are modified during operation":&lt;BR /&gt;&lt;A href="http://software.intel.com/sites/products/documentation/hpc/ipp/ippi/ippi_ch2/ch2_function_context_structures.html"&gt;http://software.intel.com/sites/products/documentation/hpc/ipp/ippi/ippi_ch2/ch2_function_context_structures.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So each thread should have its own state structure. From the code you posted here, it looks like "m_stateP" is shared by multiple tasks, which may produce incorrect results. &lt;/P&gt;&lt;P&gt;Thanks,&lt;BR /&gt;chao&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jul 2012 00:51:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779358#M1397</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2012-07-26T00:51:46Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779359#M1398</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1343269183078="60" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=21699" href="https://community.intel.com/en-us/profile/21699/" class="basic"&gt;Chao Y (Intel)&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;...so, each threading should has its own status structures. From the code you post here, it looks the "m_stateP" is shared by multiple tasks, which may create incorrect result.&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;BR /&gt;Hi Chao, that means an array of filter states has to be used instead, like:&lt;BR /&gt;&lt;BR /&gt; ...&lt;BR /&gt; &lt;STRONG&gt;IppsFIRState_32f&lt;/STRONG&gt; *pState[ &lt;STRONG&gt;&amp;lt;NUM_OF_THREADS&amp;gt;&lt;/STRONG&gt; ];&lt;BR /&gt; ...&lt;BR /&gt;&lt;BR /&gt;Since for every TBB thread you can get a thread ID, or a similar unique ID, it is possible to map the &lt;STRONG&gt;pState&lt;/STRONG&gt; variables to&lt;BR /&gt;the proper processing thread.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jul 2012 02:26:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779359#M1398</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-07-26T02:26:56Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779360#M1399</link>
      <description>&lt;DIV&gt;Hi chao:&lt;/DIV&gt;&lt;DIV&gt;  Thank you for your answer. You are right!&lt;/DIV&gt;&lt;DIV&gt;Another question: for floating-point functions, I did not see this type of FIR: complex input data with real filter coefficients. This is very common in many applications.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Best Regards,&lt;/DIV&gt;&lt;DIV&gt;Sun Cao&lt;/DIV&gt;</description>
      <pubDate>Fri, 27 Jul 2012 08:08:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779360#M1399</guid>
      <dc:creator>caosun</dc:creator>
      <dc:date>2012-07-27T08:08:33Z</dc:date>
    </item>
    <item>
      <title>Using Intel IPP with TBB</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779361#M1400</link>
      <description>&lt;P&gt;Thanks for letting us know. For the FIR function, I think you can just use the complex-coefficient function as a workaround (with the imaginary part of each coefficient set to zero). &lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Fri, 03 Aug 2012 06:34:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Using-Intel-IPP-with-TBB/m-p/779361#M1400</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2012-08-03T06:34:01Z</dc:date>
    </item>
  </channel>
</rss>

