<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: IPP H264 performance in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955077#M18988</link>
    <description>I've run the tests. Nothing new. See my results in the attachment. You can find similar results in tools/perfsys/data. For instance, compare the worst-case horizontal quarter-pixel interpolation for luma with the regular 8x8 IDCT.&lt;BR /&gt;&lt;BR /&gt;ps_ippvcpx.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,35,px,0.719&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,60,e,4.84&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,54,e,4.34&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,41,e,3.32&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,55,e,4.45&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvca6.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,11,px,0.236&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,57,e,4.58&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,57,e,4.58&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,3.45&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,53,e,4.25&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvcw7.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=32,10,px,0.214&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,61,e,4.89&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,51,e,4.12&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,3.43&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,52,e,4.25&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvct7.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 1x2128 MHz, L1=8/12K, L2=1024K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=32,12,px,0.365&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,25,e,3.03&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,41,e,5.03&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,5.08&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,41,e,5.03&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;my box:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x2594 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,10,px,0.256&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,61,e,6.04&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,51,e,5.13&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,4.24&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,52,e,5.2&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;To my mind, these results are unambiguous.</description>
    <pubDate>Tue, 15 Jun 2004 16:19:36 GMT</pubDate>
    <dc:creator>peter2</dc:creator>
    <dc:date>2004-06-15T16:19:36Z</dc:date>
    <item>
      <title>IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955073#M18984</link>
      <description>Hi!&lt;BR /&gt;&lt;BR /&gt;I've evaluated the H264 routines from the IPP 4.0 trial on a PC and found that performance doesn't improve compared to my C code. Since I used the merged libs and called the w7_ and a6_ routines directly, as well as px_, this basically means that all three contain equivalent code.&lt;BR /&gt;So, my questions:&lt;BR /&gt;1. I'm sure that optimized H264 routines will be available very soon for all platforms. Could you give me any hint as to when?&lt;BR /&gt;2. Will these routines be available for regular XScale and WMMX?&lt;BR /&gt;3. Is there any way to participate in pre-release code testing?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance!&lt;BR /&gt;&lt;BR /&gt;Peter</description>
      <pubDate>Tue, 08 Jun 2004 13:32:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955073#M18984</guid>
      <dc:creator>peter2</dc:creator>
      <dc:date>2004-06-08T13:32:32Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955074#M18985</link>
      <description>Hi, Peter,&lt;BR /&gt;&lt;BR /&gt;
&lt;DIV&gt;To get general H.264 performance data, you can run the Intel IPP performance benchmark tool "perfsys", located in the directory ipp40\tools\perfsys. Run ps_ippvc.exe to get the H.264 performance data on your target system.&lt;BR /&gt;&lt;BR /&gt;We will consider H.264 support for Intel XScale and WMMX in future releases as well; you may periodically check our web site at &lt;A href="http://www.intel.com/software/products/ipp" target="_blank"&gt;http://www.intel.com/software/products/ipp&lt;/A&gt; for updates.&lt;BR /&gt;&lt;BR /&gt;If you are interested in participating in the pre-release test, please submit a request under Intel IPP products via Intel Premier Support.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ying S&lt;BR /&gt;Intel Corp. &lt;BR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 15 Jun 2004 12:35:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955074#M18985</guid>
      <dc:creator>Ying_S_Intel</dc:creator>
      <dc:date>2004-06-15T12:35:33Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955075#M18986</link>
      <description>Thanks! I'll put in a request.</description>
      <pubDate>Tue, 15 Jun 2004 13:15:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955075#M18986</guid>
      <dc:creator>peter2</dc:creator>
      <dc:date>2004-06-15T13:15:14Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955076#M18987</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;If you do run this test, it would be great if you could share the results here when you have time.&lt;/P&gt;
&lt;P&gt;Thanks a lot&lt;/P&gt;
&lt;P&gt;Marc&lt;/P&gt;
&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 15 Jun 2004 14:13:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955076#M18987</guid>
      <dc:creator>marc_ba</dc:creator>
      <dc:date>2004-06-15T14:13:00Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955077#M18988</link>
      <description>I've run the tests. Nothing new. See my results in the attachment. You can find similar results in tools/perfsys/data. For instance, compare the worst-case horizontal quarter-pixel interpolation for luma with the regular 8x8 IDCT.&lt;BR /&gt;&lt;BR /&gt;ps_ippvcpx.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,35,px,0.719&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,60,e,4.84&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,54,e,4.34&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,41,e,3.32&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,55,e,4.45&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvca6.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,11,px,0.236&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,57,e,4.58&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,57,e,4.58&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,3.45&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,53,e,4.25&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvcw7.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x3192 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=32,10,px,0.214&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,61,e,4.89&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,51,e,4.12&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,3.43&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,52,e,4.25&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;ps_ippvct7.csv:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 1x2128 MHz, L1=8/12K, L2=1024K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=32,12,px,0.365&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,25,e,3.03&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,41,e,5.03&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,5.08&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,41,e,5.03&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;my box:&lt;BR /&gt;CPU,Intel Pentium 4 Processor HT 2x2594 MHz, L1=8/12K, L2=512K&lt;BR /&gt;...&lt;BR /&gt;ippiDCTInv_8x8,16s8u,-,8x8,-,-,-,-,-,nLps=16,10,px,0.256&lt;BR /&gt;...&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,0,-,-,nLps=16,61,e,6.04&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,1,-,-,nLps=16,51,e,5.13&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,2,-,-,nLps=16,42,e,4.24&lt;BR /&gt;ippiInterpolateLuma_H264,8u,C1R,16,16,3,3,-,-,nLps=16,52,e,5.2&lt;BR /&gt;-----------------------&lt;BR /&gt;&lt;BR /&gt;To my mind, these results are unambiguous.</description>
      <pubDate>Tue, 15 Jun 2004 16:19:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955077#M18988</guid>
      <dc:creator>peter2</dc:creator>
      <dc:date>2004-06-15T16:19:36Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955078#M18989</link>
      <description>&lt;P&gt;You can see that the performance of the ippiDCT function improves on the latest architectures. That is because this function was tightly hand-optimized at the assembly level. Yes, you are right, ippiInterpolateLuma_H264 does not show a performance gain. That is because this function was initially optimized in C code; we are now working on optimizing it at the assembly level. You will see improved performance in the next version of the libraries.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt; Vladimir&lt;/P&gt;
&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 15 Jun 2004 18:09:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955078#M18989</guid>
      <dc:creator>Vladimir_Dudnik</dc:creator>
      <dc:date>2004-06-15T18:09:08Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955079#M18990</link>
      <description>I've tried the 4.1 beta and I can confirm that luma and chroma interpolation use MMX Ext and SSE2, as does deblocking. Dequant was not MMX-enhanced. I haven't tried the 4.1 release for x86 yet, but the 4.1 release for XScale doesn't differ much from the 4.1 beta for XScale. At least I haven't noticed any differences in the H264 part.</description>
      <pubDate>Tue, 12 Oct 2004 13:26:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955079#M18990</guid>
      <dc:creator>peter2</dc:creator>
      <dc:date>2004-10-12T13:26:21Z</dc:date>
    </item>
    <item>
      <title>Re: IPP H264 performance</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955080#M18991</link>
      <description>&lt;DIV&gt;Hi,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;You are right, this function has 15 different branches inside. Each branch has its own special conditions and was optimized separately. So we are still working on some of the branches, and we hope to improve this function in the future.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Regards,&lt;/DIV&gt;
&lt;DIV&gt; Vladimir&lt;/DIV&gt;</description>
      <pubDate>Tue, 12 Oct 2004 15:29:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/IPP-H264-performance/m-p/955080#M18991</guid>
      <dc:creator>Vladimir_Dudnik</dc:creator>
      <dc:date>2004-10-12T15:29:25Z</dc:date>
    </item>
  </channel>
</rss>

