<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic I check the performance of in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964771#M20040</link>
    <description>&lt;P&gt;I checked the performance of the sort function on an 8 MB array and obtained:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;for IPP8-0 : 730CPU&lt;/LI&gt;
&lt;LI&gt;for IPP7-1 : 685CPU&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These two times are always different.&lt;/P&gt;</description>
    <pubDate>Thu, 01 Aug 2013 09:33:04 GMT</pubDate>
    <dc:creator>P_v_</dc:creator>
    <dc:date>2013-08-01T09:33:04Z</dc:date>
    <item>
      <title>Decrease performance of a sort function with IPP 8.0</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964767#M20036</link>
      <description>&lt;P&gt;Hello!!&lt;/P&gt;
&lt;P&gt;I recently moved from IPP7.1 to IPP8.0 and compared the performance of the two versions.&lt;/P&gt;
&lt;P&gt;With IPP8.0, and only for the sort function “ippsSortRadixIndexAscend_32f”, I note that the performance is worse than with IPP7.1.&lt;/P&gt;
&lt;P&gt;I obtain:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;55 CPE for IPP7.1&lt;/LI&gt;
&lt;LI&gt;78 CPE for IPP8.0&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;(The vectors consist of 1024 random samples, and I ran 40,000 executions to obtain these averages. I used the function ippGetCpuClocks() to read the number of CPU clocks. My processor is an Intel Dual Core E5400, 2.70 GHz.)&lt;/P&gt;
&lt;P&gt;Do you have an explanation?&lt;/P&gt;
&lt;P&gt;Thank you,&lt;/P&gt;
&lt;P&gt;Pierre&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 09:17:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964767#M20036</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-07-31T09:17:01Z</dc:date>
    </item>
    <item>
      <title>First you should verify which</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964768#M20037</link>
      <description>&lt;P&gt;First you should verify which CPU/Library was used in the IPP 7 case as well as for the IPP 8 case.&lt;/P&gt;
&lt;P&gt;Possibly, your IPP 7 code selects the best CPU/Library for your E5400, but your IPP 8 code does not.&lt;/P&gt;
&lt;P&gt;If you link to the dynamic IPP DLLs, you could use, for instance, Sysinternals Process Explorer to see which IPP CPU/Library DLL was loaded.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 10:54:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964768#M20037</guid>
      <dc:creator>Thomas_Jensen1</dc:creator>
      <dc:date>2013-07-31T10:54:15Z</dc:date>
    </item>
    <item>
      <title>Thanks for your response.</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964769#M20038</link>
      <description>&lt;P&gt;Thanks for your response.&lt;/P&gt;
&lt;P&gt;The CPU/library is the same for the two versions (ippsv8-[version number].dll).&lt;/P&gt;
&lt;P&gt;I studied the influence of the vector size. The gap (in number of CPU clocks) between the two IPP versions is always the same, whatever the size of the input vector (as if “nop” functions had been added to the sort function “ippsSortRadixIndexAscend_32f” in the IPP8.0 version).&lt;/P&gt;
&lt;P&gt;I ran the same program on another processor (Intel Core i3-2100, 3.091 GHz). On this processor, the performance of the sort function is the same for the IPP8-0 and IPP7-1 versions.&lt;/P&gt;
&lt;P&gt;Is the Dual Core E5400 processor perhaps deprecated in the new version of IPP?&lt;/P&gt;
&lt;P&gt;Pierre&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 13:32:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964769#M20038</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-07-31T13:32:04Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;...The vectors are</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964770#M20039</link>
      <description>&amp;gt;&amp;gt;...The vectors consist of 1024 random samples, and I ran 40,000 executions to obtain these averages..

Could you verify performance of &lt;STRONG&gt;ippsSortRadixIndexAscend_32f&lt;/STRONG&gt; functions ( v7.x and v8.x ) on an &lt;STRONG&gt;8MB&lt;/STRONG&gt; array with &lt;STRONG&gt;1,024&lt;/STRONG&gt; executions?</description>
      <pubDate>Wed, 31 Jul 2013 14:12:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964770#M20039</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-07-31T14:12:29Z</dc:date>
    </item>
    <item>
      <title>I check the performance of</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964771#M20040</link>
      <description>&lt;P&gt;I checked the performance of the sort function on an 8 MB array and obtained:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;for IPP8-0 : 730CPU&lt;/LI&gt;
&lt;LI&gt;for IPP7-1 : 685CPU&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These two times are always different.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2013 09:33:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964771#M20040</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-08-01T09:33:04Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;&gt;&gt;...The vectors are</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964772#M20041</link>
      <description>&amp;gt;&amp;gt;&amp;gt;&amp;gt;...The vectors consist of 1024 random samples...
&amp;gt;&amp;gt;...
&amp;gt;&amp;gt;...These two times are always different

Try to make your tests deterministic. That means: pre-generate an array of numbers and then use it to measure the performance of both functions for all testing iterations. Many sorting algorithms, by design, take different amounts of time to process different data sets ( not to be confused with the asymptotic complexity of a sorting algorithm ).
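
A minimal sketch of that setup, in Python for brevity rather than against IPP itself. The seed value, the helper names make_fixed_input and time_sort, and the use of the built-in sorted as a stand-in for the function under test are all assumptions made for illustration.

```python
import random
import time


def make_fixed_input(n, seed=12345):
    # Hypothetical helper: a seeded generator yields the same values on every run.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def time_sort(sort_fn, data, iterations):
    # Hypothetical helper: time sort_fn on a fresh copy of the same data each iteration.
    timings = []
    for _ in range(iterations):
        work = list(data)  # copy made outside the timed region
        start = time.perf_counter()
        sort_fn(work)
        timings.append(time.perf_counter() - start)
    return timings


data = make_fixed_input(1024)  # generated once, reused for every iteration
first_run = time_sort(sorted, data, 200)
second_run = time_sort(sorted, data, 200)
print("fastest, run 1:", min(first_run))
print("fastest, run 2:", min(second_run))
```

Because the input is rebuilt from the same seed, two test runs operate on identical data, so any remaining difference comes from the code being measured or from the system, not from the data.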

Overall, you should have reproducible measurements between tests, and performance numbers should not differ by more than +/-( 0.5% - 1.0% ).</description>
      <pubDate>Thu, 01 Aug 2013 13:03:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964772#M20041</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-08-01T13:03:19Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;I check the performance of</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964773#M20042</link>
      <description>&amp;gt;&amp;gt;I checked the performance of the sort function on an 8 MB array and obtained
&amp;gt;&amp;gt;
&amp;gt;&amp;gt;- for IPP8-0 : 730CPU
&amp;gt;&amp;gt;- for IPP7-1 : 685CPU

In that test with random numbers, &lt;STRONG&gt;ippsSortRadixIndexAscend_32f&lt;/STRONG&gt; in IPP8-0 is &lt;STRONG&gt;~6.2%&lt;/STRONG&gt; slower than in IPP7-1. Please repeat the tests with the same numbers in the array, as I already described.</description>
      <pubDate>Thu, 01 Aug 2013 13:10:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964773#M20042</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-08-01T13:10:03Z</dc:date>
    </item>
    <item>
      <title>Hi Pierre,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964774#M20043</link>
      <description>&lt;P&gt;Hi Pierre,&lt;/P&gt;
&lt;P&gt;"average" is not the right metric for performance measurements - please check "min" instead. "Average" includes DLL load time and some other OS activity. I've checked both IPP versions with IPP PS (perf system, available in the package) for single-threaded static libs - I don't observe any degradation. A reproducer would be appreciated for more detailed analysis.&lt;/P&gt;
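&lt;P&gt;A small sketch of why the two statistics differ, in Python for brevity. The clock counts below are hypothetical values chosen for illustration, not measurements from this thread; the first sample stands in for a call that also pays a one-time cost such as DLL loading.&lt;/P&gt;

```python
import statistics

# Hypothetical per-call clock counts (illustration only, not measured data).
samples = [900, 56, 55, 57, 55, 120, 55, 56]

average = statistics.mean(samples)  # pulled up by the two slow outliers
fastest = min(samples)              # unaffected by the outliers

print("average:", average)
print("min:", fastest)
```

&lt;P&gt;With these made-up numbers the average is 169.25 while the minimum is 55, so the minimum stays close to the cost of the function itself.&lt;/P&gt;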
&lt;P&gt;regards, Igor&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2013 13:59:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964774#M20043</guid>
      <dc:creator>Igor_A_Intel</dc:creator>
      <dc:date>2013-08-01T13:59:09Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;&gt;pre-generate an array of</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964775#M20044</link>
      <description>&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;pre-generate an array of numbers&lt;/P&gt;
&lt;P&gt;I tried pre-generated vectors of 1024 samples stored in a binary file.&lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;...for single threaded static libs - I don't observe any degradation&lt;/P&gt;
&lt;P&gt;My previous test used dynamic linking.&lt;/P&gt;
&lt;P&gt;I tested with the single-threaded static libs and I &lt;STRONG&gt;don't observe degradation&lt;/STRONG&gt; for either "average" or "min".&lt;/P&gt;
&lt;P&gt;Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2013 15:33:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964775#M20044</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-08-01T15:33:59Z</dc:date>
    </item>
    <item>
      <title>Pierre,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964776#M20045</link>
      <description>&lt;P&gt;Pierre,&lt;/P&gt;
&lt;P&gt;I think I know the root of this issue: you have probably linked with the dynamic libs installed by default. For 8.0, the default installation contains only single-threaded dynamic libraries (for multi-threaded ones you should check one more checkbox in the thin-client install), while 7.x has only multi-threaded DLLs. This functionality is internally threaded.&lt;/P&gt;
&lt;P&gt;regards, Igor&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2013 16:44:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964776#M20045</guid>
      <dc:creator>Igor_A_Intel</dc:creator>
      <dc:date>2013-08-01T16:44:22Z</dc:date>
    </item>
    <item>
      <title>It is very easy to verify how</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964777#M20046</link>
      <description>It is very easy to verify how many threads were created for tests with the function ( in IPP version 7 and 8 ). Just take a look at Windows Task Manager ( Processes property page ). In the case of IPP version 7, try setting the number of threads to 1, repeat the tests, and please post the results. Thanks.</description>
      <pubDate>Fri, 02 Aug 2013 03:54:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964777#M20046</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-08-02T03:54:00Z</dc:date>
    </item>
    <item>
      <title>I have verified the number of</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964778#M20047</link>
      <description>&lt;P&gt;I have verified the number of threads and I get:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;2 threads for IPP7-1 with dynamic linkage and 1 for static linkage&lt;/LI&gt;
&lt;LI&gt;1 thread for IPP8-0 with dynamic linkage and 1 for static linkage&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For IPP7-1 with dynamic linkage, I set the number of threads to 1 with the function &lt;EM&gt;ippSetNumThreads&lt;/EM&gt;.&lt;/P&gt;
&lt;P&gt;I again observe a significant difference between the two versions for both the average CPU and the min CPU (around 25% difference for each).&lt;/P&gt;
&lt;P&gt;Pierre&lt;/P&gt;</description>
      <pubDate>Fri, 02 Aug 2013 08:54:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964778#M20047</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-08-02T08:54:55Z</dc:date>
    </item>
    <item>
      <title>Pierre,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964779#M20048</link>
      <description>&lt;P&gt;Pierre,&lt;/P&gt;
&lt;P&gt;IPP PS (perf system) doesn't show any difference, so could you attach your measurement program? I need a reproducer to understand and analyse the issue.&lt;/P&gt;
&lt;P&gt;regards, Igor&lt;/P&gt;</description>
      <pubDate>Fri, 02 Aug 2013 10:41:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964779#M20048</guid>
      <dc:creator>Igor_A_Intel</dc:creator>
      <dc:date>2013-08-02T10:41:56Z</dc:date>
    </item>
    <item>
      <title>I have tested the sort</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964780#M20049</link>
      <description>&lt;P&gt;I have tested the sort function "ippsSortRadixIndexAscend_32f" with Perf System.&lt;/P&gt;
&lt;P&gt;For IPP7.1, I ran the program with the following command line: &lt;EM&gt;ps_ipps.exe -r -o -f"ippsSortRadixIndexAscend_32f" -N1&lt;/EM&gt;. The option &lt;EM&gt;-N1&lt;/EM&gt; sets the number of threads to 1 (as in IPP8.0).&lt;/P&gt;
&lt;P&gt;For IPP8.0, I ran the program with the following command line: &lt;EM&gt;ps_ipps.exe -r -o -f"ippsSortRadixIndexAscend_32f"&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;The results for IPP7-1 are&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;&lt;/EM&gt;&lt;EM&gt;CPU,Processor supporting Supplemental Streaming SIMD Extension 3 instruction set, 2x2.66 GHz, Max cache size 2048 K&lt;BR /&gt;OS,Windows 7 Professional Service Pack 1 (Win32)&lt;BR /&gt;Computer,SIC-004&lt;BR /&gt;Library,ippSP SSE2 (w7), 7.1.1 (r37466), Sep 27 2012&lt;BR /&gt;Start,Fri Aug 02 17:03:25 2013&lt;BR /&gt;function,Parm1,Parm2,Parm3,Parm4,Parm5,Parm6,Parm7,Parm8,Comment,Clocks,per,Time (usec),MFlops&lt;BR /&gt;ippsSortRadixIndexAscend,32f,-,1024,1,-,-,-,-,nLps=8,&lt;STRONG&gt;64.7&lt;/STRONG&gt;,e,24.9,-&lt;BR /&gt;ippsSortRadixIndexAscend,32f,-,1024,2,-,-,-,-,nLps=8,&lt;STRONG&gt;56.1&lt;/STRONG&gt;,e,21.6,-&lt;/EM&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The results for IPP8-0 are&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;CPU,Processor supporting Supplemental Streaming SIMD Extension 3 instruction set, 2x2.66 GHz, Max cache size 2048 K&lt;BR /&gt;OS,Windows 7 Professional Service Pack 1 (Win32)&lt;BR /&gt;Computer,SIC-004&lt;BR /&gt;Library,ippSP SSE2 (w7), 8.0.0 (r40040), May 22 2013&lt;BR /&gt;Start,Fri Aug 02 17:18:13 2013&lt;BR /&gt;function,Parm1,Parm2,Parm3,Parm4,Parm5,Parm6,Parm7,Parm8,Comment,Clocks,per,Time (usec),MFlops&lt;BR /&gt;ippsSortRadixIndexAscend,32f,-,1024,1,-,-,-,-,nLps=8,&lt;STRONG&gt;80.3&lt;/STRONG&gt;,e,30.9,-&lt;BR /&gt;ippsSortRadixIndexAscend,32f,-,1024,2,-,-,-,-,nLps=8,&lt;STRONG&gt;75.2&lt;/STRONG&gt;,e,29,-&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
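&lt;P&gt;For reference, the relative gap between the two tables above can be worked out with a short Python calculation. The row labels simply reuse the value in the Parm4 column, whose meaning is not stated in the output, and the helper name extra_clocks_percent is made up for this sketch.&lt;/P&gt;

```python
# Values copied from the Clocks column of the two Perf System tables above.
clocks = {
    "Parm4=1": {"7.1": 64.7, "8.0": 80.3},
    "Parm4=2": {"7.1": 56.1, "8.0": 75.2},
}


def extra_clocks_percent(old, new):
    # Hypothetical helper: percentage of additional clocks relative to the old value.
    return (new / old - 1.0) * 100.0


for row, c in clocks.items():
    increase = extra_clocks_percent(c["7.1"], c["8.0"])
    print(f"{row}: IPP 8.0 needs {increase:.1f}% more clocks than IPP 7.1")
```

&lt;P&gt;This gives roughly 24% and 34% more clocks for IPP8.0 on the two rows.&lt;/P&gt;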
&lt;P&gt;(I also tested Perf System for the two versions with the Copy function on 1024 Ipp32f values; the results are similar between the two versions.)&lt;/P&gt;
&lt;P&gt;I'm sorry, but I will be out of the office for the next three weeks with no internet access, so I will not be able to answer.&lt;/P&gt;
&lt;P&gt;regards&lt;/P&gt;
&lt;P&gt;Pierre&lt;/P&gt;</description>
      <pubDate>Fri, 02 Aug 2013 15:26:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Decrease-performance-of-a-sort-function-with-IPP-8-0/m-p/964780#M20049</guid>
      <dc:creator>P_v_</dc:creator>
      <dc:date>2013-08-02T15:26:00Z</dc:date>
    </item>
  </channel>
</rss>

