<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic A question to IDZ in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972008#M20648</link>
    <description>A question to IDZ Administrators / Moderators:

How can one edit the original ( 1st ) post of a newly created thread? I remember that editing was available in the past.</description>
    <pubDate>Sat, 25 May 2013 17:10:36 GMT</pubDate>
    <dc:creator>SergeyKostrov</dc:creator>
    <dc:date>2013-05-25T17:10:36Z</dc:date>
    <item>
      <title>Performance evaluation of ippsAdd_32f and ippsSub_32f vs. a simple 2-for-loop implementation with /O3 optimization</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972007#M20647</link>
      <description>&lt;P&gt;I've completed a performance evaluation of a linear algebra algorithm that uses the &lt;STRONG&gt;ippsAdd_32f&lt;/STRONG&gt; and &lt;STRONG&gt;ippsSub_32f&lt;/STRONG&gt; IPP functions vs. a simple &lt;STRONG&gt;2-for-loop&lt;/STRONG&gt; implementation ( of the same functionality in the same algorithm ) compiled with &lt;STRONG&gt;/O3&lt;/STRONG&gt; ( Intel C++ compiler ) and &lt;STRONG&gt;/O2&lt;/STRONG&gt; ( Microsoft C++ compiler ) optimizations, and the results are very interesting.&lt;/P&gt;
&lt;P&gt;In short: there was just a &lt;STRONG&gt;~0.30%&lt;/STRONG&gt; performance improvement when the IPP functions were used, which I consider negligible. I will provide test results later.&lt;/P&gt;
&lt;P&gt;Thanks, and ask questions if interested.&lt;/P&gt;
</description>
      <pubDate>Sat, 25 May 2013 17:06:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972007#M20647</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-05-25T17:06:51Z</dc:date>
    </item>
    <item>
      <title>A question to IDZ</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972008#M20648</link>
      <description>A question to IDZ Administrators / Moderators:

How can one edit the original ( 1st ) post of a newly created thread? I remember that editing was available in the past.</description>
      <pubDate>Sat, 25 May 2013 17:10:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972008#M20648</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-05-25T17:10:36Z</dc:date>
    </item>
    <item>
      <title>[ Test results when IPP library is Not Used ]</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972009#M20649</link>
      <description>&lt;STRONG&gt;[ Test results when IPP library is Not Used ]&lt;/STRONG&gt;
...
Calculating...
Add - Completed in 3.554 ms
Add - Completed in 3.395 ms
Add - Completed in 3.525 ms
Sub - Completed in 3.367 ms
Sub - Completed in 3.127 ms
Add - Completed in 3.126 ms
Sub - Completed in 3.364 ms
Add - Completed in 3.506 ms
Sub - Completed in 3.491 ms
Add - Completed in 3.441 ms
Add - Completed in 3.103 ms
Sub - Completed in 2.968 ms
Add - Completed in 3.294 ms
Add - Completed in 3.094 ms
Add - Completed in 3.114 ms
Sub - Completed in 2.777 ms
Add - Completed in 2.756 ms
Add - Completed in 3.009 ms
( Algorithm ) - Pass  1 - Completed: 75.89500 secs
Add - Completed in 3.541 ms
Add - Completed in 3.556 ms
Add - Completed in 3.526 ms
Sub - Completed in 3.384 ms
Sub - Completed in 3.143 ms
Add - Completed in 3.148 ms
Sub - Completed in 3.363 ms
Add - Completed in 3.419 ms
Sub - Completed in 3.484 ms
Add - Completed in 3.423 ms
Add - Completed in 3.124 ms
Sub - Completed in 3.084 ms
Add - Completed in 2.904 ms
Add - Completed in 3.202 ms
Add - Completed in 3.128 ms
Sub - Completed in 2.770 ms
Add - Completed in 2.779 ms
Add - Completed in 3.039 ms
( Algorithm ) - Pass  2 - Completed: 75.87800 secs
...</description>
      <pubDate>Sun, 26 May 2013 00:54:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972009#M20649</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-05-26T00:54:58Z</dc:date>
    </item>
    <item>
      <title>[ Test results when IPP library is Used ]</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972010#M20650</link>
      <description>&lt;STRONG&gt;[ Test results when IPP library is Used ]&lt;/STRONG&gt;
...
Calculating...
Add - Completed in 3.518 ms
Add - Completed in 3.401 ms
Add - Completed in 3.364 ms
Sub - Completed in 3.280 ms
Sub - Completed in 2.754 ms
Add - Completed in 2.830 ms
Sub - Completed in 3.280 ms
Add - Completed in 3.311 ms
Sub - Completed in 3.305 ms
Add - Completed in 3.062 ms
Add - Completed in 2.954 ms
Sub - Completed in 2.595 ms
Add - Completed in 2.790 ms
Add - Completed in 3.178 ms
Add - Completed in 3.177 ms
Sub - Completed in 2.726 ms
Add - Completed in 2.724 ms
Add - Completed in 2.997 ms
( Algorithm ) - Pass  1 - Completed: 75.63000 secs
Add - Completed in 3.500 ms
Add - Completed in 3.381 ms
Add - Completed in 3.443 ms
Sub - Completed in 3.256 ms
Sub - Completed in 2.773 ms
Add - Completed in 2.839 ms
Sub - Completed in 3.296 ms
Add - Completed in 3.431 ms
Sub - Completed in 3.290 ms
Add - Completed in 3.062 ms
Add - Completed in 2.955 ms
Sub - Completed in 2.594 ms
Add - Completed in 2.844 ms
Add - Completed in 3.173 ms
Add - Completed in 3.181 ms
Sub - Completed in 2.742 ms
Add - Completed in 3.123 ms
Add - Completed in 2.938 ms
( Algorithm ) - Pass  2 - Completed: 75.61300 secs
...</description>
      <pubDate>Sun, 26 May 2013 00:55:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972010#M20650</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-05-26T00:55:56Z</dc:date>
    </item>
    <item>
      <title>With reduced output details...</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972011#M20651</link>
      <description>With reduced output details...

&lt;STRONG&gt;[ A larger Data set - Test 1 - Algorithm with IPP - 0.29% faster than Test 2 ]&lt;/STRONG&gt;
...
Calculating...
Algorithm - Pass  1 - Completed: 114.35901 secs
Algorithm - Pass  2 - Completed: 114.10901 secs
Algorithm - Pass  3 - Completed: 114.07801 secs	Note: Best Time ( BT1 )
Algorithm - Pass  4 - Completed: 114.07901 secs
Algorithm - Pass  5 - Completed: 114.09301 secs
...

&lt;STRONG&gt;[ A larger Data set - Test 2 - Algorithm without IPP - 0.29% slower than Test 1 ]&lt;/STRONG&gt;
...
Calculating...
Algorithm - Pass  1 - Completed: 114.76601 secs
Algorithm - Pass  2 - Completed: 114.40601 secs	Note: Best Time ( BT2 )
Algorithm - Pass  3 - Completed: 114.40601 secs
Algorithm - Pass  4 - Completed: 114.46901 secs
Algorithm - Pass  5 - Completed: 114.42201 secs
...
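
As a quick check of the 0.29% figure, the difference between the two best times can be recomputed from the pass times listed above. The snippet below is only a minimal Python sketch; the names with_ipp, without_ipp and relative_gain_percent are hypothetical and are not part of the test application.

```python
# Pass times in seconds, copied from Test 1 and Test 2 above.
with_ipp = [114.35901, 114.10901, 114.07801, 114.07901, 114.09301]
without_ipp = [114.76601, 114.40601, 114.40601, 114.46901, 114.42201]


def relative_gain_percent(best_fast, best_slow):
    """Time saved by the faster run, as a percentage of the slower run."""
    return (best_slow - best_fast) / best_slow * 100.0


bt1 = min(with_ipp)       # Best Time ( BT1 ), algorithm with IPP
bt2 = min(without_ipp)    # Best Time ( BT2 ), algorithm without IPP
gain = relative_gain_percent(bt1, bt2)

print("BT1 = %.5f secs, BT2 = %.5f secs" % (bt1, bt2))
print("Saved %.3f secs per pass, or %.2f%% of BT2" % (bt2 - bt1, gain))
```

With these inputs the best times differ by about 0.33 secs per pass, which is roughly 0.29% of BT2.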

&lt;STRONG&gt;Hardware &amp;amp; Software details:&lt;/STRONG&gt;

Dell Precision Mobile M4700
Intel Core i7-3840QM ( Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/compare/70846 )
32GB RAM
320GB HDD
NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
Windows 7 Professional 64-bit

Size of L3 Cache = 8MB ( shared between all cores for data &amp;amp; instructions )
Size of L2 Cache = 1MB ( 256KB per core / shared for data &amp;amp; instructions )
Size of L1 Cache = 256KB ( 32KB per core for data &amp;amp; 32KB per core for instructions )</description>
      <pubDate>Sun, 26 May 2013 00:59:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Performance-evaluation-of-ippsAdd-32f-and-ippsSub-32f-vs-a/m-p/972011#M20651</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-05-26T00:59:53Z</dc:date>
    </item>
  </channel>
</rss>

