<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic &gt;&gt;&gt; An MKL optimized for AVX in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000470#M18478</link>
    <description>&lt;P&gt;&amp;nbsp;&amp;gt;&amp;gt;&amp;gt; An MKL optimized for AVX could achieve double the performance per thread of the older CPU.&amp;gt;&amp;gt;&amp;gt;&lt;/P&gt;

&lt;P&gt;I forgot to mention this in my post.&lt;/P&gt;</description>
    <pubDate>Wed, 16 Jul 2014 14:23:09 GMT</pubDate>
    <dc:creator>Bernard</dc:creator>
    <dc:date>2014-07-16T14:23:09Z</dc:date>
    <item>
      <title>Xeon w3680 runs slower than i5-3320M</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000466#M18474</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have a strange problem where the same program runs slower on an i5-3320M than on a Xeon W3680. The main part of the code is FFT. Could anyone shine some light on me? Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 14 Jul 2014 20:35:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000466#M18474</guid>
      <dc:creator>Bo_Q_</dc:creator>
      <dc:date>2014-07-14T20:35:15Z</dc:date>
    </item>
    <item>
      <title>Take a look at the comparison</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000467#M18475</link>
      <description>&lt;P&gt;Take a look at the comparison of these 2 processors here: &lt;A href="http://ark.intel.com/compare/47917,64896" target="_blank"&gt;http://ark.intel.com/compare/47917,64896&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;The i5-3320M is from a much newer architecture generation than the Xeon W3680. However, the Xeon W3680 has 6 CPU cores while the i5-3320M has only 2, and its cache is 4x the size of the i5-3320M's. It shouldn't be a surprise that the older server CPU can outperform the newer mobile CPU for some workloads.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Jul 2014 17:20:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000467#M18475</guid>
      <dc:creator>Zhang_Z_Intel</dc:creator>
      <dc:date>2014-07-15T17:20:40Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;&gt;The main part of the code</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000468#M18476</link>
      <description>&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;The main part of the code is FFT. Could anyone shine some light on me? Thanks!&amp;gt;&amp;gt;&amp;gt;&lt;/P&gt;

&lt;P&gt;As @Zhang Z hinted, the older Xeon probably makes better use of its available cores (more execution units) and larger cache.&lt;/P&gt;

&lt;P&gt;By the way, @Bo Q, your thread's title states that the Xeon runs slower than the i5, while your post describes the opposite.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Jul 2014 09:35:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000468#M18476</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2014-07-16T09:35:10Z</dc:date>
    </item>
    <item>
      <title>I've spent a fair amount of</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000469#M18477</link>
      <description>&lt;P&gt;I've spent a fair amount of time posting about why the new 2-core CPU might match the performance of the old 6-core, but my posts all vanished.&lt;/P&gt;

&lt;P&gt;1.&amp;nbsp; An MKL optimized for AVX could achieve double the performance per thread of the older CPU.&lt;/P&gt;

&lt;P&gt;2.&amp;nbsp; The older CPU may be as dependent (or more so) on 32-byte data alignment, particularly as you are probably attempting more threads.&lt;/P&gt;

&lt;P&gt;3.&amp;nbsp; The 6-core Westmere is likely to be more dependent on optimizing NUM_THREADS and affinity.&amp;nbsp; Although MKL should automatically attempt to use 1 thread per core in spite of HyperThreading, I haven't seen any software that deals automatically with the asymmetric arrangement of the Westmere 6-core.&amp;nbsp; Under the usual BIOS arrangement, where cores 0 and 1 share cache access, as do cores 2 and 3, while cores 4 and 5 don't share paths to cache, try something such as:&lt;/P&gt;

&lt;P&gt;set OMP_NUM_THREADS=4&lt;/P&gt;

&lt;P&gt;set KMP_AFFINITY="proclist=[3,7,9,11],explicit,verbose"&lt;/P&gt;

&lt;P&gt;(if you disabled HT, [1,3,4,5] would use the same cores as the above settings for HT enabled)&lt;/P&gt;

&lt;P&gt;verbose is to get the confirmation of affinity settings echoed to the screen.&lt;/P&gt;
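
&lt;P&gt;As a rough sketch of how the two proclists above relate, the Python snippet below builds the same settings from a list of physical core indices. It assumes the numbering implied by the lists above, where core k shows up as logical processors 2k and 2k+1 when HT is enabled and the odd one is chosen. The helper names kmp_affinity_for_cores and affinity_env are made up for illustration and are not part of MKL or the OpenMP runtime.&lt;/P&gt;

&lt;PRE&gt;
```python
def kmp_affinity_for_cores(cores, hyperthreading=True, verbose=True):
    """Build a KMP_AFFINITY value that pins one thread to each listed core.

    Assumption (taken from the example lists in this post, not from any
    official numbering scheme): with HT enabled, physical core k appears
    as logical processors 2k and 2k+1, and we pick 2k+1.
    """
    procs = [2 * c + 1 if hyperthreading else c for c in cores]
    proclist = ",".join(str(p) for p in procs)
    parts = ["proclist=[" + proclist + "]", "explicit"]
    if verbose:
        # "verbose" asks the runtime to echo the bindings it applied.
        parts.append("verbose")
    return ",".join(parts)


def affinity_env(cores, hyperthreading=True):
    """Return the two environment variables suggested above as a dict."""
    return {
        "OMP_NUM_THREADS": str(len(cores)),
        "KMP_AFFINITY": kmp_affinity_for_cores(cores, hyperthreading),
    }


# Cores 1, 3, 4 and 5: one core from each cache-sharing pair, plus 4 and 5.
env = affinity_env([1, 3, 4, 5])
print(env)
```
&lt;/PRE&gt;

&lt;P&gt;The resulting dictionary could be merged into the environment of a child process, for example with subprocess.run(cmd, env=dict(os.environ, **env)). Check the verbose output to confirm that the numbering assumption holds on your machine.&lt;/P&gt;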

&lt;P&gt;By tacking on 2 additional cores, the WSM 6-core typically gains more than 20% performance over the equivalent 4-core CPU at the same clock speed, provided that the affinity requirements are observed.&amp;nbsp; You would have to read the ads carefully to see that a 50% gain isn't expected even for applications with good threaded scaling.&amp;nbsp; Many of the customers who took the trouble to understand the situation but didn't want to deal with the special affinities chose to buy the 4-core model.&lt;/P&gt;

&lt;P&gt;Among the advantages of the newer CPUs are less arcane affinity requirements (although you might not feel that way about Intel(r) Xeon Phi(tm)).&lt;/P&gt;

&lt;P&gt;Perhaps you didn't find the web search references edifying, but you didn't even mention what you didn't understand.&amp;nbsp; I noticed that Google doesn't return nearly as many references as Bing or Yahoo.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Jul 2014 12:10:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000469#M18477</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2014-07-16T12:10:01Z</dc:date>
    </item>
    <item>
      <title> &gt;&gt;&gt; An MKL optimized for AVX</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000470#M18478</link>
      <description>&lt;P&gt;&amp;nbsp;&amp;gt;&amp;gt;&amp;gt; An MKL optimized for AVX could achieve double the performance per thread of the older CPU.&amp;gt;&amp;gt;&amp;gt;&lt;/P&gt;

&lt;P&gt;I forgot to mention this in my post.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Jul 2014 14:23:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000470#M18478</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2014-07-16T14:23:09Z</dc:date>
    </item>
    <item>
      <title>Thanks all replies are very</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000471#M18479</link>
      <description>&lt;P&gt;Thanks, all the replies are very helpful! Also, I found that setting the /QxHost option seems to help as well.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Jul 2014 16:43:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000471#M18479</guid>
      <dc:creator>Bo_Q_</dc:creator>
      <dc:date>2014-07-16T16:43:08Z</dc:date>
    </item>
    <item>
      <title>On the Westmere, /QxHost was</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000472#M18480</link>
      <description>&lt;P&gt;On the Westmere, /QxHost was not always as good as /arch:SSE4.1, but these options will not influence MKL.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Jul 2014 17:03:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Xeon-w3680-runs-slower-than-i5-3320M/m-p/1000472#M18480</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2014-07-16T17:03:49Z</dc:date>
    </item>
  </channel>
</rss>

