<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Integer vs unsigned integer performance in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/Integer-vs-unsigned-integer-performance/m-p/1156630#M78833</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Long post, please bear with me ;)&lt;/P&gt;

&lt;P&gt;Last weekend at a Hackathon we were discussing the performance of signed vs unsigned integers; at an HPC workshop someone mentioned to me that you should use "int" instead of "unsigned" or "size_t", as 'int' would be much faster.&amp;nbsp; Of course, nobody believed this, so I ended up cobbling together a piece of code to test it. The code is based on a (very dumb) example to determine whether a number is prime or not:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;int isPrime(MyIntegerType n)
{
  if (n &amp;lt;= 1) return 0;
  for (MyIntegerType i = 2; i &amp;lt; n; i++)
  {
    if (n % i == 0) return 0;
  }
  return 1;
}&lt;/PRE&gt;

&lt;P&gt;&lt;BR /&gt;
	I realize that this algorithm is very bad (and not even 100% correct, as one can argue whether '2' is prime or not), and I realize that a compiler needs to do bounds checking and that bounds checks differ for signed and unsigned integers. However, this is about the performance of signed vs unsigned, nothing else.&lt;/P&gt;

&lt;P&gt;With a wrapper around the above code I ended up with a (64-bit) test program that tests this using&lt;BR /&gt;
	&amp;nbsp; MyIntegerType = {'int', 'unsigned', 'long', 'unsigned long', 'size_t' }&lt;BR /&gt;
	The results are quite surprising:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;It does not really matter which compiler you use for this; gcc 4.8, gcc 7 and icc generate almost identical assembly code, but the assembly code for 'int' &lt;B class="moz-txt-star"&gt;&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;IS&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;&lt;/B&gt; different from the code generated for 'unsigned int'&lt;/LI&gt;
	&lt;LI&gt;A good old Harpertown CPU can keep up in this test with CPUs that have a higher clock speed and are 4+ years newer&lt;/LI&gt;
	&lt;LI&gt;Performance differs &lt;EM&gt;per hardware platform &lt;/EM&gt;using the same binary&lt;/LI&gt;
	&lt;LI&gt;On Ivy Bridge, Haswell and Broadwell CPUs it makes sense to use 'int' instead of 'unsigned int'&lt;/LI&gt;
	&lt;LI&gt;On &lt;STRONG&gt;KNL (+KNC)&lt;/STRONG&gt;&amp;nbsp; and Atom based CPUs it's exactly the other way round: it makes sense to use 'unsigned' instead of 'int'&lt;/LI&gt;
	&lt;LI&gt;On all others there is no significant difference between using one or the other&lt;/LI&gt;
	&lt;LI&gt;There is a &lt;B class="moz-txt-star"&gt;&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;huge&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;&lt;/B&gt; penalty for using "long" on Intel CPUs&lt;/LI&gt;
	&lt;LI&gt;Performance of the KNL box is surprisingly low, even accounting for its 1.3 or 1.5 GHz clock speed, compared to e.g. a 3.3 GHz Pentium G4400&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Attached is a spreadsheet with all the platforms it was tested on so far.&lt;/P&gt;

&lt;P&gt;Now for my questions:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;Can someone explain the difference in execution time between platforms, given that the same executable was used on all of them (hence no compiler differences)?&lt;/LI&gt;
	&lt;LI&gt;Can someone explain why there is such a huge performance penalty for 64-bit ints vs 32-bit ints? (Note that CPUs made by a certain rival company do not display this behaviour.)&lt;/LI&gt;
	&lt;LI&gt;How can a programmer know up front which type will be fastest for a given algorithm and CPU?&lt;/LI&gt;
	&lt;LI&gt;Why is performance on the KNL so bad compared to the rest?&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Source code and/or binaries for the above test results are available upon request.&lt;/P&gt;

&lt;P&gt;Thx,&amp;nbsp; JJK&lt;/P&gt;</description>
    <pubDate>Tue, 14 Nov 2017 14:59:24 GMT</pubDate>
    <dc:creator>JJK</dc:creator>
    <dc:date>2017-11-14T14:59:24Z</dc:date>
    <item>
      <title>Integer vs unsigned integer performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Integer-vs-unsigned-integer-performance/m-p/1156630#M78833</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Long post, please bear with me ;)&lt;/P&gt;

&lt;P&gt;Last weekend at a Hackathon we were discussing the performance of signed vs unsigned integers; at an HPC workshop someone mentioned to me that you should use "int" instead of "unsigned" or "size_t", as 'int' would be much faster.&amp;nbsp; Of course, nobody believed this, so I ended up cobbling together a piece of code to test it. The code is based on a (very dumb) example to determine whether a number is prime or not:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;int isPrime(MyIntegerType n)
{
  if (n &amp;lt;= 1) return 0;
  for (MyIntegerType i = 2; i &amp;lt; n; i++)
  {
    if (n % i == 0) return 0;
  }
  return 1;
}&lt;/PRE&gt;

&lt;P&gt;&lt;BR /&gt;
	I realize that this algorithm is very bad (and not even 100% correct, as one can argue whether '2' is prime or not), and I realize that a compiler needs to do bounds checking and that bounds checks differ for signed and unsigned integers. However, this is about the performance of signed vs unsigned, nothing else.&lt;/P&gt;

&lt;P&gt;With a wrapper around the above code I ended up with a (64-bit) test program that tests this using&lt;BR /&gt;
	&amp;nbsp; MyIntegerType = {'int', 'unsigned', 'long', 'unsigned long', 'size_t' }&lt;BR /&gt;
	The results are quite surprising:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;It does not really matter which compiler you use for this; gcc 4.8, gcc 7 and icc generate almost identical assembly code, but the assembly code for 'int' &lt;B class="moz-txt-star"&gt;&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;IS&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;&lt;/B&gt; different from the code generated for 'unsigned int'&lt;/LI&gt;
	&lt;LI&gt;A good old Harpertown CPU can keep up in this test with CPUs that have a higher clock speed and are 4+ years newer&lt;/LI&gt;
	&lt;LI&gt;Performance differs &lt;EM&gt;per hardware platform &lt;/EM&gt;using the same binary&lt;/LI&gt;
	&lt;LI&gt;On Ivy Bridge, Haswell and Broadwell CPUs it makes sense to use 'int' instead of 'unsigned int'&lt;/LI&gt;
	&lt;LI&gt;On &lt;STRONG&gt;KNL (+KNC)&lt;/STRONG&gt;&amp;nbsp; and Atom based CPUs it's exactly the other way round: it makes sense to use 'unsigned' instead of 'int'&lt;/LI&gt;
	&lt;LI&gt;On all others there is no significant difference between using one or the other&lt;/LI&gt;
	&lt;LI&gt;There is a &lt;B class="moz-txt-star"&gt;&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;huge&lt;SPAN class="moz-txt-tag"&gt;*&lt;/SPAN&gt;&lt;/B&gt; penalty for using "long" on Intel CPUs&lt;/LI&gt;
	&lt;LI&gt;Performance of the KNL box is surprisingly low, even accounting for its 1.3 or 1.5 GHz clock speed, compared to e.g. a 3.3 GHz Pentium G4400&lt;/LI&gt;
&lt;/OL&gt;
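
&lt;P&gt;For reference, a harness along these lines can be sketched as follows. This is my reconstruction, not the actual test program: the bound of 20000, the template wrapper and the use of std::chrono are all assumptions.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;// Hypothetical benchmark harness (a sketch, not the original wrapper).
#include &amp;lt;chrono&amp;gt;
#include &amp;lt;cstdio&amp;gt;

template &amp;lt;typename MyIntegerType&amp;gt;
int isPrime(MyIntegerType n)
{
  if (n &amp;lt;= 1) return 0;
  for (MyIntegerType i = 2; i &amp;lt; n; i++)
  {
    if (n % i == 0) return 0;
  }
  return 1;
}

// Counts primes below 'limit' and prints the elapsed wall-clock time.
template &amp;lt;typename MyIntegerType&amp;gt;
int countPrimes(const char *name, MyIntegerType limit)
{
  auto t0 = std::chrono::steady_clock::now();
  int count = 0;
  for (MyIntegerType n = 2; n &amp;lt; limit; n++)
    count += isPrime(n);
  auto t1 = std::chrono::steady_clock::now();
  std::printf("%-14s: %d primes, %.3f s\n", name, count,
              std::chrono::duration&amp;lt;double&amp;gt;(t1 - t0).count());
  return count;
}

int main()
{
  const long N = 20000;  // small bound so the sketch finishes quickly
  countPrimes("int",           (int)N);
  countPrimes("unsigned",      (unsigned)N);
  countPrimes("long",          N);
  countPrimes("unsigned long", (unsigned long)N);
  return 0;
}&lt;/PRE&gt;

&lt;P&gt;Comparing the assembly of the 'int' and 'unsigned' instantiations of isPrime (e.g. with objdump -d) shows exactly where the generated code differs.&lt;/P&gt;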

&lt;P&gt;Attached is a spreadsheet with all the platforms it was tested on so far.&lt;/P&gt;

&lt;P&gt;Now for my questions:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;Can someone explain the difference in execution time between platforms, given that the same executable was used on all of them (hence no compiler differences)?&lt;/LI&gt;
	&lt;LI&gt;Can someone explain why there is such a huge performance penalty for 64-bit ints vs 32-bit ints? (Note that CPUs made by a certain rival company do not display this behaviour.)&lt;/LI&gt;
	&lt;LI&gt;How can a programmer know up front which type will be fastest for a given algorithm and CPU?&lt;/LI&gt;
	&lt;LI&gt;Why is performance on the KNL so bad compared to the rest?&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Source code and/or binaries for the above test results are available upon request.&lt;/P&gt;

&lt;P&gt;Thx,&amp;nbsp; JJK&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2017 14:59:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Integer-vs-unsigned-integer-performance/m-p/1156630#M78833</guid>
      <dc:creator>JJK</dc:creator>
      <dc:date>2017-11-14T14:59:24Z</dc:date>
    </item>
    <item>
      <title>This: http://www.agner.org</title>
      <link>https://community.intel.com/t5/Software-Archive/Integer-vs-unsigned-integer-performance/m-p/1156631#M78834</link>
      <description>&lt;P&gt;This: &lt;A href="http://www.agner.org/optimize/instruction_tables.pdf"&gt;http://www.agner.org/optimize/instruction_tables.pdf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;may shed some light on the issue.&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2017 16:33:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Integer-vs-unsigned-integer-performance/m-p/1156631#M78834</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2017-11-14T16:33:10Z</dc:date>
    </item>
  </channel>
</rss>

