<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Intel Fortran Optimizations in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754910#M10397</link>
    <description>&lt;DIV style="margin:0px;"&gt;I will keep these in mind. Thanks everyone for your time.&lt;/DIV&gt;</description>
    <pubDate>Tue, 28 Apr 2009 05:53:19 GMT</pubDate>
    <dc:creator>fivos</dc:creator>
    <dc:date>2009-04-28T05:53:19Z</dc:date>
    <item>
      <title>Intel Fortran Optimizations</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754906#M10393</link>
      <description>Hi everyone,&lt;BR /&gt;I am using Intel Fortran Compiler 11 for a CFD algorithm and I want to make it as fast as possible, with the least impact on accuracy or stability. I have improved the algorithm as much as I could to eliminate bottlenecks, and used OpenMP to parallelize the most computationally heavy do-loops. What I am looking for is suggestions for compiler optimization flags. I have used the -fast flag, but the algorithm turned out to be a bit unstable in certain cases. On the other hand, the -O3 flag seems to work well. Apart from these, what else can I use to speed up the program? &lt;BR /&gt;The CPU on which the program will run is a Quad Core Xeon E5405, and the operating system is 64-bit Linux. I also tried using the -xsse4.1 flag, since the Xeon E54XX supports SSE4.1, but it is not recognised at all by the compiler. To be precise, I get:&lt;BR /&gt;[foivos@hpc25 test]$ ifort -xsse4.1 -openmp -O3 -oSPHo.exe SPHfast.for&lt;BR /&gt;ifort: command line warning #10130: unknown extension 's' ignored in option '-x'&lt;BR /&gt;ifort: command line warning #10130: unknown extension 's' ignored in option '-x'&lt;BR /&gt;ifort: command line warning #10130: unknown extension 'e' ignored in option '-x'&lt;BR /&gt;ifort: command line warning #10130: unknown extension '4' ignored in option '-x'&lt;BR /&gt;ifort: command line warning #10130: unknown extension '.' ignored in option '-x'&lt;BR /&gt;ifort: command line warning #10130: unknown extension '1' ignored in option '-x'&lt;BR /&gt;etc... (compilation continues)&lt;BR /&gt;Will -xT be of any help in my case (since it is intended for IA-32 applications only)?&lt;BR /&gt;&lt;BR /&gt;Any help, ideas, or suggestions are appreciated.&lt;BR /&gt;Thanks in advance.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 27 Apr 2009 11:55:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754906#M10393</guid>
      <dc:creator>fivos</dc:creator>
      <dc:date>2009-04-27T11:55:17Z</dc:date>
    </item>
    <item>
      <title>Re: Intel Fortran Optimizations</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754907#M10394</link>
      <description>Well, there are a lot of options you can play with.&lt;BR /&gt;&lt;BR /&gt;Short guide:&lt;BR /&gt;&lt;A href="http://cache-www.intel.com/cd/00/00/22/23/222300_222300.pdf" target="_blank"&gt;http://cache-www.intel.com/cd/00/00/22/23/222300_222300.pdf&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Long guide:&lt;BR /&gt;&lt;A href="http://cache-www.intel.com/cd/00/00/40/60/406091_406091.pdf" target="_blank"&gt;http://cache-www.intel.com/cd/00/00/40/60/406091_406091.pdf&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Generally, the most useful ones are: -O2, -O3, -ipo (-ip), -xhost&lt;BR /&gt;&lt;BR /&gt;These might be worth trying, too: -ftz -fno-alias -fno-fnalias -align all -IPF-fp-relaxed -funroll-all-loops&lt;BR /&gt;&lt;BR /&gt;Sergiy&lt;BR /&gt;</description>
      <pubDate>Mon, 27 Apr 2009 13:36:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754907#M10394</guid>
      <dc:creator>bubin</dc:creator>
      <dc:date>2009-04-27T13:36:20Z</dc:date>
    </item>
    <item>
      <title>Re: Intel Fortran Optimizations</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754908#M10395</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
Spell it as -xSSE4.1. Case matters. If you want to use the older version, -xS would be the equivalent. -xT is SSSE3.&lt;BR /&gt;</description>
      <pubDate>Mon, 27 Apr 2009 13:38:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754908#M10395</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2009-04-27T13:38:26Z</dc:date>
    </item>
    <item>
      <title>Re: Intel Fortran Optimizations</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754909#M10396</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/336209"&gt;Steve Lionel (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; Spell it as -xSSE4.1. Case matters. If you want to use the older version, -xS would be the equivalent. -xT is SSSE3.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
-msse4, -msse4.1, and -msse3 worked when I tried them. I guess the difference between them and the -x versions is that the latter should give you a screen message when quitting on account of an unrecognized CPU type. Don't count on it, please.&lt;BR /&gt;I haven't seen a CFD code which depended on complex arithmetic, so there won't necessarily be an advantage in changing from the default (-msse2) to -msse3 or -xSSSE3. Depending on coding practices, SSE4.1 may have an advantage.&lt;BR /&gt;The choice of CPU architecture option is not tied to your choice to run 32-bit. On the other hand, people generally use 64-bit mode for CFD applications; only very small jobs could normally run faster in 32-bit mode.&lt;BR /&gt;Those interested in stability would normally set -prec-div -prec-sqrt -assume protect_parens, unless a performance loss can be associated with one of those options. This is partly a coding practices question as well. For example, if the programmer has the habit of writing /2 rather than *0.5 or *(1/2.), -prec-div may cost performance. &lt;BR /&gt;-prec-div is quite literal about not allowing division to be replaced by multiplication; it does not distinguish between those cases where the result can't change and those where the substitution is risky. The no-prec-div option is the default presumably for historical reasons, as some past Intel CPUs didn't have competitive division performance.&lt;BR /&gt;-assume protect_parens requires the compiler to follow the Fortran standard on parentheses. In correctly written code, this may improve performance.&lt;BR /&gt;</description>
      <pubDate>Mon, 27 Apr 2009 13:43:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754909#M10396</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-04-27T13:43:00Z</dc:date>
    </item>
    <item>
      <title>Re: Intel Fortran Optimizations</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754910#M10397</link>
      <description>&lt;DIV style="margin:0px;"&gt;I will keep these in mind. Thanks everyone for your time.&lt;/DIV&gt;</description>
      <pubDate>Tue, 28 Apr 2009 05:53:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Intel-Fortran-Optimizations/m-p/754910#M10397</guid>
      <dc:creator>fivos</dc:creator>
      <dc:date>2009-04-28T05:53:19Z</dc:date>
    </item>
  </channel>
</rss>

