Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Hesitant to renew IVF support service

Greynolds__Alan
Beginner
I'm hesitant to renew my IVF support service for several reasons:

1. I'm still using version 11 for my day-to-day work because of some weird optimization problems in version 12 that cause erroneous code to be generated when vectorization is involved. Unfortunately, I've never been able to isolate the problems in small test cases, and the full code is large and proprietary.
2. My main application is essentially standard Fortran-95 and OpenMP, so I don't need all the new Fortran-200X stuff.
3. I'm evaluating the PGI compiler because of its support for GPU programming via the tentative OpenACC standard. It doesn't appear that Intel is going to support this in the near future.

I would renew if I'm wrong about 3 and OpenACC GPU programming is coming within the next year.

Al Greynolds
www.ruda.com
TimP
Honored Contributor III
OpenACC is a partial interim standard to cover only specific types of GPU. Intel is working on the committee to merge those facilities into future OpenMP. Needless to say, "within a year" is too short a period to expect that follow-on standard implementation.
Greynolds__Alan
Beginner

So far my evaluation of the PGI compiler has been a bust. Its performance on my multi-threaded (OpenMP) engineering application is very poor. On that same subject, that leads to another reason I might not be renewing my Intel support:

4. Running on both my SSE 4.1 and SSE 4.2 capable machines, my application went from being 25% slower using Gfortran 4.5 than IVF 11.1 to 10% faster using Gfortran 4.7 than IVF 11.1 or 12.1 (in all cases I played with optimization and code generation settings to get maximum performance).

Al Greynolds
www.ruda.com

Steven_L_Intel1
Employee
Al, can you provide us the program you used to compare? While gfortran continues to improve, our own testing as well as independent tests show it to be, in general, considerably slower than Intel Fortran.
bmchenry
New Contributor II
A good place to start for comparisons of relative speeds of compilers is:
http://www.polyhedron.com/compare0html
from that link the related pages indicate GFortran was only faster than INTEL on two tests (AC & TFFT2) and only on the AMD processor.
Differences were 9.17 v 9.73 (AC) and 133.76 v 135.2 (TFFT2)
Look at the test results for everything else and INTEL has the green GO light!
I find my own tests are consistent.
So with that in mind, I'd love to see what you are doing to produce your stated results.

Greynolds__Alan
Beginner
My real-world tests are specifically for multi-threaded OpenMP performance, which the Polyhedron benchmarks do not cover. They are also using "old" versions of Gfortran. BTW, I too am surprised by my most recent runs that show Gfortran passing IVF.

Al
Steven_L_Intel1
Employee
You may be seeing some other effect. Would you be willing to provide us with your tests? We'd like to investigate this.
Greynolds__Alan
Beginner
The code is very sensitive/proprietary, so the best I can do is the following:

Attached is a published paper I presented at a technical conference in August 2011. Also attached are the results on the same machine for the latest compilers, which are actually the Mac OS X versions so they can be directly compared with Figure 6 in the paper (the Windows 7 results are negligibly different). Notice that the speed of the Intel versions has not changed much in the last year, but there is a significant jump in gfortran performance. One change in gfortran 4.7 was to put, by default, local arrays on the stack instead of the heap (I think this has always been the default with Intel Fortran).

Al
Steven_L_Intel1
Employee
Thanks - we'll see what we can do with this. Would you please tell me which options you used for each compiler?
TimP
Honored Contributor III
The ifort option /Qauto (implied by several others, such as /Qopenmp) puts local arrays on the stack (unless /heap-arrays is set).
In my comparisons between ifort and gfortran I always had OpenMP enabled. Compared with current gfortran, ifort frequently depends on the vectorization or prefetch directives to give better performance.
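As a rough illustration of the stack-versus-heap point, the sketch below only prints compile lines and does not run any compiler. It assumes the Linux/Mac spellings -auto and -heap-arrays for the ifort options named above, and it assumes gfortran's -fstack-arrays switch, which is not mentioned in this thread. The file app.f90 and the output names are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print compile lines that pin where local arrays are placed.
# Nothing is compiled; the flag spellings below are assumptions (see lead-in).
SRC="app.f90"   # hypothetical source file

array_flags() {
    # $1 = compiler (gfortran or ifort), $2 = placement (stack or heap)
    case "$1:$2" in
        gfortran:stack) echo "-fstack-arrays" ;;
        gfortran:heap)  echo "-fno-stack-arrays" ;;
        ifort:stack)    echo "-auto" ;;
        ifort:heap)     echo "-heap-arrays" ;;
        *)              echo "unsupported combination: $1 $2" >&2; return 1 ;;
    esac
}

# One line per compiler/placement pair, each with its own output name.
for c in gfortran ifort; do
    for p in stack heap; do
        echo "$c $(array_flags "$c" "$p") $SRC -o app_${c}_${p}"
    done
done
```

Large stack-allocated arrays may require raising the stack limit (for example with ulimit -s), so the stack variants are something to test rather than a guaranteed win.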
Greynolds__Alan
Beginner
Here are the options I used (besides the particular OpenMP option):

gfortran: -O3 -funroll-loops -march=native -ffast-math -mfpmath=sse -msse4.2
ifort: -fast

I experimented with additional ifort options (e.g. -unroll-aggressive) but no combination produced faster code. Any suggestions?

Al
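For anyone wanting to reproduce a comparison along these lines, the option sets above could be wired into a small dry-run script such as the sketch below. The flag lists are copied from the post. The OpenMP switches (-fopenmp for gfortran, -openmp for ifort of this era) are assumptions, since the post does not spell them out, and app.f90 and the output names are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the build command for each compiler instead of running it.
SRC="app.f90"   # hypothetical source file

# Option sets quoted from the post.
GFORTRAN_FLAGS="-O3 -funroll-loops -march=native -ffast-math -mfpmath=sse -msse4.2"
IFORT_FLAGS="-fast"

# OpenMP switches: assumed, not stated in the post.
GFORTRAN_OMP="-fopenmp"
IFORT_OMP="-openmp"

build_cmd() {
    # $1 = compiler name (gfortran or ifort)
    case "$1" in
        gfortran) echo "gfortran $GFORTRAN_FLAGS $GFORTRAN_OMP $SRC -o app_gfortran" ;;
        ifort)    echo "ifort $IFORT_FLAGS $IFORT_OMP $SRC -o app_ifort" ;;
        *)        echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

build_cmd gfortran
build_cmd ifort
```

Keeping each option set in one variable makes it easy to swap in a different combination and rebuild under a new output name, so timings from different flag sets are not mixed up.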
TimP
Honored Contributor III
You might try something closer to your gfortran options: ifort -O2 -assume protect_parens -xSSE4.1 -complex-limited-range. Evidently, if you have complex arithmetic, it's not a fair comparison when you set limited-range in gfortran (it's included in -ffast-math) but not in ifort. If you don't have complex arithmetic, it's not so obvious why gfortran would be faster.
I saw that you set -march=native and made doubly sure with -msse4.2 (I suppose you don't have AVX), but I don't think gfortran loses any Westmere optimizations by enabling sse4.2 (just in case you're testing a Westmere-like CPU).
Greynolds__Alan
Beginner

For these particular results, the algorithm does not use complex arithmetic. Also, the processor is a Westmere so it doesn't have the new AVX.

Al
