Intel® Fortran Compiler

Performance of Ifort and Ifx

David_DiLaura1
New Contributor I
5,538 Views

Having (finally) correctly installed oneAPI 2024, we compared the performance of code compiled with Ifort and Ifx. For our commercial ray-based radiative analysis code, whose runs last from several minutes to half an hour depending on the machine, the ratio of execution times is Ifx/Ifort = 1.38. That is quite a difference: Ifx produces code that is significantly slower.

 

Are other users seeing the same thing?

 

This is across a range of machines, from a Surface Book that provides 8 threads to Dell workstations that provide 20 and 40 threads.

Context: heavy threading usage, large arrays, hundreds of millions of dot products and cross products, extensive array accesses, little use of intrinsic functions aside from sqrt(), almost no disk I/O, and machines equipped with at least 32 GB of RAM. The compiler settings are the same for Ifort and Ifx:

[Four screenshots of the compiler settings were attached here.]

12 Replies
Umar__Sait
Novice
5,521 Views

Just did a similar study using our scientific code, with a similar conclusion.

Machine: dual-processor Intel(R) Xeon(R) Gold 6256 CPU @ 3.60GHz with hyperthreading turned off and 96 GB of memory, with the program using 78% of the memory.

Machine was idle during both runs.

Flags: -no-prec-div -O3 -fp-model=fast=2 -xHost -qopenmp

Times are (from the Linux time command):

ifx:

real    198m59.309s
user    3551m16.702s
sys     15m45.309s

ifort:

real    153m43.612s
user    2797m44.702s
sys     10m37.756s

This has been the experience with older versions of ifx as well. Compilation, on the other hand, is much faster with ifx.
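For comparison with the Ifx/Ifort ratio quoted in the original post, the timings above can be converted to seconds and divided. This is only a minimal sketch: the helper name `to_seconds` is made up for illustration, and the only inputs are the figures copied from the `time` output above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '198m59.309s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Timings copied verbatim from the post above.
ifx = {"real": "198m59.309s", "user": "3551m16.702s", "sys": "15m45.309s"}
ifort = {"real": "153m43.612s", "user": "2797m44.702s", "sys": "10m37.756s"}

# Execution-time ratio ifx/ifort for each line of the `time` output.
ratios = {key: to_seconds(ifx[key]) / to_seconds(ifort[key]) for key in ifx}

for key, value in ratios.items():
    print(f"{key}: ifx/ifort = {value:.2f}")
```

On these numbers the wall-clock ("real") ratio comes out at roughly 1.29 and the CPU-time ("user") ratio at roughly 1.27, somewhat lower than the 1.38 reported in the original post.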

 

Ron_Green
Moderator
5,510 Views

I am curious - is your data REAL or COMPLEX?  Or perhaps INTEGER? 

 

For the OpenMP evaluation, did you set number of threads=1 to determine if it's the single-core performance drop, or due to thread scaling? 

 

I would also advise adding to ifx options

-align array64byte  -flto 

 

 

Umar__Sait
Novice
5,507 Views

My data is mostly complex and real, all double precision. I will try these options tomorrow, as well as the thread dependence.

David_DiLaura1
New Contributor I
5,484 Views

Our data is virtually all real(4). There are also several one-dimensional integer(4) vectors that point to surface vertex count, surface order, etc.

Setting the thread count to 1 made the ratio higher: Ifx/Ifort = 1.45. Threading seems to actually hide a bit of the performance drop.

The options -align array64byte -flto had no reliably timeable effect on the result.

 

All of this is worrying, given that Intel plans to move away from Ifort within a year.

martinmath
New Contributor I
5,462 Views

I can report a pretty consistent performance decrease of 5% to 20% (from ifort to ifx) across a number of quite different kinds of workloads (fluid dynamics solver), but only without OpenMP and with only one thread, as otherwise ifx still crashes. The workloads are typical sparse-matrix linear solver operations, combinatorial and geometrical algorithms like (re)meshing with mixed integer/real data, high-level object organisation, etc. The only exception is I/O, which is a lot faster in ifx than in ifort. As I can currently check only with one thread, I have not dug any deeper. It looks on par with gfortran, which was always a bit slower than ifort. However, the 20% was in the linear solver code (mainly data fetch, dot_product and axpy).

Umar__Sait
Novice
5,440 Views

I am not sure about these LLVM-based compilers, because I am seeing the same thing with AMD compilers on AMD CPUs. Intel ifort runs 30% faster on an AMD Threadripper than AMD's latest compilers!

Ron_Green
Moderator
5,423 Views

Complex data is a problem area for all LLVM-based compilers. The framework was, and is, mostly written with C++ and similar languages in mind. They do not have an intrinsic complex type like we do in Fortran. For ifort, we spent years making our vectorizer/optimization phases and code gen work well for Fortran complex data types. We are working on our LLVM opts/vect phases to see how we can improve complex data, and we are tracking bugs on this known problem area. So work is underway on this shortcoming.

 

Similarly, we do have bugs open on a number of performance 'gaps' between ifx and ifort. Obviously we have had to give priority to getting ifx to actually compile codes correctly, and this work is still ongoing. But on this front there is light at the end of the tunnel. Our incoming issues on ifx for ICEs and wrong code gen are decreasing over time, and this is while we have an uptick in ifx adoption. So I am optimistic that we are stabilizing ifx very quickly.

 

Performance will improve over time. Ifort wasn't birthed with amazing performance. It came over time. Ifx will be no different. The same goes for any of the LLVM-based compilers. Improvements will float many boats.

 

 

Umar__Sait
Novice
5,413 Views

Thanks for the explanation. I hope these get resolved in time. In the meantime we will have to continue using ifort, because our program runs a very long time (3 days to a week on the above processors, with efficient OpenMP, for some runs), so 30% could mean a day or two of difference. I hope ifort, even if it is deprecated, continues to be distributed with new versions of oneAPI. Otherwise a separate ifort package would be nice.

bwe
Novice
5,327 Views

Same here, only worse: our program is over twice as slow when compiled with ifx. The execution time ratio is 2.2 (!!).

 

The hot loop of our program involves computations with complex numbers, so, based on Ron Green's comment, I assume that is the reason.

 

Very disappointing that Intel would sunset its old compiler before the new one can keep up.

 

Edit: This seems like it will be devastating for us, as our code needs to run faster than real time, and one set of our inputs was already only running at around 2x real time. Unfortunately the ifx version of the executable currently crashes on those inputs, so I can't tell if we're slightly below (probably) or slightly above real time.

jimdempseyatthecove
Honored Contributor III
3,590 Views

FWIW

I suspect that your highest-compute procedure(s) are fully worked out and will not require revisions. So, you can build a performance library using ifort containing those procedures and then link that into the ifx code that is evolving. Then, whenever ifx gets up to speed, switch back to including those procedures in your compile lists.

 

Jim Dempsey

NichoalsKouwen
Beginner
3,631 Views

I run a hydrological model with ~20% of the time spent on I/O, simple math, and lots of logic (wet vs. dry, etc.). My ratio is ifx/ifort = 1.186.

Many optimization runs (parameter fitting) with mpiexec (not timed) take 3-4 weeks, so 20 days becomes 24.
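The "20 days becomes 24" estimate is a straight scaling of the run length by the measured ratio. A minimal sketch of that arithmetic follows; the function name `projected_days` is hypothetical, and it assumes the 1.186 ratio measured on the timed runs carries over unchanged to the untimed mpiexec runs.

```python
def projected_days(baseline_days: float, slowdown_ratio: float) -> float:
    """Scale a baseline duration by an execution-time ratio (ifx/ifort)."""
    return baseline_days * slowdown_ratio


RATIO = 1.186  # ifx/ifort ratio reported in the post above

# 20 days is the figure from the post; 21 and 28 days span "3-4 weeks".
for days in (20, 21, 28):
    print(f"{days} days with ifort -> {projected_days(days, RATIO):.1f} days with ifx")
```

For a 20-day campaign this gives about 23.7 days, which matches the rounded figure of 24 in the post.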

 

prop_design
New Contributor II
3,506 Views

I recently had a positive experience with ifx. I compare gfortran, flang, ifort, and ifx regularly. ifx was the only compiler that had a constant precision. All of the others would have different precision, depending on optimization level. You can observe this behavior if you have a code that iterates to a certain level of precision. Then look at how many tries it takes to get there. ifx was the only compiler to take the same number of tries, regardless of optimization level.
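To make the "count the tries" test concrete, here is a rough sketch of the kind of measurement described above. It is not the benchmark code from this post: the Newton iteration for a square root, the tolerance, and the function name `iterations_to_converge` are all stand-ins chosen for illustration. In a compiled Fortran code, the same count would be compared across builds made at different optimization levels.

```python
def iterations_to_converge(target: float, tol: float = 1e-10, max_iter: int = 100) -> int:
    """Count Newton steps until successive sqrt(target) estimates differ by less than tol."""
    x = target  # starting guess
    for count in range(1, max_iter + 1):
        x_next = 0.5 * (x + target / x)  # Newton update for x*x = target
        if abs(x_next - x) < tol:
            return count
        x = x_next
    raise RuntimeError("did not converge within max_iter steps")


# The number of tries is the quantity one would compare between builds.
tries = iterations_to_converge(2.0)
print(f"converged after {tries} tries")
```

If floating-point results shift slightly between optimization levels, an iteration like this can cross the tolerance one step earlier or later, which is how a change in the number of tries shows up.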

 

In my results, ifx was always faster than ifort:

 

max: ifx is 21.8% faster than ifort

min: ifx is 2.91% faster than ifort

average: ifx is 9.59% faster than ifort

 

for the way i distribute my codes ifx is 6.16% faster than ifort

 

the range above is due to 15 different compiler option tests that i do. if you always do one set of compiler options, then you wouldn't have the range that I'm showing.

 

one of the intel employees, on this forum, previously told me to add /flto /fuse-ld=lld /align:array64byte and it helped ifx. it had no effect on ifort. with an earlier version of ifx and none of the mentioned options, ifx was slightly slower than ifort.

 

the complete benchmark is here: https://propdesign.jimdofree.com/fortran-benchmarks/

 

i think for such a new compiler ifx is very good. it's way better than flang, which it is based on. so that's rather odd. in fact, both ifort and ifx are faster than gfortran and flang, with no optimizations whatsoever and gfortran and flang fully optimized. so Intel is doing something special.

 

i don't have any complex number calculations though. so as others have said, perhaps that has something to do with it.

 

one thing that does suck is installation. i have had to completely redo windows at least twice. because the compiler installation was so bad. i always completely uninstall visual studio and ifort, before installing a new version. using the update functionality has not worked out well for me. typically, that is one of the reasons it nukes windows. not sure what the other reasons might be. on the other hand, updating gfortran and flang is very easy, using MSYS2. so it would be nice if Intel could somehow have an easier install and update process. one that didn't require uninstalling visual studio. i don't use visual studio. i only install it for intel fortran. so that makes things take forever. once i do get everything installed, i don't update any of it. visual studio or intel fortran. ideally, if intel fortran didn't require visual studio, that would be even better in my case. i know that wouldn't be good for others though. so perhaps an option that didn't require it. i believe they had something like that a long time ago. where the intel package would install what was required from visual studio (for those that only used the command prompt). visual studio is a huge download and takes the most time to install. when you don't even use it, it sucks to have to do all the time.
