Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Programs compiled with ifx are slower than compiled with ifort.

eliopoulos
Novice
2,745 Views

Programs compiled with ifx are slower than compiled with ifort. Is this going to change with the future updates? Speed is very important to me and now ifort has been discontinued.

27 Replies
eliopoulos
Novice
918 Views

Here they are.

0 Kudos
Ron_Green
Moderator
917 Views

Great.  Thanks, I will keep working on it today.

0 Kudos
Ron_Green
Moderator
851 Views

Try this: add the option /Qipo.

In Visual Studio it is under Optimization, Interprocedural Optimization.

 

I had to edit nlay.txt. The nl set in it was 56, but laminate.txt only has 55 lines, so I changed nl to 55. This got the code running.

 

I am seeing strange results that I have yet to understand. 

I modified the code to stop after ( passes > 21.0 ), so it runs in about 36 to 29 seconds.

On linux I see this

ifort -i8 -O2 -xhost -qmkl FADAS_C1_01.for  ; time ./a.out 

  time:  26.85 seconds

 

ifx -i8 -O2 -xhost -qmkl FADAS_C1_01.for  ; time ./a.out 

   time:  29.53 seconds

 

This is about what you are seeing: ifx is slower.

Now I add in -ipo to ifx

ifx -i8 -O2 -xhost -qmkl -ipo FADAS_C1_01.for  ; time ./a.out 

   time: 21.86 seconds

 

Now, I try adding ipo to ifort, but it HURTS ifort.

  time: 28.68 seconds

 

IPO helps with inlining, which can also affect vectorization. The Linux server I have is an i5-7600 Kaby Lake, a 7th Gen Core processor.

ifx NEEDS -ipo, since interprocedural optimization is not enabled by default. ifort DOES interprocedural optimization BY DEFAULT at O2 within a source file. Hence, it was getting good inlining and optimal code with just O2. ifx needs it enabled explicitly with the -ipo option or the -flto option (the two are equivalent).
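For anyone who wants to repeat the Linux comparison above, here is a minimal sketch, assuming a POSIX shell. The source file and flags are the ones quoted in this thread; the helper name build_cmd and the output file names are made up for illustration. It always prints the four command lines, and only builds and runs when the compiler and the source file are actually present.

```shell
#!/bin/sh
# Sketch only: compare ifort and ifx with and without -ipo on Linux.
# The source file and flags are the ones quoted in this thread;
# build_cmd and the a_<compiler>[_ipo].out names are hypothetical.
SRC=${SRC:-FADAS_C1_01.for}
FLAGS="-i8 -O2 -xhost -qmkl"

# Print the build command for compiler $1 with optional extra flag $2.
build_cmd() {
    if [ -n "$2" ]; then
        echo "${1} $FLAGS ${2} $SRC -o a_${1}_ipo.out"
    else
        echo "${1} $FLAGS $SRC -o a_${1}.out"
    fi
}

for fc in ifort ifx; do
    for extra in "" "-ipo"; do
        cmd=$(build_cmd "$fc" "$extra")
        echo "$cmd"
        # Build and run only when the compiler and the source are present.
        if command -v "$fc" >/dev/null 2>&1 && [ -f "$SRC" ]; then
            exe=${cmd##* -o }      # executable name is whatever follows -o
            $cmd || continue       # skip the timing if the build fails
            start=$(date +%s)
            "./$exe"
            end=$(date +%s)
            echo "$fc $extra run time: $((end - start)) s"
        fi
    done
done
```

The timing uses whole seconds from date, which is coarse but enough for runs in the 20 to 30 second range; swap in time if your shell provides it.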

 

So much for Linux. I got onto a Windows 11 server with VS 2022. It's older, a 6th Gen Xeon Skylake.

I go to the command line first and build with similar Windows options:

 

ifx  /integer-size:64 /Qxhost /O2 /Qmkl  FADAS_C1_01.for

   As expected, it's slower 

    36.2 seconds

 

ifort with the same options, no IPO: 33 seconds. Faster than ifx.

 

Add /Qipo and ifx runs in 29.8 seconds.  Again fastest.

 

Now the ODD part. I go back to VS and build your project with ifort and ifx. Oddly, the default ifort and ifx builds both run in 36 seconds.

I add /Qipo to ifx and ifort BUT I am still getting 36 seconds! I can't get the code to budge! Now, the Windows build uses /threads and the multithreaded debug libs. I tried threaded static and got the same result; I can't get any improvement.

 

So why don't you give it a try in VS? Go to Properties -> Fortran -> Optimization -> Interprocedural Optimization to set /Qipo.

What do you see?

Also, what is your host CPU?  Hopefully Genuine Intel.  

0 Kudos
eliopoulos
Novice
817 Views

I don't understand why your laminate.txt file has 55 lines. The file I uploaded has 56. I ran the program on a laptop with an Intel Core i7-1260P CPU, Windows 11 Pro and VS 2022. My respective times are:

ifx built with VS and /Qipo

time: 24.922 s

ifx built with command line and /Qipo

time: 18.719 s

ifort built with VS and without /Qipo

time: 14.016 s

My laptop appears to be faster than your servers. ifort remains faster.

0 Kudos
Umar__Sait
Novice
661 Views

Similar slowdown seen on our nuclear TDDFT code, most likely due to double precision complex arithmetic: Stats are:

 

IFX version 2025.0 with -O3 -Xhost -ipo: 145m4.824s

Ifort version 2024.2 with -O3 -Xhost:        125m17.952s

 

The code uses OpenMP.

0 Kudos
Andy59
Beginner
588 Views

In my case, it is the compilation itself that is much slower with `ifx` than with the legacy `ifort`. I am using VS 2022 with Intel oneAPI. Does anyone happen to know the reason for that?

0 Kudos
Ron_Green
Moderator
171 Views

@Andy59 I have a suspicion about the slow compilation time. First, is the ifort version 2021.13, from the oneAPI 2024.2 package? And what version is ifx?

Also, could you share the compilation times for both compilers for the source file where you see the difference?

 

My suspicion is initializers.  Does your code have:
1) DATA statements? or

2) Initialization in type declarations?   Like

    real, dimension(3)  ::  point_in_space = [ 0.0, 0.0, 0.0 ]

with a lot of array constructors or type constructors?

or 

3) just a LOT of initializations:

     real  :: var1 = 42.0

     real :: var2 = 42.0

...

      real :: var1000000 = 42.0


We have been working on initializers lately.  
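If it helps to check whether case 3 is what you are hitting, here is a rough sketch of a stress test, assuming a POSIX shell. It writes a module with N initialized declarations and times a compile-only build with whichever of ifort and ifx is on PATH. The names gen_init_source, many_inits.f90 and the default N are made up for illustration; this is not a reproduction of any particular user's code.

```shell
#!/bin/sh
# Sketch only: generate a Fortran module with N initialized declarations
# and compare compile times. gen_init_source and many_inits.f90 are
# hypothetical names chosen for this example.
gen_init_source() {
    n=$1
    out=$2
    {
        echo "module many_inits"
        echo "  implicit none"
        i=1
        while [ "$i" -le "$n" ]; do
            echo "  real :: var$i = 42.0"
            i=$((i + 1))
        done
        echo "end module many_inits"
    } > "$out"
}

N=${N:-10000}
gen_init_source "$N" many_inits.f90

for fc in ifort ifx; do
    # Only time the compilers that are actually installed.
    if command -v "$fc" >/dev/null 2>&1; then
        start=$(date +%s)
        # -c compiles without linking, so only compile time is measured.
        "$fc" -c many_inits.f90
        end=$(date +%s)
        echo "$fc compile time for N=$N: $((end - start)) s"
    fi
done
```

The timer has one-second resolution, so raise N (for example N=200000 sh script.sh) until the difference between the two compilers is visible.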

 

Another possible place, though less likely, is in deeply nested USE trees. 

 

0 Kudos