Re: -init compilation flag canceled vectorization

nadavhalahmi · 05-24-2021

I wanted to compare array initialization run time for ifort vs. gfortran, using the following compilation lines with gfortran 10.1.0 and ifort 19.1.3.304 on CentOS Linux 7:

ifort array-initialize.f90 -O3 -init=arrays,zero,minus_huge,snan -g -o intel-array.out

gfortran array-initialize.f90 -O3 -finit-local-zero -finit-integer=-2147483647 -finit-real=snan -finit-logical=True -finit-derived -g -o gnu-array.out

array-initialize.f90:

program array_initialize
  implicit none

  integer :: i, j, limit
  real :: my_max
  real :: start, finish

  my_max = -1.0
  limit = 10000

  call cpu_time(start)
  do j=1, limit
    do i=1, limit
      my_max = max(my_max, initializer(i, j))
    end do
  end do
  call cpu_time(finish)

  print *, my_max
  print '("Time = ", f6.3," seconds.")', finish-start

contains
  function initializer(i, j)
    implicit none
    real :: initializer
    real :: arr(2)
    integer :: i, j

    arr(1) = -1.0/(2*i+j+1)
    arr(2) = -1.0/(2*j+i+1)
    initializer = max(arr(1), arr(2))
  end function
end program array_initialize

Run times for this code:

gnu - 0.096 sec

intel - 0.392 sec

When I remove the init flags:

gnu - 0.098 sec

intel - 0.057 sec

When I replace the array with two variables:

gnu - 0.099 sec

intel - 0.065 sec
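The two-variable variant is not shown in the post. A minimal sketch of what it presumably looks like follows, with the hypothetical scalars `a` and `b` standing in for `arr(1)` and `arr(2)`; the timing calls are left out for brevity.

```fortran
program scalar_variant
  implicit none

  integer :: i, j, limit
  real :: my_max

  my_max = -1.0
  limit = 10000

  do j = 1, limit
    do i = 1, limit
      my_max = max(my_max, initializer(i, j))
    end do
  end do

  print *, my_max

contains

  function initializer(i, j)
    real :: initializer
    integer, intent(in) :: i, j
    ! Assumption: two local scalars replace the two-element array arr(2),
    ! so there is no local array left to initialize.
    real :: a, b

    a = -1.0/(2*i+j+1)
    b = -1.0/(2*j+i+1)
    initializer = max(a, b)
  end function initializer

end program scalar_variant
```

The arithmetic is identical to the array version; only the storage of the two intermediate values differs.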

-----------------------------

Then I added the `-no-vec` compilation flag for intel and `-fno-tree-vectorize` for gnu, and got these run times:

Run times for the code above:

gnu - 0.393 sec

intel - 0.395 sec

When I remove the init flags:

gnu - 0.391 sec

intel - 0.393 sec

When I replace the array with two variables:

gnu - 0.391 sec

intel - 0.395 sec

------------------

This looks like an ifort bug. I decided to compare array initialization run times after compiling my application with ifort's `-auto` flag (in addition to `-init`): Intel VTune showed that a major part of the application's run time is spent in the `for_array_initialize` function, which was not the case with gfortran.

jimdempseyatthecove · 05-24-2021

A couple of points about making timing runs:

1) Structure your test so that the timed section sits inside a loop, which lets you disregard the first iteration (or report it separately from the remaining iterations).

2) When the timed section takes less than a few seconds, nest it in a loop so that it runs for a few seconds, then divide the resulting time by the iteration count to get the run time.

Note that a single pass of a very short timed section may include O/S and runtime initialization overhead.
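The two points above could be sketched as follows. This is only an illustration under stated assumptions: the repeat count `nrep`, the wrapper function `kernel`, and the use of `system_clock` instead of `cpu_time` are choices made here, not part of the original test.

```fortran
program timing_harness
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none

  integer, parameter :: limit = 10000  ! same problem size as the original test
  integer, parameter :: nrep  = 20     ! hypothetical; raise until the run lasts a few seconds
  integer(int64) :: t0, t1, rate
  real(real64)   :: elapsed, first_pass, later_total
  real           :: my_max
  integer        :: rep

  call system_clock(count_rate=rate)
  first_pass  = 0.0_real64
  later_total = 0.0_real64

  do rep = 1, nrep
    call system_clock(t0)
    my_max = kernel(limit)
    call system_clock(t1)
    elapsed = real(t1 - t0, real64) / real(rate, real64)
    if (rep == 1) then
      first_pass = elapsed                 ! kept separate: may include startup overhead
    else
      later_total = later_total + elapsed  ! averaged over the remaining passes
    end if
  end do

  ! Printing the result keeps the computation live.
  print *, my_max
  print '("First pass   = ", f8.3, " seconds")', first_pass
  print '("Later passes = ", f8.3, " seconds (mean)")', later_total / real(nrep - 1, real64)

contains

  ! The timed section from the original program, wrapped in a function.
  function kernel(n) result(best)
    integer, intent(in) :: n
    real :: best
    integer :: i, j

    best = -1.0
    do j = 1, n
      do i = 1, n
        best = max(best, initializer(i, j))
      end do
    end do
  end function kernel

  function initializer(i, j)
    integer, intent(in) :: i, j
    real :: initializer
    real :: arr(2)

    arr(1) = -1.0/(2*i+j+1)
    arr(2) = -1.0/(2*j+i+1)
    initializer = max(arr(1), arr(2))
  end function initializer

end program timing_harness
```

One caveat: every pass computes the same value, so an aggressive optimizer could in principle hoist the work out of the repeat loop. If the later passes look implausibly fast, make each pass depend on `rep` in some way.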

The '-no-vec' runtimes of 4x/7x seem peculiar in that your array has only 2 elements.

Note that the optimizer should have seen that the entire array arr is explicitly written and that its lifetime ends when the function returns, so the initialization should have been elided as well.

Jim Dempsey

Barbara_P_Intel · 05-24-2021

As confirmed by the optimization report, there is no vectorization when you use -init.

According to the Fortran Developer Guide, "To avoid possible performance issues, you should only use [Q]init for debugging (for example, [Q]init zero ) and checking for uninitialized variables (for example, [Q]init snan)."

There are other cautions with using -init, too. See the Fortran Developer Guide.

And everything @jimdempseyatthecove said about timing when the app runs so fast is SO TRUE! The OS gets in the way at that point.

nadavhalahmi · 05-25-2021

I see what you mean, @Barbara_P_Intel, but I have a huge legacy codebase that is simply too risky to run without `-init` set to `snan`. When I remove the `-init=arrays` flag, I get about a 25% improvement in run time, which is very significant in my case. What is my alternative?

Even with the performance issue, I still expected some optimizations, such as vectorization, using two variables instead of the array, or even disabling the initialization, as @jimdempseyatthecove said, when it is clear that the array is not accessed before it is initialized.

I will surely fix the timings, thanks to @jimdempseyatthecove, but I still see this as a compiler bug, since gnu handles it very well.

andrew_4619 · 05-25-2021

I do not think I would call that a bug. The init is a debug tool and can be used as a kludgy fix for buggy code. Some disabling of optimisation is a result. The real answer would be to fix the code but if that is too big a job I guess you are stuck with the consequences. Sorry I can't give helpful suggestions.

Barbara_P_Intel · 05-25-2021

There's a lot of legacy code out there where programmers got dependent on compiler options like -init. Unfortunately, with newer technology and methods to improve performance there are drawbacks.

You could run the VTune profiler on the application with various datasets, determine the hot spots, remove -init from the "hot" routines where the arrays are already initialized, and see what happens when you run the Q/A tests.
