Optimization of Fortran code using Parallel Studio XE



I've been trying for a couple of weeks to find the best compilation flags for my CFD code, but I think I may be doing something wrong because the behaviour of my code is weird.

My code is written in Fortran (77/95) and I use OpenMPI for parallelisation. According to Intel's page for the CPU I use, SSE4.2 can be used for optimization, so I compiled OpenMPI with the following command:

./configure --prefix=/opt/OpenMpi_intel_Opt_static/ CC=icc CXX=icpc FC=ifort CFLAGS="-msse4.2 -axsse4.2" CXXFLAGS="-msse4.2 -axsse4.2" FFLAGS="-msse4.2 -axsse4.2" FCFLAGS="-msse4.2 -axsse4.2" LDFLAGS="-msse4.2 -axsse4.2" --with-platform=optimized --disable-shared --enable-static

based on (

My code is then compiled with the following:

mpif90 -c -axsse4.3 -O3    files.f

mpif90 -o prog all.o -axsse4.2 -O3 

Depending on the size of my arrays (in my case the number of array entries equals my domain size), I get either a good result or just "NaN". When I get "NaN", the following remark is shown during compilation:

MAIN__ has been targeted for automatic cpu dispatch

If I remove the "-axsse4.2" flag, it works fine but takes longer!

Is there another way to optimize my code or change the compilation flags in order to decrease the runtime?




EDIT: the -axsse4.3 above was a typo; I meant -axsse4.2.

The question looks more suitable for the Intel Linux Fortran forum, and it's difficult to see what you're trying to accomplish with all this confusion. Why not use -msse4.2 throughout? There's no sse4.3 option for Intel compilers. If it isn't rejected, we can't guess what will happen.
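
As a rough sketch of what "-msse4.2 throughout" could look like for the two build commands in the question: the file names are the ones from the question, the FFLAGS variable is only a hypothetical convenience to keep the compile and link steps identical, and this assumes every machine that runs the program supports SSE4.2, since -msse4.2 produces a single code path without the automatic CPU dispatch that -ax adds.

```shell
# Assumption: all nodes that will run the program support SSE4.2.
# FFLAGS is a hypothetical helper variable, not something from the original post.
FFLAGS="-msse4.2 -O3"

# Compile step: same flags for every source file
mpif90 -c $FFLAGS files.f

# Link step: reuse exactly the same flags
mpif90 -o prog all.o $FFLAGS
```

With a single flag set, the "targeted for automatic cpu dispatch" remark should no longer appear, which makes it easier to check whether the NaN results are tied to the -ax dispatch paths or to something else in the code.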

