Optimization of Fortran code using Parallel Studio XE

Christoph_I_ · 06-21-2016

Hello,

I have been trying for a couple of weeks to find the best compilation flags for my CFD code, but I think I am doing something wrong, because the behaviour of my code is weird.

My code is written in Fortran (77/95) and I use OpenMPI for parallelisation. According to Intel's page for the CPU I use, SSE4.2 is available for optimization (http://ark.intel.com/products/52576/Intel-Xeon-Processor-X5690-12M-Cache-3_46-GHz-6_40-GTs-Intel-QPI), so I compiled OpenMPI with the following command:

./configure --prefix=/opt/OpenMpi_intel_Opt_static/ CC=icc CXX=icpc FC=ifort CFLAGS="-msse4.2 -axsse4.2" CXXFLAGS="-msse4.2 -axsse4.2" FFLAGS="-msse4.2 -axsse4.2" FCFLAGS="-msse4.2 -axsse4.2" LDFLAGS="-msse4.2 -axsse4.2" --with-platform=optimized --disable-shared --enable-static

based on (https://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers).

My code is then compiled with the following:

mpif90 -c -axsse4.3 -O3    files.f

mpif90 -o prog all.o -axsse4.2 -O3

It seems that, depending on the size of my arrays (in my case the number of array entries equals my domain size), I get either a good result or just "NaN". When I get "NaN", the following remark is shown during compilation:

MAIN__ has been targeted for automatic cpu dispatch
....

If I delete the "-axsse4.2" flag it works fine, but it takes longer!

Is there another way to optimize my code or change the compilation flags in order to decrease the runtime?

THX

EDIT: the -axsse4.3 in the compile line above is a typo; I meant -axsse4.2.

TimP · 06-21-2016

The question looks more suitable for the Intel Linux Fortran forum. But it's difficult to see what you're trying to accomplish with all this confusion. Why not use -msse4.2 throughout? There's no sse4.3 option for Intel compilers. If it's not rejected, we can't guess what will happen.
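
As a rough sketch of the "same flag throughout" suggestion: the compile and link lines could take their options from a single variable, so the two steps cannot drift apart or pick up a mistyped flag in only one place. The file names (files.f, all.o, prog) and -O3 are taken from the original post; the variable names are only illustrative. The script just prints the commands and does not invoke the compiler.

```shell
#!/bin/sh
# Sketch only: one flag set shared by the compile and link steps.
# files.f, all.o and prog are the names used in the original post;
# FLAGS, COMPILE_CMD and LINK_CMD are hypothetical helper variables.
FLAGS="-O3 -msse4.2"

COMPILE_CMD="mpif90 -c $FLAGS files.f"
LINK_CMD="mpif90 -o prog all.o $FLAGS"

# Print the commands for inspection; run them by hand once they look right.
echo "$COMPILE_CMD"
echo "$LINK_CMD"
```

Whether this recovers the runtime seen with -axsse4.2 is not something the sketch can tell; it only removes the flag mismatch as a possible cause.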