I have a code that iterates until it converges to a solution; if the maximum number of iterations is reached, it prints "MAX IT ERROR".
I am using BLAS routines and have linked them using the Intel Link Line Advisor.
Now when I use gfortran I get
gfortran FILENAME.f90 -L$MKLROOT/lib/ia32 -lmkl_blas95 -Wl,--start-group -lmkl_gf -lmkl_gnu_thread -lmkl_core -Wl,--end-group -liomp5 -lpthread && time ./a.out
But when I use
ifort FILENAME.f90 -L$MKLROOT/lib/ia32 -lmkl_blas95 -Wl,--start-group -lmkl_gf -lmkl_gnu_thread -lmkl_core -Wl,--end-group -liomp5 -lpthread && time ./a.out
It prints MAX IT ERROR.
The 55767 in the above output is the number of iterations needed. The max is set at 160000.
Not only this, but in other simulations the number of iterations sometimes differs even though there is absolutely NO change in the code or the linked libraries.
Am I missing some flag?
Firstly, it was unable to find those files even though they were present in that folder, so I simply copied them to my working directory. I tried:
ifort FILENAME.f90 -L$MKLROOT/lib/ia32 -lmkl_blas95 -Wl,--start-group libmkl_intel.a libmkl_intel_thread.a libmkl_core.a -Wl,--end-group -liomp5 -lpthread && time ./a.out
ifort FILENAME.f90 -L$MKLROOT/lib/ia32 -lmkl_blas95 -Wl,--start-group libmkl_intel.a libmkl_intel_thread.a -lmkl_core -Wl,--end-group -liomp5 -lpthread && time ./a.out
(All .a files were pasted into the working directory)
That is an indication that an array is being overrun, or uninitialized variables are being used in the calculations, or variables that are expected to retain values do not have the SAVE attribute specified. There are other possible causes, but first suspicion falls on these.
Please post the code if it is reasonably sized. If not, consider preparing a pared-down version of the code.
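As a rough sketch of how to check for the first two suspects, both compilers can add run-time checks. The flag sets below are a suggestion, not something taken from the build lines in this thread; FILENAME.f90 is the placeholder from the original post, and the commands are only printed so the snippet runs even where the compilers are not installed.

```shell
# Debug-oriented flag sets (a suggestion, not taken from the thread).
GFORTRAN_DEBUG="-g -O -Wall -fbounds-check -finit-real=nan"
IFORT_DEBUG="-g -traceback -check bounds -check uninit"

# Dry run: print the commands instead of executing them.
# Append the MKL link options from the original post after the file name.
echo "gfortran $GFORTRAN_DEBUG FILENAME.f90"
echo "ifort $IFORT_DEBUG FILENAME.f90"
```

For the third suspect, ifort's -save and gfortran's -fno-automatic make local variables static; if the behaviour changes with them, a missing SAVE becomes a plausible culprit.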
If I run
gfortran blah blah && ./a.out a million times, I get the same (and correct) answer every time.
However, if I run ifort blah blah && ./a.out, the answer is the same but the number of iterations differs.
Similarly, if I run ifort a million times, I get the same answer a million times, but the number of iterations is sometimes different from the one given by gfortran.
The code is humongous. I will try to prune it and paste it soon but I doubt it will help.
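Repeating the run by hand gets tedious, so here is a small sketch that automates the comparison described above. The helper name count_distinct is made up for this example; it runs a command several times and reports how many different outputs it saw. Run the program without time in front, since timings differ on every run.

```shell
# count_distinct N CMD [ARGS...]
# Runs CMD N times and prints how many different outputs were produced.
count_distinct() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # cksum reduces each run's full output to a single line
        "$@" 2>&1 | cksum
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}

# Stand-in command for demonstration; prints 1 because echo never varies.
count_distinct 5 echo "same output"
```

With the real program this would be something like count_distinct 20 ./a.out; any result above 1 means the printed iteration count (or the answer) changed between runs.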
However, there is a small chance that there may be a bug in your code, or in the Intel compiler. Given that the run time is quite modest, if there are no difficult third party library dependencies it would be nice to pin down the causes.
If you are in fact not specifying any options to either ifort or gfortran, the two compilers apply completely different default optimizations, and you would certainly expect that to have an effect.
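One reason such an effect is expected: floating-point addition is not associative, so a compiler or threaded library that reorders a sum can change the last bits of the result, and an iterative solver that tests against a tolerance may then stop after a different number of iterations. A minimal illustration, using awk only because it computes in double precision (it is not part of the build discussed here):

```shell
# Add the same three numbers in two different orders.
left=$(awk 'BEGIN { printf "%.17g", (0.1 + 0.2) + 0.3 }')
right=$(awk 'BEGIN { printf "%.17g", 0.1 + (0.2 + 0.3) }')

echo "(0.1 + 0.2) + 0.3 = $left"
echo "0.1 + (0.2 + 0.3) = $right"

# The two sums are mathematically equal but round differently.
if [ "$left" != "$right" ]; then
    echo "results differ in the last digits"
fi
```

A difference this small is harmless for the final answer, but it can be enough to move a convergence test by some iterations.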
You didn't say whether you are running on a 32-bit platform, where gfortran would default to x87 code. Depending on your platform, more consistent options between gfortran and ifort might be something like:
gfortran -march=pentium4 -mfpmath=sse -O2....
ifort -fp-model source .....
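To keep the two builds comparable, the suggested options can be kept in one place. This is only a sketch: fp_flags is a hypothetical helper name, the flags are the ones quoted in this post, and the commands are printed rather than executed.

```shell
# fp_flags COMPILER: print the floating-point flags suggested above.
fp_flags() {
    case "$1" in
        gfortran) echo "-march=pentium4 -mfpmath=sse -O2" ;;
        ifort)    echo "-fp-model source" ;;
        *)        echo "fp_flags: unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Dry run: show the compile command for each compiler.
# Append the MKL link options from the original post after the file name.
for fc in gfortran ifort; do
    echo "$fc $(fp_flags "$fc") FILENAME.f90"
done
```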
Core 2 QUAD 2.66 GHz (4 cores, 4 threads)
Ubuntu 32 Bit 10.10
3 GB RAM (actually 8, but because of 32-bit it's showing only 3 ... I'll probably let it be)
So what flags do I use?
I tried googling it, but all I could find were the usual -O3 -save -parallel etc. and stuff that I could not understand,
no mention of fp model source (sorry for being such a N00B).
$ ifort -help opt
will show you optimization-related options. The correct syntax for the particular option suggested is (note the hyphen binding "fp" to "model")
$ ifort -fp-model source file1.f file2.f ...
where you substitute for "file1",... the names of your source files, and ".f90" in place of ".f" if your sources are free-format Fortran.
The ifort docs for current versions are difficult to find on-line, but are installed in the /Documentation/ directory along with the /bin/ directory where ifort resides. The most important options haven't changed since the ifort 9.1 documentation was posted (as Google tells you) http://software.intel.com/en-us/articles/intel-fortran-compiler-for-linux-9x-manuals/
One exception is that the Intel 9.1 32-bit compilers defaulted to x87 code, as 32-bit gfortran still does.
No matter which compiler you choose, you do it a disservice if you aren't willing to study the docs.
Historical inertia affects gfortran and ifort in somewhat different ways, in both cases leaving you with defaults which you normally shouldn't accept. Beyond that, gcc/gfortran defaults are set according to the needs of the user community (those who speak up to the developers), while commercial compilers are influenced by marketing considerations.
As we've mentioned several times on the Intel Linux Fortran forum, the closest gfortran options to the ifort defaults are something like
gfortran -march=pentium4 -ffast-math -fno-cx-limited-range -O3 -funroll-loops --param max-unroll-times=4
Read about those options in the gcc manual. They are rather aggressive (besides being cumbersome) and would not often be chosen in their entirety. Yes, -O3 for gfortran is about the same as -O2 (default) for ifort, while gfortran defaults to -O0, as is widely known.
You can set the options you want for ifort in the ifort.cfg in the compiler installation, if you don't wish to include them each time you compile.
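A minimal sketch of the configuration-file approach, assuming the IFORTCFG environment variable is used to point ifort at a private file instead of editing the ifort.cfg inside the installation. The file location below is hypothetical and chosen only so the snippet can run anywhere; a permanent path makes more sense in practice.

```shell
# Hypothetical location for a private configuration file.
cfg_file="${TMPDIR:-/tmp}/ifort-example.cfg"

# Options listed in the file are added to every ifort command line.
printf '%s\n' '-fp-model source' > "$cfg_file"

# Tell ifort to read this file instead of the default ifort.cfg.
export IFORTCFG="$cfg_file"

cat "$IFORTCFG"
```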
The solution to your OS not using all your RAM, and compilers defaulting to x87, of course, is to switch to the x86-64 OS, if you don't mind the discrepancies between Ubuntu and normal x86-64. The current Intel compilers have solved the installation problems for recent Ubuntu distros.
I restarted my computer, set MKL_NUM_THREADS=4 and OMP_NUM_THREADS=4, used the -fp-model source thingy, and it worked.
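For anyone finding this later, the thread settings amount to something like the following, assuming a Bourne-style shell; the value 4 matches the four cores of the machine described earlier.

```shell
# Fix the number of threads used by MKL and by OpenMP.
export MKL_NUM_THREADS=4
export OMP_NUM_THREADS=4

echo "MKL_NUM_THREADS=$MKL_NUM_THREADS OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Then build with -fp-model source added to the ifort line from the first post:
#   ifort -fp-model source FILENAME.f90 <MKL link options> && ./a.out
```

The exports only apply to the current shell and the programs started from it, so they have to be set again in every new terminal.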
Now my only question (unrelated, but nevertheless): my LAPACK already uses 100% of the processor. Will there be any speedup if I ruin my life and convert the code to ScaLAPACK?