I have a code that works well in the debug build but gives a segmentation fault in the optimized build. I compiled the optimized build with:
-ipo -O3 -r8 -c -fPIC -save
After quite a lot of debugging, it seems most likely to be due to a bug in the compiler. I can't debug any further since the part of the code that triggers the error was not written by me. Also, a newer version of the Intel compiler works fine with the optimized build. However, the cluster I'm using is stuck with the old version.
I realised that if I use -O1 instead of -O3, and no -ipo, for the 2 problematic source files (out of about 10 to 12 in total), it works well. At least it's partially optimized.
So my question is whether it is OK to mix -O1 and -O3 for different source files and build them together. The only differences for those two files are -O1 instead of -O3, and no -ipo. Will it cause any problems? Also, in that case, is it still useful to use -ipo for the other files?
Can I also use profile-guided optimization (PGO) to further optimize the code?
Varying optimization levels among source files is entirely normal. If you have a problem with the more aggressive optimizations, you would limit them to performance-critical code that you are able to test sufficiently.
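As a rough illustration of mixing levels per file, the sketch below picks flags by file name and prints the compile commands instead of running them. The file names solver.f90, mesh.f90 and main.f90 are hypothetical stand-ins, and ifort is assumed as the compiler driver because -r8 and -save are Fortran options; the flags themselves are the ones quoted in the question.

```shell
#!/bin/sh
# Sketch: choose optimization flags per source file.
# solver.f90 and mesh.f90 are hypothetical names for the two files that
# crash at -O3; every other file keeps the original aggressive flags.

COMMON_FLAGS="-r8 -c -fPIC -save"

flags_for() {
    case "$1" in
        solver.f90|mesh.f90)
            # Problematic files: lower level, no interprocedural optimization.
            echo "-O1 $COMMON_FLAGS" ;;
        *)
            echo "-ipo -O3 $COMMON_FLAGS" ;;
    esac
}

# Dry run: print the commands rather than invoking the compiler.
# The compiler driver name (ifort) is an assumption.
print_compile_commands() {
    for src in "$@"; do
        echo "ifort $(flags_for "$src") $src"
    done
}

print_compile_commands main.f90 solver.f90 mesh.f90
```

To actually compile, the echo inside print_compile_commands would be replaced by the command itself. The same idea maps directly onto per-target flag overrides in a makefile.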
So I can use my workstation to compile with -ip -O3 -ipo to get the object files.
Then I move them to the cluster to link?
But the two machines run different Linux distributions, with different locations for the library files etc. Can it work?
When you link, you may require the libraries furnished with the same major version of the compiler that generated the object files.
The executable you create on your workstation should run on the cluster if the cluster's glibc version is the same as (or newer than) the workstation's.
You may need to take along the Intel .so libraries where you use dynamic linking.
The requirements are:
a) the program has the same bitness as the target system
b) the instruction set generated is compatible with the target machine
c) any libraries are compatible and present (e.g. .so or .dll)
d) the directory containing the .so or .dll is on the PATH and/or in the appropriate environment variable (e.g. LD_LIBRARY_PATH on Linux)
This is no different than what is required for any other application that is distributed as executable or binary.
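Requirements (c) and (d) can be checked on the target machine with ldd, which lists each shared library an executable needs and marks unresolved ones as "not found". The sketch below filters that output; the helper name missing_libs and the sample library lines are made up for illustration, not captured from a real run.

```shell
#!/bin/sh
# Sketch: report shared libraries that the dynamic loader cannot resolve.
# missing_libs is a hypothetical helper; it reads ldd-style output on stdin
# and prints the name of every library marked "not found".

missing_libs() {
    awk '/not found/ { print $1 }'
}

# Illustrative input in the format ldd prints (invented sample data).
sample_ldd_output() {
    printf '%s\n' \
        'libifcore.so.5 => not found' \
        'libm.so.6 => /lib64/libm.so.6 (0x00007f0000000000)' \
        'libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)'
}

sample_ldd_output | missing_libs
```

On the cluster this would be used as: ldd ./myprog | missing_libs (myprog is a placeholder name). Any library it prints either has to be copied over or have its directory added to LD_LIBRARY_PATH. For (a), running file ./myprog and comparing with uname -m on the cluster is usually enough.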
If your application is statically linked against a library that exists on the cluster, and that library is not part of a redistributable package that can be copied to, or referenced from, the workstation, then produce the .o (.obj) files on the workstation and link on the cluster.
I tried to use profile-guided optimization (PGO) to further optimize my code. However, I got an error similar to the one before (when I used -O2/-O3 instead of -O1). Does that mean I can't use PGO for further optimization?