OpenMP + Optimization (/O2 or /O3) causes floating overflow

Jauch · ‎03-27-2014

Hi,

I have a strange (to me) problem in a big project that has OpenMP enabled by default.
In this project, I have 2 modules, A and B, where module A uses module B.

Module B does not have any OpenMP directives, nor is auto-parallelization enabled.
Inside Module A, the subroutines that call Module B also do not have any OpenMP directives, nor do any of their calling subroutines.

When I compile the code and disable OpenMP, OR disable optimizations, OR disable both, only for ModuleB, everything goes fine.
If I compile ModuleB with both optimization (/O2 or /O3) and OpenMP, the run crashes after some time with a floating-point overflow.

Could you guide me to what kind of tests I should do in order to find the problem?

I'm using the latest Intel Fortran.

Best regards,
Eduardo

jimdempseyatthecove · ‎03-30-2014

Additional hint.

To quickly test the hypothesis of the dummy argument pointing to a temporary array:

After the call that sets up the Concentration3D array pointer you can insert a test to verify the pointer is valid.

For example, if the pointer is supposed to point to a slice of a larger array you can test to see if the resultant Concentration3D array (entire array) falls within the known possible positions.
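As a rough illustration of that kind of range test, here is a minimal sketch in Python using ctypes addresses. The names big_array, true_slice, temp_copy and lies_within are hypothetical stand-ins, not anything from the actual project. In Fortran the addresses would have to come from something like the LOC intrinsic, and that mapping is an assumption here.

```python
import ctypes

def lies_within(child, parent):
    """Return True if child's storage is entirely inside parent's storage."""
    parent_start = ctypes.addressof(parent)
    parent_end = parent_start + ctypes.sizeof(parent)
    child_start = ctypes.addressof(child)
    child_end = child_start + ctypes.sizeof(child)
    return parent_start <= child_start and child_end <= parent_end

# Hypothetical stand-in for the larger array that owns the data.
big_array = (ctypes.c_double * 100)()

# A genuine slice: elements 20..29, sharing memory with big_array.
offset_bytes = 20 * ctypes.sizeof(ctypes.c_double)
true_slice = (ctypes.c_double * 10).from_buffer(big_array, offset_bytes)

# A temporary copy of the same elements, living in separate storage.
temp_copy = (ctypes.c_double * 10)(*big_array[20:30])

print(lies_within(true_slice, big_array))  # True
print(lies_within(temp_copy, big_array))   # False
```

If the check fails for an array that is supposed to be a slice, that suggests the callee is working on a temporary rather than on the original data.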

Jim Dempsey

Jauch · ‎04-01-2014

Hello,

After 3 days of digging into what was happening, I found what seems to be the probable cause of the problem.

The model was running with a parameter that made a certain algorithm too sensitive to very small changes in the Concentration3D values. This algorithm uses the concentration to calculate coefficients, which are later used to calculate new concentrations. Because very small variations in the concentration values caused big changes in the coefficients, at some point the counteracting effect of other calculations lost its effectiveness, and this created a vicious circle with a growing instability that led, in the end, to the floating overflow.
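To show how a difference in the last bit can end in an overflow, here is a toy sketch in Python. It is not the model's algorithm: the update rule, run_feedback, c_eq and gain are all made up for illustration. The only point is that an unstable feedback loop amplifies a one-bit perturbation, of the kind a different optimization level might introduce, until the value overflows.

```python
import math

def run_feedback(c0, steps, c_eq=1.0, gain=1.0):
    """Toy unstable feedback: each step pushes c further away from c_eq."""
    c = c0
    for _ in range(steps):
        c = c + gain * (c - c_eq)
    return c

# Starting exactly at equilibrium, nothing ever changes.
balanced = run_feedback(1.0, 1200)

# Starting one unit in the last place away, the deviation doubles every step.
perturbed = run_feedback(1.0 + 2.0**-52, 1200)

print(balanced)   # 1.0
print(perturbed)  # inf
```

Python quietly produces inf here. A Fortran executable with floating-point exception trapping enabled would stop with an overflow error instead.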

I still suspect that we have some kind of problem in the code, because a simple print in a routine (a routine that did not change the Concentration3D values) could stop this behavior. So, I'll continue to investigate.

In fact, I was able to find two small bugs (not related to my problem) during this investigation.
I would like to thank everybody who made the effort to help.

If I find the problem and it is something in our code, I will come back here to tell.

If it is something with the compiler (less probable), I'll create a schematic test case and send it to support (and let you know here).

Thanks,

Eduardo

TimP · ‎04-01-2014

If you have numerical issues, including those affected by adding prints, you might consider including /assume:protect_parens /Qprec-div /Qprec-sqrt or even /fp:source if this doesn't slow it down. As far as I'm concerned, protect_parens should be used always. The option /Qftz is a default on account of the behavior of CPUs prior to corei7-3; there should be no need to use it for performance on the last 2 CPU generations.
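As a small illustration of why evaluation order matters, here is a sketch in Python with values chosen for the purpose. Python always evaluates an expression as written, so the two orderings are spelled out by hand. An optimizing compiler that is allowed to reassociate might pick either one, which is the freedom an option like protect_parens is meant to restrict.

```python
# Floating-point addition is not associative.
a, b, c = 1.0e20, -1.0e20, 1.0

as_written = (a + b) + c    # a and b cancel first, then c is added
reassociated = a + (b + c)  # c is absorbed into b before the cancellation

print(as_written)    # 1.0
print(reassociated)  # 0.0
```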

Such sensitivities may also indicate a need for algorithmic investigation, as you suggested, or promotion of critical single precision calculations to double.
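A minimal sketch of what promotion to double can buy, using Python's struct module to round values to single precision. The helper to_single and the chosen increment are illustrative assumptions. An increment that is lost completely in single precision still registers in double.

```python
import struct

def to_single(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

increment = 1.0e-8

# Double precision: the small increment changes the result.
in_double = 1.0 + increment

# Single precision: the increment is below half a unit in the last place of 1.0.
in_single = to_single(to_single(1.0) + to_single(increment))

print(in_double == 1.0)  # False
print(in_single == 1.0)  # True
```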

If you have source code bugs, sometimes they will be triggered in the absence of prints by optimizations such as /Qip. I'm not advocating removal of that optimization except as a step toward diagnosing a possible bug.