Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

What does "Floating Point Stack Check" mean?

ferrad
New User
3,481 views
My code was merrily running along for a few minutes, then suddenly stopped in the debugger with a "Floating Point Stack Check", and took me to the offending line, which is a calculation:
Code:
          new_flux(isol) =
     .        d1_curr*h*(d0*(temp_res-temperature(istage,ifluid))-sume)/
     .           (d0*d1_curr + h*(d1_curr*sqrt(delt_loc) + d0*delt_loc))


Is the line too complex? Why only now after happily executing it hundreds of times before?
Adrian
12 Replies
ferrad
New User
I tried breaking this calc up onto two lines, then it just stopped again later on another line with the same error.
Adrian
TimP
Honored Contributor III
If you are using a compilation for x87 code (e.g. ifort 32-bit with no SSE options, such as -QxW, or maybe when using -Op or -fltconsistency), it looks like the compiler has exceeded the stack register size (8 items on stack). This would be likely to be a compiler bug, as the compiler should be tracking stack size. If so, reducing optimization level might well get past it. It used to be a common bug in gcc, when using mathinline functions. If you are using a current compiler, you might file a bug report.
Better alternatives to -Op and -fltconsistency are coming.
If my guesses above are off the mark, please give some of the missing information.
Steven_L_Intel1
Employee
The usual cause of this problem in Fortran code is calling a function which returns a real but the caller thinks it returns an integer, or vice versa. The effect of this error may not be apparent at the point of the call; it may show up later.

Check all your function declarations carefully. Try building with /gen_interfaces /warn:interfaces and see if the compiler alerts you to a mismatch.
ferrad
New User

Steve,

I added /gen_interfaces /warn:interfaces to the compilation flags. It compiles a few files, then gets stuck on one file. On checking Task Manager, I see fortcom.exe is taking up 99% of the CPU time.

Adrian

Steven_L_Intel1
Employee
Yeah, I was afraid of that. Though we've fixed some similar problems with that feature, I still see that happen in some cases. Please submit a test case of that to Intel Premier Support.

So you're back to checking the declarations manually. You could always compile with /warn:declarations, which forces you to explicitly declare everything.
ferrad
New User

The Fortran DLL it is stopping in is being called from a new C DLL. Is it possible the cause could be in the C DLL, or do I need to look in the Fortran DLL? The reason I ask is that the C DLL is new, and the Fortran code has been running fine without it.

Adrian

Steven_L_Intel1
Employee
Unlikely - it's harder to make that mistake in C.

Can you rebuild the C DLL and link against the C debug libraries? They have a feature for detecting this as it happens.
ferrad
New User
Yes, I have built the C DLL with full debug (/Z7). Can I increase the size of this "Floating-Point Stack", if that's what the problem is?
Adrian
Steven_L_Intel1
Employee
No, the problem is not a stack overflow in the traditional sense.

The x87 floating-point unit has an internal stack where values get pushed on and then popped off as operations are done. As long as the pushes and pops match, you're fine. If you try to pop something and the stack is empty, you get an FP stack check.

When you call a function that returns a float, the return value is left on the stack when the function returns. The caller is supposed to pop it off. If you call a function where the caller thinks the function is REAL but the function itself is INTEGER (hence nothing on the FP stack), error. You can also get an error later if a value is left hanging on the FP stack.

Resolving the type inconsistencies is the only fix.
ferrad
New User

For the record, I fixed this eventually. As I thought, it was in the C code which calls the Fortran DLL. I found that if I called a certain function in C, it failed in the Fortran code after the 11th call to that C function. And if I didn't call the function, it never failed in the Fortran code.

On diagnosing that C function, I tracked it down to the use of the sqrt() function. It should be innocuous, but math.h was not included. Why it didn't fail with an unresolved external when linking the C DLL I've no idea, but I guess it thought it had a sqrt function somewhere it could use. The one it used (who knows where it found one) I guess pushed and popped the floating-point stack just badly enough to fail after the 11th call.

I included math.h, and all works just fine.

Adrian

Steven_L_Intel1
Employee
You must have disabled the usual warnings about undeclared functions in C. I don't know what C assumes for such things, but it certainly wasn't that it returns a float. You would have found the normal libm version of sqrt which returned a float but never got it popped off (nor would the return value be correct.)
Glad to hear that you found it.
Dishaw__Jim
Beginner
By default C assumes that all functions will return an integer--that would match your analysis of the problem. If there is no return statement in a function, many (if not all) compilers will push a 0 on the stack so that there is a value to return. This can lead to all sorts of nasty bugs.



Not including math.h would not cause the linker to fail. The #include directive is a preprocessor directive. The math.h header declares the data structures and function prototypes for the math routines (like sqrt, sin, etc). Without that header, the compiler would have assumed that sqrt returns an int and takes an unspecified number of arguments, while the linker would still have resolved the call to the real sqrt in the math library.

Message Edited by jamesd42 on 02-15-2006 06:05 PM
