This was a mind-boggling thing to debug. The code went something like this:
IF (A) THEN
C1=9
ELSE
! C1 is never assigned on this branch
ENDIF
D1=C1*2
Compiled with COMPAQ/CONSOLE, it would run to completion. Compiled with COMPAQ/DLL, it would crash and give me a stack trace of UNKNOWN/KERNEL32.DLL.
Compiled with IVF/CONSOLE, it crashed. The console output gave me a usable stack trace, but the debugger's own stack trace was messed up. (It stopped in a subroutine completely unrelated to where it had crashed). Even trying to step through (dis)assembly code, I couldn't get an indication that the error was an uninitialized local variable. The console stack trace said "floating overflow".
Is it possible to get a better run-time crash for uninitialized local variables? If not, what are some recommendations on detecting these types of issues before shipping off the code?
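One way to make this class of bug fail loudly, independent of compiler options, is to pre-load the variable with a sentinel and test for it before use. The sketch below is not from the thread: it reuses the names A, C1 and D1 from the snippet above, picks a quiet NaN as the sentinel, and assumes the compiler provides the Fortran 2003 `ieee_arithmetic` module with NaN support.

```fortran
program uninit_sentinel
  use, intrinsic :: ieee_arithmetic
  implicit none
  logical :: a
  real :: c1, d1

  a = .false.

  ! Sentinel (an assumption of this sketch): a quiet NaN means "never assigned"
  c1 = ieee_value(1.0, ieee_quiet_nan)

  if (a) then
     c1 = 9.0
  else
     ! c1 is never assigned on this branch
  end if

  ! Fail at the point of use instead of somewhere unrelated later on
  if (ieee_is_nan(c1)) then
     print *, 'c1 used before assignment'
     stop 1
  end if

  d1 = c1*2
  print *, d1
end program uninit_sentinel
```

With `a` set to `.false.` the program stops at the check rather than propagating a garbage value into `d1`.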
KERNEL32! 7c812aeb()
MSVBVM60! 734f0c29()
MSVBVM60! 734ee082()
Can you show us a real (but short if possible) program that demonstrates the behavior you saw?
The part of code that is responsible for the crash is executed probably about 100,000 times in a given day, and I haven't seen this failure before.
The Fortran code goes like this; TIMD is a double-precision variable, and its value gets assigned to TIM. IVF compiles it so:
TIM=TIMD
007AFAAC fld qword ptr [CTIME2 (11FD9E0h)]
007AFAB2 fstp dword ptr [ebp-68h]
007AFAB5 movss xmm0,dword ptr [ebp-68h]
007AFABA movss dword ptr [TIM (0F3A5A8h)],xmm0
CVF compiles it so:
673: TIM=TIMD
025E11A8 fld qword ptr [_TIME2 (02974bb0)]
025E11AE wait
025E11AF fstp dword ptr [ebp-1E8h]
025E11B5 wait
025E11B6 mov edx,dword ptr [ebp-1E8h]
025E11BC mov dword ptr [_TIMING+28h (02a308c8)],edx
Now, I have a VB6 executable that iteratively calls the Fortran DLL and updates the text box with the current time step, TIM.
Before the call that results in Fortran setting TIM=27927.20 (or some other value, sequence-dependent), the VB6 WATCH window will show:
Expression Value Type
TIMENOW 27926.77 Single
After the call to the Fortran DLL (in the VB6 debugger), if I hit "go", I get the crash. If I step through each line, the VB6 WATCH window shows:
Expression Value Type
TIMENOW
Stepping through lines such as Text1.text = Str(TIMENOW / TIME_CONV) ensures that I do not crash in the debugger, and TIMENOW is re-set to a type Single.
If I set breakpoints at the top and the bottom of the Fortran code, I get a crash after the "bottom" breakpoint, but before the "top" breakpoint, proving to me that it's crashing in the VB6 executable.
All good ideas...but here goes the "why".
Basically the few lines of code that cause trouble look like this:
program main
real a,b,c
a=1e-40
b=1e+10
! in the real code, some other conditions must also be true
if (a /= 0.0) then
c=b/a
print *,c
end if
end
We compile our code with
Underflow gives 0.0; Abort on other IEEE exceptions (/fpe:0)
So, sure enough, the Console version re-sets the variable a to 0 because it's a de-normal, and we never end up performing the c=b/a line.
But the VB driven DLL version ignores the /fpe:0 specification...so it ends up running the c=b/a line, and then c becomes Infinity, and it's fed into various sqrt and log functions...
Is there a way to force the vb.net or vb6 executable that's driving the Fortran DLL to respect the /fpe:0 specification, and actually flush underflow to 0, and crash on NaN and Inf values?
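Independently of how the host executable sets up the floating-point unit, the guard can be made explicit in the Fortran source. The following is a minimal sketch, not something proposed in the thread: it reuses a, b and c from the program above and uses the standard `tiny()` intrinsic (the smallest positive normalized value of that kind) to flush a denormal input by hand before the test.

```fortran
program flush_guard
  implicit none
  real :: a, b, c

  a = 1e-40   ! below tiny(a) (about 1.18e-38), so a denormal in single precision
  b = 1e+10

  ! Manual flush: treat anything smaller than the smallest normal number as zero
  if (abs(a) < tiny(a)) a = 0.0

  if (a /= 0.0) then
     c = b/a
     print *, c
  else
     print *, 'a flushed to zero, division skipped'
  end if
end program flush_guard
```

Because the comparison against `tiny(a)` is written out in the source, the division is skipped whether or not the calling process has flush-to-zero enabled.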
Just to make sure - with /fpe-all:0, I need to set /Qftz in the project as well to flush under-flow values to 0? (In other words, is it a feature, or a bug?)
Terminology is a bit confusing, does "flush denormals to 0" mean "flush underflow floating values to 0?"
A denormal number has all 0s in the exponent field and a nonzero fraction, and represents the numeric value
0.fraction x 2**(-bias + 1)
For single precision, the fraction is the 23 bits of the mantissa and the bias is 127.
Normalized numbers are
1.fraction x 2**(exponent-bias)
For single precision, the exponent field is 8 bits wide; normalized numbers use stored values in the range 1:254, which together with the bias give both positive and negative exponents (0 is reserved for zero and denormals, 255 for Inf and NaN).
Denormals can represent numbers smaller than the smallest normalized number, that is, numbers that have not yet produced an underflow.
Underflows are results where none of the fraction bits can be encoded into the storage format (single or double, as the case may be).
"Flush denormals to 0" does not mean "flush underflows to 0".
You might consider it as "flush near-underflows to 0".
Jim Dempsey
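The field layout described above can be inspected directly. The sketch below is only an illustration: the program name and the test value 1.0e-40 (taken from the earlier example in this thread) are choices made here, and it assumes `real32` maps to IEEE single precision. It copies the bit pattern into an integer with `transfer` and extracts the two fields with `ibits`.

```fortran
program decode_single
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  real(real32) :: x
  integer(int32) :: bits, expo, frac

  x = 1.0e-40_real32            ! smaller than the smallest normalized single

  bits = transfer(x, bits)      ! reinterpret the 32 bits, no numeric conversion
  expo = ibits(bits, 23, 8)     ! 8-bit biased exponent field
  frac = ibits(bits, 0, 23)     ! 23-bit fraction field

  print *, 'exponent field =', expo
  print *, 'fraction field =', frac

  ! All-zero exponent with a nonzero fraction is the denormal encoding
  if (expo == 0 .and. frac /= 0) print *, 'x is a denormal'
end program decode_single
```

A normalized input such as 1.0 would instead show a nonzero exponent field (127 for 1.0, since the bias is 127).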
This is correct, I said nothing to contradict this statement.
When an internal computation (in the FPU or SSE unit) generates an underflow, the resultant number can be represented by neither a normalized nor a denormalized value. When the result can be represented by a denormalized value but not a normalized one, and FTZ is in effect, the result is truncated to 0 instead of being returned as a denormalized value. The actual computation did not create an underflow, but the user chose not to see denormalized results.
Whether this counts as an underflow becomes a question of semantics: it depends on the position from which you observe the calculation.
Suppose you have a function that returns 3 digits of precision; you could then claim underflow when a nonzero internal result is forced to return 0. That is, the view from outside the function is an underflow, while the view from inside the function is a non-underflow condition.
Jim
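The three-digit analogy can be made concrete with a small sketch. The helper `round3` is hypothetical, and "3 digits of precision" is interpreted here as three decimal places, which is an assumption about what was meant.

```fortran
program three_digit_view
  implicit none
  real :: inside, outside

  inside = 0.0004              ! nonzero result as seen inside the function
  outside = round3(inside)     ! what the caller gets back

  print *, 'inside  =', inside
  print *, 'outside =', outside
  if (inside /= 0.0 .and. outside == 0.0) then
     print *, 'nonzero inside, zero outside: underflow from the caller''s view'
  end if

contains

  ! Hypothetical helper: keep three decimal places
  real function round3(x)
    real, intent(in) :: x
    round3 = real(nint(x*1000.0))/1000.0
  end function round3

end program three_digit_view
```

The internal value is perfectly representable, yet the caller only ever sees 0, which mirrors a denormal result being replaced by 0 under FTZ.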
