My code is mostly REAL*8, but one section needs more precision, so I declared some variables:
REAL(16) YHF,XHF,DF,TF,RLF,ROPD,DFACT,COPD,TEMPQ,XQF,YQF,QOPD
and use those in one section:
      IF (ABS(TH) .GT. 1.0E9) THEN ! FOR LONG CONJUGATE
         TEMP = ABS(TH)
         IF (IOPD .LT. 4) DFACT = SFLAGS(176)/TEMP
         YHF = YHEIGHT*DFACT
         XHF = XHEIGHT*DFACT
         DF = D*DFACT
         TF = DF + ZS
         XQF = XARG
         YQF = YARG
         TEMPQ = (XHF - XQF)**2 + (YHF - YQF)**2 + TF**2
         RLF = SQRT(TEMPQ)
         ROPD = RLF*RINDEX(ICOL,0,JCONF)*NSPACE(0,JCONF)
         ZSEP = 0.0D0
         XSOLD = XARG
         YSOLD = YARG
      ELSE
         ROPD = RL*RINDEX(ICOL,0,JCONF)*NSPACE(0,JCONF) ! OBJECT CLOSE ENOUGH TO SUBTRACT
      ENDIF
All other variables are *8 or integer. The answers come back correct, and I can draw a picture in my window with the right data. The problem comes when I try to print that window, which calls an MFC routine from a C++ subroutine: int i = DoPreparePrinting(pInfo);. Then the whole job crashes, but in release mode only. Debug mode works just fine. If ABS(TH) is less than 10**9, the IF clause is not executed and nothing crashes. Usually it is greater, so this has to work.
Since I can't debug it, I have to ask whether there is some quirky error flag that gets set in the Fortran section related to the quad precision and later aborts the print job. What might that be? How can I make the above code fool-proof?
It is unsafe to state conclusions based on code fragments, but it appears that if IOPD.GE.4 the variable DFACT is used without being defined, unless it has been assigned a value elsewhere in code that you have not shown.
I have reformatted your code to better understand the calculation presented.
[fortran]
REAL(16) YHF_q,XHF_q,DF_q,TF_q,RLF_q,ROPD_q,DFACT_q,COPD_q,TEMP_q,XQF_q,YQF_q,QOPD_q
IF (ABS(TH) .GT. 1.0E9) THEN ! FOR LONG CONJUGATE
!
   IF (IOPD .LT. 4) then
      TEMP = ABS (TH)
      DFACT_q = SFLAGS(176)/TEMP
   else
!     DFACT_q = ???   (no value is assigned on this path in the code shown)
   end if
!
   YHF_q = YHEIGHT * DFACT_q
   XHF_q = XHEIGHT * DFACT_q
   DF_q = D * DFACT_q
   TF_q = DF_q + ZS
   XQF_q = XARG
   YQF_q = YARG
   TEMP_q = (XHF_q - XQF_q)**2 + (YHF_q - YQF_q)**2 + TF_q**2
   RLF_q = SQRT (TEMP_q)
!
   ROPD = RLF_q * RINDEX(ICOL,0,JCONF) * NSPACE(0,JCONF)
   ZSEP = 0.0D0
   XSOLD = XARG
   YSOLD = YARG
ELSE
   ROPD = RL * RINDEX(ICOL,0,JCONF) * NSPACE(0,JCONF) ! OBJECT CLOSE ENOUGH TO SUBTRACT
ENDIF
!
! Does ROPD_q need to be REAL(16) ?
[/fortran]
I would suspect that the key line for precision is "TEMP_q = (XHF_q - XQF_q)**2 + (YHF_q - YQF_q)**2 + TF_q**2", so it is important to maximise precision when adding the two squared differences. The validity of your approach is still limited by the accuracy of the five variables used. As they are all sourced from real(8) variables, promoting them to real(16) cannot restore digits they never had, so you might find that the result does not achieve what you expect.
John
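To illustrate the point about real(8) inputs, here is a minimal C++ sketch. Standard C++ has no portable 128-bit type, so it uses float widened to double as a stand-in for real(8) widened to real(16). The function names are made up for this example.

```cpp
#include <cmath>

// Widen a single-precision value to double. The conversion is exact, so the
// result carries the float's rounding error, not a better approximation.
double widen(float x) {
    return static_cast<double>(x);
}

// Distance between the widened float 0.1f and 0.1 computed directly in double.
double widening_error() {
    const float narrow = 0.1f;   // rounded to about 7 significant digits
    const double direct = 0.1;   // rounded to about 16 significant digits
    return std::fabs(widen(narrow) - direct);
}
```

The widened value differs from the directly computed one at about the ninth digit. By the same reasoning, a real(16) variable assigned from a real(8) value is only as accurate as the real(8) value was.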
I appreciate the prompt (as always!) response to my question. Here are some more facts:
This section of code is entered twice. IOPD is 2 the first time, and DFACT is always initialized. Second time it is 4 and I want to subtract the values of ROPD found on the two trips. The differences in XHF and XQF and so on do not require *16 precision since the values are small and the differences large. But ROPD can be 10**20, with differences in the two trips of the order of unity. So here is where the precision is needed.
The numeric accuracy is just fine, with roundoff way out where I don't care about it. The issue is not accuracy but rather why the program crashes the minute I try to print a window. The printing is handled by C++ code, and requires going through the loops again. But the printing never crashes if I do not enter this section of Fortran code. I find this totally weird, and suspect a compiler glitch that sets an error flag somewhere that gets tested by the printer driver. Is this possible?
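As a rough check of why double precision cannot resolve differences of order unity at a magnitude of 10**20, the sketch below measures the gap between neighbouring doubles at that size. The helper names are invented for illustration and are not part of the program discussed here.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
double spacing_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

// Adds delta to big and subtracts big again, all in double precision.
// The volatile store forces the intermediate sum to be rounded to double.
double naive_difference(double big, double delta) {
    volatile double shifted = big + delta;
    return shifted - big;
}
```

With IEEE 754 doubles, the spacing at 1e20 is 16384, so adding 1.0 to 1e20 and subtracting 1e20 again gives 0. Any change smaller than about half that spacing is lost.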
So does the data you are sending to DoPreparePrinting look OK if you dump it to a temp file? Or, more specifically, have you examined the contents of the CPrintInfo data structure for the crash and non-crash cases? Some simple corruption from an unrelated issue could be the cause.
We do some quirky stuff, but nope, we don't set an error flag that is tested by the printer driver.
I would be more suspicious of a calling-standard mismatch. That can affect the stack in ways that don't show themselves until further downstream, and then cause all manner of bizarre failures.
How is DoPreparePrinting declared in the C program? In the Fortran source? Are you using /iface:cvf in your build? Are you using other /iface:XXX switches in your build?
Finally, I would double- and triple-check that the same command-line options are being used in both the debug and release builds (other than /check:XX, /debug, or /optimize, which are quite debug-specific).
--Lorri
>>The printing is handled by C++ code
You say printing, would this by chance be print formatting to a string buffer supplied by your Fortran call?
Jim Dempsey
Lorri:
These are all great questions. Here is the code that crashes:
BOOL CPadView::OnPreparePrinting(CPrintInfo* pInfo)
{
    breathe();
    isPrinting = TRUE;
    pInfo->SetMaxPage( 1 );
    pInfo->m_nNumPreviewPages = 1;
    PRINTERBUGFIX(); // in Win95.for
    int i = DoPreparePrinting(pInfo);
    if ( i == 0 ) isPrinting = FALSE;
    return (i);
}
The call to PRINTERBUGFIX is supposed to cure crashes. It says
      SUBROUTINE PRINTERBUGFIX
      USE DFLIB
      INTEGER*2 FLAG
      FLAG = (.NOT. FPE$UNDERFLOW) ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
      FLAG = FLAG + (.NOT. FPE$ZERODIVIDE)
C     FLAG = FLAG + (.NOT. FPE$INVALID) ! THIS TOTALLY SCREWS UP!
      FLAG = FLAG + (.NOT. FPE$DENORMAL)
      FLAG = FLAG + (.NOT. FPE$OVERFLOW)
      FLAG = FLAG + (.NOT. FPE$INEXACT)
      CALL SETCONTROLFPQQ( FLAG ) ! IS MONSTER BUG IN WINDOWS, FORTRAN, AND PRINT DRIVERS
      RETURN
      END
(The printer has frequently crashed if I don't call it.) Until I started using *16 variables, this fix worked.
I can't examine the pInfo structure in crash mode because that's the release version, not debug. Here is the command line in release mode:
/nologo /MP /O3 /Ob0 /Oy- /Qipo /I"Release/" /reentrancy:none /extend_source:132 /Qopenmp /Qdiag-error-limit:30000 /Qdiag-file:"Release\SYNOPSYS200_lib_.diag" /Qauto /align:rec4byte /align:commons /assume:byterecl /Qzero /fpe:1 /Qfp_port /fpconstant /Qftz /iface:cvf /module:"Release/" /object:"Release/" /check:none /libs:dll /threads /winapp /c
In debug mode it's
/nologo /debug:full /MP /Od /I"Debug/" /reentrancy:none /extend_source:132 /Qopenmp /Qopenmp-report1 /warn:unused /warn:truncated_source /Qauto /align:rec4byte /align:commons /assume:byterecl /Qtrapuv /Qzero /fpe:1 /Qfp_port /fpconstant /Qftz /iface:cvf /module:"Debug/" /object:"Debug/" /traceback /check:all /libs:dll /threads /winapp /c
The C++ command line in release mode looks like
/Zi /nologo /W2 /WX- /MP /O2 /Ot /Oy- /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "_VC80_UPGRADE=0x0600" /GF /Gm- /EHsc /MT /GS /Gy- /fp:precise /Zc:wchar_t /Zc:forScope /GR /openmp /Fp".\Release\SYNOPSYS200.pch" /Fa".\Release\" /Fo".\Release\" /Fd".\Release\" /FR".\Release\" /Gd /analyze- /errorReport:queue
and in debug mode it's
/ZI /nologo /W1 /WX- /MP /Od /Ot /Oy- /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /D "_VC80_UPGRADE=0x0600" /Gm /EHsc /RTCu /MTd /GS /Gy- /fp:strict /Zc:wchar_t /Zc:forScope /GR /openmp /Fp".\Debug\SYNOPSYS200.pch" /Fa".\Debug\" /Fo".\Debug\" /Fd".\Debug\" /FR".\Debug\" /Gd /analyze- /errorReport:queue
See anything suspicious?
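For comparison, the idea behind PRINTERBUGFIX (keeping floating-point exceptions from trapping inside the print path) can be sketched in portable C++ with the standard <cfenv> functions. This is only an analogue under stated assumptions: the wrapper name is hypothetical, it does not call SETCONTROLFPQQ, and nothing in this thread shows that it would cure the crash.

```cpp
#include <cfenv>

// Runs fn with floating-point exceptions in non-stop (non-trapping) mode,
// then restores the caller's floating-point environment as it was on entry.
// Returns false if the environment could not be saved or restored.
bool call_with_fp_exceptions_masked(void (*fn)()) {
    std::fenv_t saved;
    if (std::feholdexcept(&saved) != 0) {  // save env, clear flags, stop trapping
        return false;
    }
    fn();
    return std::fesetenv(&saved) == 0;     // drop any flags raised inside fn
}
```

One reason to prefer a save-and-restore pattern is that the rest of the program keeps whatever control settings it had before the call, instead of inheriting a control word that was overwritten once and never put back.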
Don, do you depend on DFACT_q retaining its value? If so, and the code fragment is taken from a subprogram, you need to specify the SAVE attribute for the variable. Perhaps you have already done so, but we cannot tell that from your posts.
I do indeed use the SAVE attribute.
The printing is not so simple. The program has a huge OnDraw() that makes all kinds of graphics pictures. Those are rendered into memory with calls to MoveTo(), LineTo(), TextOut() and the like, and when the graphics are complete the whole business is shown onscreen with dc.BitBlt(). It all works as long as I don't use any *16 variables in the Fortran portion.
Even the print preview crashes -- so it doesn't look like a printer driver problem. (I assume that the preview is a Microsoft animal, not a product of the printer maker.) Here's the code I send to that:
void CPadView::OnFilePrintPreview()
{
    m_nMagnify = 0;
    m_bIsPanning = FALSE;
    isPrintPreview = TRUE;
    PRINTERBUGFIX();
    CView::OnFilePrintPreview();
    breathe();
}
The program sometimes crashes, sometimes works except none of my dialog windows will open, and that sort of thing. The combination of release mode and *16 seems to always fail in this way.
So you are writing graphics to an image buffer in memory and then sending that image (raster data) to the printer or screen which (sometimes) crashes. None of the graphic output functions will be using real(16) so the buffer and your real(16) code should be entirely unconnected.
This sounds like one of those classic cases of memory/stack/heap corruption, probably (but not necessarily) caused by a bug in your real(16) code, but it might be a bug elsewhere. Those sorts of problems are usually quite painful to fix; I have had a few recently.
I think we are getting close. Corruption for sure. Painful for sure. What do you suggest? (I posted my command-line arguments earlier this morning, but they have to wait for moderator approval, it seems, before you can see them.)
Here's a clue, maybe. Thinking that the *16 variables were corrupting the heap, I tried putting them in a labeled common block. Same crash. Does that narrow things down?
OK, I see your command line switches, and yes, I see something suspicious.
/iface:cvf
Your print routine does not have any arguments, so if the problem *is* /iface:cvf, it's not the call to PRINTERBUGFIX() that's causing the corruption.
Can you humor me for a minute please? The bigger snippet of code that you showed at the top of this chain - is it called from C? If so, what does the interface look like from C, and what is the declaration in Fortran?
Are any of the REAL(16) variables passed as arguments, or are they all local to the *one* Fortran routine?
The program starts in C, then my override of the run() loop calls a Fortran program that supervises everything. After that, C is used for I/O only. All of this was ported from CVF, so the iface directive is required, as far as I know. (Which is not very far, I admit!)
The *16 stuff exists only in a single subroutine, always called from Fortran. (There are other routines with it too, but those are used only for multithread work, which I'm not running here, so that doesn't count. I hope.)
The snippet you refer to is from a subroutine that is only called from Fortran.
I've tried changing compiler options, Fast, Strict, Source, and Precise. Same problem. I'll keep trying various combinations. It would be nice to narrow this down.
Oh, yes: the *16 variables are all local, not passed.
.NOT. is a logical (Boolean) operator, and you are trying to use "+" as a bitwise OR.
[fortran]
FLAG = NOT(FPE$UNDERFLOW+FPE$ZERODIVIDE+FPE$INVALID+FPE$DENORMAL+FPE$OVERFLOW+FPE$INEXACT) ! disallow
! or use
FLAG = FPE$UNDERFLOW+FPE$ZERODIVIDE+FPE$INVALID+FPE$DENORMAL+FPE$OVERFLOW+FPE$INEXACT ! allow
[/fortran]
Jim Dempsey
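The difference between adding complemented flags and complementing a combined mask can be checked with a small C++ sketch. The bit values below are hypothetical single-bit flags chosen for illustration (the real FPE$ constants come from the DFLIB module), and 16-bit wraparound is assumed for the sum.

```cpp
#include <cstdint>

// Hypothetical single-bit flag values, for illustration only.
constexpr std::uint16_t kUnderflow  = 0x0010;
constexpr std::uint16_t kZeroDivide = 0x0004;
constexpr std::uint16_t kDenormal   = 0x0002;

// Bitwise complement truncated back to 16 bits.
constexpr std::uint16_t complement(std::uint16_t bits) {
    return static_cast<std::uint16_t>(~bits);
}

// Pattern from the original routine: add the complements (wraps at 16 bits).
constexpr std::uint16_t summed_complements() {
    return static_cast<std::uint16_t>(
        static_cast<std::uint32_t>(complement(kUnderflow)) +
        complement(kZeroDivide) + complement(kDenormal));
}

// Complement of the OR-ed mask: every listed bit cleared, all others set.
constexpr std::uint16_t complement_of_union() {
    return complement(
        static_cast<std::uint16_t>(kUnderflow | kZeroDivide | kDenormal));
}

// Same result by De Morgan's law: AND the individual complements.
constexpr std::uint16_t and_of_complements() {
    return static_cast<std::uint16_t>(
        complement(kUnderflow) & complement(kZeroDivide) & complement(kDenormal));
}
```

With these values the summed form yields 0xFFE7, while both mask forms yield 0xFFE9. The carries produced by addition disturb bits that were never meant to change, which is why combining flags with + only works on the uncomplemented single-bit values.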
Can you show the C declaration and the Fortran declaration for the *first* routine then please?
The thought I'm going with is that REAL*16 is on the stack (rather than in registers), and that the calling convention has displaced the stack pointer, ultimately causing a crash.
Now, another option is to play with the options under RELEASE until they start looking more like DEBUG. For example, decrease the optimization level, or remove the /Oy- switch. Does that make sense?
--Lorri
You could write your real(16) as:
[fortran]
df_q = SFLAGS(176) / ABS (TH)
! df_q is factored out of every term, so XARG, YARG and ZS are all divided by it
dx_q = XHEIGHT - XARG / df_q
dy_q = YHEIGHT - YARG / df_q
tf_q = D + ZS / df_q
RLF_q = SQRT ( dx_q**2 + dy_q**2 + tf_q**2 ) * df_q
[/fortran]
It appears that df_q is very small (~10e-9) and tf_q is much larger than the others, due to ZS*TH.
I think that the real*16 is not causing the problem you are reporting, but merely an innocent bystander to other stack corruptions happening around it.
Could changing:
FLAG = (.NOT. FPE$UNDERFLOW) ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
FLAG = FLAG + (.NOT. FPE$ZERODIVIDE)
...
to
FLAG = NOT (FPE$UNDERFLOW) ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
FLAG = IAND (FLAG, NOT (FPE$ZERODIVIDE) )
...
make it conforming code?
I would not expect the original "+" to be the appropriate way to merge these flags.
I appreciate the cogent suggestions, and I agree that there is some corruption off in the distance. That will be hard to find, so I have reworked the math and have a new algorithm that does not require subtracting large numbers. So the *16 stuff is out. I'll let you know if it still fails.
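The reworked algorithm itself is not shown in this thread. As one standard way to avoid subtracting two large path lengths, the sketch below uses the identity sqrt(a) - sqrt(b) = (a - b) / (sqrt(a) + sqrt(b)). The geometry (two points sharing the same axial distance t) and all function names are assumptions made for illustration.

```cpp
#include <cmath>

// Length of the segment from the origin to (x, y, t).
double path_length(double x, double y, double t) {
    return std::sqrt(x * x + y * y + t * t);
}

// Direct subtraction of two long path lengths with the same axial distance t.
double difference_naive(double x1, double y1, double x2, double y2, double t) {
    return path_length(x1, y1, t) - path_length(x2, y2, t);
}

// Same quantity via sqrt(a) - sqrt(b) = (a - b) / (sqrt(a) + sqrt(b)).
// The t*t terms cancel algebraically in (a - b), so the huge values are
// never subtracted from each other. Assumes the denominator is nonzero.
double difference_stable(double x1, double y1, double x2, double y2, double t) {
    const double a_minus_b = (x1 - x2) * (x1 + x2) + (y1 - y2) * (y1 + y2);
    return a_minus_b / (path_length(x1, y1, t) + path_length(x2, y2, t));
}
```

For t = 1e20 and a lateral offset of sqrt(2)*1e10, the true difference is close to 1. The stable form recovers it in plain double precision, while the direct subtraction loses it entirely.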
