Intel® Fortran Compiler
REAL(16) question

dondilworth
New Contributor II

My code is mostly REAL*8, but one section needs more precision, so I declared some variables:

    REAL(16) YHF,XHF,DF,TF,RLF,ROPD,DFACT,COPD,TEMPQ,XQF,YQF,QOPD

and use those in one section:

            IF (ABS(TH) .GT. 1.0E9) THEN        ! FOR LONG CONJUGATE
                TEMP = ABS(TH)
                IF (IOPD .LT. 4) DFACT = SFLAGS(176)/TEMP
                YHF = YHEIGHT*DFACT
                XHF = XHEIGHT*DFACT
                DF = D*DFACT
                TF = DF + ZS
                XQF = XARG
                YQF = YARG
                TEMPQ = (XHF - XQF)**2 + (YHF - YQF)**2 + TF**2
                RLF = SQRT(TEMPQ) 
                ROPD = RLF*RINDEX(ICOL,0,JCONF)*NSPACE(0,JCONF)
                ZSEP = 0.0D0
                XSOLD = XARG
                YSOLD = YARG
            ELSE
                ROPD = RL*RINDEX(ICOL,0,JCONF)*NSPACE(0,JCONF)    ! OBJECT CLOSE ENOUGH TO SUBTRACT
            ENDIF

All other variables are REAL*8 or integer. The answers come back correct, and I can draw a picture in my window with the right data. The problem comes when I try to print that window, calling an MFC routine from a C++ subroutine with int i = DoPreparePrinting(pInfo);. Then the whole job crashes, in release mode only. Debug mode works just fine. If ABS(TH) is not greater than 10**9, the IF clause is not executed and nothing crashes. Usually it is greater, so this has to work.

Since I can't debug it, I have to ask whether there is some quirky error flag that gets set in the Fortran section related to the quad precision and later aborts the print job.  What might that be?  How can I make the above code fool-proof?

mecej4
Black Belt

It is unsafe to state conclusions based on code fragments, but it appears that if IOPD.GE.4 the variable DFACT is used without being defined, unless it has been assigned a value elsewhere in code that you have not shown.

John_Campbell
New Contributor II

I have reformatted your code to better understand the calculation presented.

[fortran]
    REAL(16) YHF_q,XHF_q,DF_q,TF_q,RLF_q,ROPD_q,DFACT_q,COPD_q,TEMP_q,XQF_q,YQF_q,QOPD_q

            IF (ABS(TH) .GT. 1.0E9) THEN        ! FOR LONG CONJUGATE
!
                IF (IOPD .LT. 4) then
                   TEMP    = ABS (TH)
                   DFACT_q = SFLAGS(176)/TEMP
                else
                   DFACT_q = ???
                end if
!
                YHF_q  = YHEIGHT * DFACT_q
                XHF_q  = XHEIGHT * DFACT_q
                DF_q   = D       * DFACT_q
                TF_q   = DF_q + ZS
                XQF_q  = XARG
                YQF_q  = YARG
                TEMP_q = (XHF_q - XQF_q)**2 + (YHF_q - YQF_q)**2 + TF_q**2
                RLF_q  = SQRT (TEMP_q) 
!
                ROPD   = RLF_q * RINDEX(ICOL,0,JCONF) * NSPACE(0,JCONF)
                ZSEP   = 0.0D0
                XSOLD  = XARG
                YSOLD  = YARG
            ELSE
                ROPD   = RL * RINDEX(ICOL,0,JCONF) * NSPACE(0,JCONF)    ! OBJECT CLOSE ENOUGH TO SUBTRACT
            ENDIF
!
!       Does ROPD_q need to be REAL(16) ?
[/fortran]

I would suspect that the key line for addressing precision is "TEMP_q = (XHF_q - XQF_q)**2 + (YHF_q - YQF_q)**2 + TF_q**2", so it is important to maximise precision when adding the 2 difference components. The validity of your approach is limited by the accuracy available in the 5 variables used. As they are all sourced from real(8) variables, you might find that the result does not achieve what you expect.

John

 

dondilworth
New Contributor II

I appreciate the prompt (as always!) response to my question.  Here are some more facts:

This section of code is entered twice. IOPD is 2 the first time, and DFACT is always initialized. The second time it is 4, and I want to subtract the values of ROPD found on the two trips. The differences between XHF and XQF and so on do not require *16 precision, since the values are small and the differences large. But ROPD can be 10**20, with differences between the two trips of the order of unity. So this is where the precision is needed.
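
To put numbers on that precision requirement, here is a minimal C++ sketch, assuming REAL*8 corresponds to an IEEE 754 binary64 `double`. The helper names `ulp_at` and `delta_survives` are made up for this illustration and are not part of the program discussed in this thread. The point is that near 10**20 adjacent doubles are 16384 apart, so a change of order unity disappears in rounding.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
// Illustrative helper; assumes IEEE 754 binary64 doubles.
double ulp_at(double x) {
    assert(std::isfinite(x) && x > 0.0);
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

// True if big + delta rounds to something other than big in double precision.
bool delta_survives(double big, double delta) {
    volatile double sum = big + delta;  // volatile forces the result to be stored as a double
    return sum != big;
}
```

With these helpers, `ulp_at(1.0e20)` is 16384, `delta_survives(1.0e20, 1.0)` is false, and `delta_survives(1.0e20, 16384.0)` is true. Assuming REAL(16) is IEEE binary128, its 113-bit significand (against 53 bits for binary64) gives roughly 34 decimal digits, which is enough to resolve a unit-sized difference at 10**20.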

The numeric accuracy is just fine, with roundoff way out where I don't care about it.  The issue is not accuracy but rather why the program crashes the minute I try to print a window.  The printing is handled by C++ code, and requires going through the loops again.  But the printing never crashes if I do not enter this section of Fortran code.  I find this totally weird, and suspect a compiler glitch that sets an error flag somewhere that gets tested by the printer driver.  Is this possible?

andrew_4619
Honored Contributor II

So does the data you are sending to DoPreparePrinting look OK if you dump it to a temp file? Or, more specifically, have you examined the contents of the CPrintInfo data structure for the crash and non-crash cases? Some simple corruption from an unrelated issue could be the cause.
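
Since the failing build cannot be run under the debugger, one way to follow this suggestion is to log raw bytes to a file from the release build. The sketch below is only an assumption about how that could be done: `hex_dump` and `append_dump` are hypothetical helpers, not MFC functions, and a byte dump of a structure that holds pointers will show addresses that differ from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Render `size` bytes starting at `data` as lowercase hex pairs separated by spaces.
std::string hex_dump(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const char digits[] = "0123456789abcdef";
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::string out;
    out.reserve(size * 3);
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out.push_back(' ');
        out.push_back(digits[bytes[i] >> 4]);
        out.push_back(digits[bytes[i] & 0x0F]);
    }
    return out;
}

// Append one labelled dump line to a log file.
// Returns false if the file cannot be opened or written.
bool append_dump(const std::string& path, const std::string& label,
                 const void* data, std::size_t size) {
    std::ofstream log(path, std::ios::app);
    if (!log) return false;
    log << label << ": " << hex_dump(data, size) << '\n';
    return static_cast<bool>(log);
}
```

A call such as `append_dump("print_debug.log", "before DoPreparePrinting", pInfo, sizeof(*pInfo));` (the file name is arbitrary) placed in both the working and the crashing paths would give two lines to compare.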

Lorri_M_Intel
Employee

We do some quirky stuff, but nope, we don't set an error flag that is tested by the printer driver.

I would be more suspicious of a calling-standard mismatch. That can affect the stack in ways that don't show themselves until further downstream, and then cause all manner of bizarre failures.

How is DoPreparePrinting declared in the C program? In the Fortran source? Are you using /iface:cvf in your build? Are you using other /iface:XXX switches in your build?

Finally, I would double/triple check that the same command line options are being used in both the debug and release builds (other than /check:XX, /debug, or /optimize, which are quite debug-specific).

                         --Lorri  

jimdempseyatthecove
Black Belt

>>The printing is handled by C++ code

You say printing, would this by chance be print formatting to a string buffer supplied by your Fortran call?

Jim Dempsey

dondilworth
New Contributor II

Lorri:

These are all great questions.  Here is the code that crashes:

BOOL CPadView::OnPreparePrinting(CPrintInfo* pInfo)
{
    breathe();
    isPrinting = TRUE;
    pInfo->SetMaxPage( 1 );
    pInfo->m_nNumPreviewPages = 1;
    PRINTERBUGFIX();    // in Win95.for
    int i = DoPreparePrinting(pInfo);    
    if ( i == 0 ) isPrinting = FALSE;
    return (i);
}

The call to PRINTERBUGFIX is supposed to cure crashes. It reads:

    SUBROUTINE PRINTERBUGFIX
    USE DFLIB
    INTEGER*2 FLAG

    FLAG = (.NOT. FPE$UNDERFLOW)        ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
    FLAG = FLAG + (.NOT. FPE$ZERODIVIDE)
C    FLAG = FLAG + (.NOT. FPE$INVALID)    ! THIS TOTALLY SCREWS UP!
    FLAG = FLAG + (.NOT. FPE$DENORMAL)
    FLAG = FLAG + (.NOT. FPE$OVERFLOW)
    FLAG = FLAG + (.NOT. FPE$INEXACT)
    
    CALL SETCONTROLFPQQ( FLAG )    ! IS MONSTER BUG IN WINDOWS, FORTRAN, AND PRINT DRIVERS
    RETURN
    END

(The printer has frequently crashed if I don't call it.)  Until I started using *16 variables, this fix worked.

I can't examine the pInfo structure in crash mode because that's the release version, not debug.  Here is the command line in release mode:

/nologo /MP /O3 /Ob0 /Oy- /Qipo /I"Release/" /reentrancy:none /extend_source:132 /Qopenmp /Qdiag-error-limit:30000 /Qdiag-file:"Release\SYNOPSYS200_lib_.diag" /Qauto /align:rec4byte /align:commons /assume:byterecl /Qzero /fpe:1 /Qfp_port /fpconstant /Qftz /iface:cvf /module:"Release/" /object:"Release/" /check:none /libs:dll /threads /winapp /c

In debug mode it's

/nologo /debug:full /MP /Od /I"Debug/" /reentrancy:none /extend_source:132 /Qopenmp /Qopenmp-report1 /warn:unused /warn:truncated_source /Qauto /align:rec4byte /align:commons /assume:byterecl /Qtrapuv /Qzero /fpe:1 /Qfp_port /fpconstant /Qftz /iface:cvf /module:"Debug/" /object:"Debug/" /traceback /check:all /libs:dll /threads /winapp /c

The C++ command line in release mode looks like

/Zi /nologo /W2 /WX- /MP /O2 /Ot /Oy- /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "_VC80_UPGRADE=0x0600" /GF /Gm- /EHsc /MT /GS /Gy- /fp:precise /Zc:wchar_t /Zc:forScope /GR /openmp /Fp".\Release\SYNOPSYS200.pch" /Fa".\Release\" /Fo".\Release\" /Fd".\Release\" /FR".\Release\" /Gd /analyze- /errorReport:queue

and in debug mode it's

/ZI /nologo /W1 /WX- /MP /Od /Ot /Oy- /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /D "_VC80_UPGRADE=0x0600" /Gm /EHsc /RTCu /MTd /GS /Gy- /fp:strict /Zc:wchar_t /Zc:forScope /GR /openmp /Fp".\Debug\SYNOPSYS200.pch" /Fa".\Debug\" /Fo".\Debug\" /Fd".\Debug\" /FR".\Debug\" /Gd /analyze- /errorReport:queue

See anything suspicious?

mecej4
Black Belt

Don, do you depend on DFACT_q retaining its value? If so, and the code fragment is taken from a subprogram, you need to specify the SAVE attribute for the variable. Perhaps you have already done so, but we cannot tell that from your posts.
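
For readers more at home on the C++ side of this project, the closest analogue of a SAVEd Fortran local is a function-local `static` variable. The sketch below is a hypothetical stand-in for the scaling step in the posted fragment (the name `scaled_height` and its parameters are invented here). It matters in this build because the /Qauto switch in the posted command lines places non-SAVEd locals on the run-time stack, where they do not keep their values between calls.

```cpp
#include <cassert>

// Hypothetical stand-in for the scaling step in the posted fragment.
// `dfact` has static storage duration, the C++ counterpart of a Fortran
// local with the SAVE attribute: it keeps its value between calls.
double scaled_height(int iopd, double height, double sflag, double abs_th) {
    static double dfact = 0.0;     // initialised once, then retained
    if (iopd < 4) {
        assert(abs_th > 0.0);
        dfact = sflag / abs_th;    // refreshed only on passes with iopd < 4
    }
    return height * dfact;         // later passes reuse the retained value
}
```

Calling `scaled_height(2, 3.0, 8.0, 2.0)` sets the factor to 4 and returns 12. A following call with `iopd` equal to 4 skips the assignment and still scales by 4. Without `static`, that second call would read an indeterminate value.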

dondilworth
New Contributor II

I do indeed use the SAVE attribute.

The printing is not so simple.  The program has a huge OnDraw() that makes all kinds of graphics pictures.  Those are rendered into memory with calls to MoveTo(), LineTo(), TextOut() and the like, and when the graphics are complete the whole business is shown onscreen with dc.BitBlt().  It all works as long as I don't use any *16 variables in the Fortran portion.

Even the print preview crashes -- so it doesn't look like a printer driver problem.  (I assume that the preview is a Microsoft animal, not a product of the printer maker.)  Here's the code I send to that:

void CPadView::OnFilePrintPreview()
{
    m_nMagnify = 0;    
    m_bIsPanning = FALSE;
    isPrintPreview = TRUE;
    PRINTERBUGFIX();
    CView::OnFilePrintPreview();
    breathe();    
}

Sometimes the program crashes; sometimes it keeps running but none of my dialog windows will open, and that sort of thing. The combination of release mode and *16 seems to always fail in this way.

andrew_4619
Honored Contributor II

So you are writing graphics to an image buffer in memory and then sending that image (raster data) to the printer or screen which (sometimes) crashes. None of the graphic output functions will be using real(16) so the buffer and your real(16) code should be entirely unconnected.

This sounds like one of those classic cases of memory/stack/heap corruption, probably (but not necessarily) caused by a bug in your real(16) code, but it might be a bug elsewhere. Those sorts of problems are usually quite painful to fix; I have had a few recently.

 

dondilworth
New Contributor II

I think we are getting close.  Corruption for sure.  Painful for sure.  What do you suggest?  (I posted my command-line arguments earlier this morning, but they have to wait for moderator approval, it seems, before you can see them.)

dondilworth
New Contributor II

Here's a clue, maybe.  Thinking that the *16 variables were corrupting the heap, I tried putting them in a labeled common block.  Same crash.  Does that narrow things down?

Lorri_M_Intel
Employee

OK, I see your command line switches, and yes, I see something suspicious.

/iface:cvf

Your print routine does not have any arguments, so if the problem *is* /iface:cvf, it's not the call to PRINTERBUGFIX() that's causing the corruption.

Can you humor me for a minute please?  The bigger snippet of code that you showed at the top of this chain - is it called from C? If so, what does the interface look like from C, and what is the declaration in Fortran?

Are any of the REAL(16) variables passed as arguments, or are they all local to the *one* Fortran routine?

 

dondilworth
New Contributor II

The program starts in C, then my override of the run() loop calls a Fortran program that supervises everything.  After that, C is used for I/O only.  All of this was ported from CVF, so the iface directive is required, as far as I know.  (Which is not very far, I admit!)

The *16 stuff exists only in a single subroutine, always called from Fortran.  (There are other routines with it too, but those are used only for multithread work, which I'm not running here, so that doesn't count.  I hope.)

The snippet you refer to is from a subroutine that is only called from Fortran.

I've tried changing compiler options, Fast, Strict, Source, and Precise.  Same problem.  I'll keep trying various combinations.  It would be nice to narrow this down.

dondilworth
New Contributor II

Oh, yes: the *16 variables are all local, not passed.

jimdempseyatthecove
Black Belt

.NOT. is a boolean operator, and you are trying to use "+" as a bitwise OR.

[fortran]
 FLAG = NOT(FPE$UNDERFLOW+FPE$ZERODIVIDE+FPE$INVALID+FPE$DENORMAL+FPE$OVERFLOW+FPE$INEXACT) ! disallow

 ! or use

 FLAG = FPE$UNDERFLOW+FPE$ZERODIVIDE+FPE$INVALID+FPE$DENORMAL+FPE$OVERFLOW+FPE$INEXACT ! allow
[/fortran]

Jim Dempsey
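
The arithmetic behind this objection can be checked in a few lines of C++. This is a sketch under stated assumptions: `kFlagA` and `kFlagB` are hypothetical single-bit masks picked for illustration, not the real values of the FPE$ constants, and the 16-bit word mirrors the INTEGER*2 variable in PRINTERBUGFIX. It shows that adding complements does not give the complement of the combined mask, because the carries spill into unrelated bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-bit masks chosen for illustration only;
// they are not the actual values of the FPE$ constants.
constexpr std::uint16_t kFlagA = 0x0010;
constexpr std::uint16_t kFlagB = 0x0004;

// What the original routine does: add the complements together.
std::uint16_t sum_of_complements(std::uint16_t a, std::uint16_t b) {
    const std::uint16_t na = static_cast<std::uint16_t>(~a);
    const std::uint16_t nb = static_cast<std::uint16_t>(~b);
    return static_cast<std::uint16_t>(na + nb);  // carries spill into other bits
}

// Complement of the bitwise OR: every bit set except the named ones.
std::uint16_t all_except(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(~(a | b));
}

// Read-modify-write: clear only the named bits, leave the rest of the word alone.
std::uint16_t clear_bits(std::uint16_t word, std::uint16_t mask) {
    return static_cast<std::uint16_t>(word & ~mask);
}

// Self-check of the values quoted in the text.
void check_flag_examples() {
    assert(sum_of_complements(kFlagA, kFlagB) == 0xFFEA);
    assert(all_except(kFlagA, kFlagB) == 0xFFEB);
}
```

For these two masks the sum of complements is 0xFFEA, while the complement of the OR is 0xFFEB, so the summed version also clears bit 0, which nobody asked for. When a word carries other fields besides the flags, a read-modify-write such as `clear_bits(current, kFlagA | kFlagB)` is usually the safer pattern, since it leaves every other bit as it was.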

Lorri_M_Intel
Employee

Can you show the C declaration and the Fortran declaration for the *first* routine then please?

The thought I'm going with is that REAL*16 is on the stack (rather than in registers), and that the calling convention has displaced the stack pointer, ultimately causing a crash.

Now, another option is to play with the options under RELEASE until they start looking more like the DEBUG. For example, decrease the optimization level, or remove the /Oy- switch. Did that make sense?

                   --Lorri

 

John_Campbell
New Contributor II

You could write your real(16) calculation as:

[fortran]
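      df_q  = SFLAGS(176) / ABS (TH)
      ! The original forms XHEIGHT*DFACT - XARG (and likewise for Y), so dx_q and dy_q
      ! match it only if XARG and YARG are scaled by df_q in the same way
      dx_q  = XHEIGHT - XARG
      dy_q  = YHEIGHT - YARG
      tf_q  = D + ZS / df_q
      RLF_q = SQRT ( dx_q**2 + dy_q**2 + tf_q**2 ) * df_q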
[/fortran]

It appears that df_q is very small (~10e-9) and tf_q is much larger than the others, due to ZS*TH.

I think that the real*16 is not causing the problem you are reporting, but is merely an innocent bystander to other stack corruption happening around it.

Could changing:
    FLAG = (.NOT. FPE$UNDERFLOW)        ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
    FLAG = FLAG + (.NOT. FPE$ZERODIVIDE)
...
to
    FLAG = NOT (FPE$UNDERFLOW)        ! DEFEAT ALL ERROR HANDLING IN PRINT DRIVER
    FLAG = IAND (FLAG, NOT (FPE$ZERODIVIDE) )
...
make it conforming code?
I would not expect this to be the appropriate way to merge these flags.

dondilworth
New Contributor II

I appreciate the cogent suggestions, and I agree that there is some corruption off in the distance.  That will be hard to find, so I have reworked the math and have a new algorithm that does not require subtracting large numbers.  So the *16 stuff is out.  I'll let you know if it still fails.
