Hello,
I am new to this forum.
I have extracted a small test case of 90 lines (included below) which gives a wrong result when compiled with
ifort -real-size 64
on an x86-64 machine running Red Hat Linux (and probably also on Windows XP).
The integral computed by the GAUSS routine (from the CERN library) comes out as zero instead of the correct value. In the original program the integrand was not a constant, as it is in this test case.
The result is correct using ifort alone (32-bit precision) or ifort -g -real-size 64.
There is no problem with other compilers (GNU gfortran or Portland Group f77) on the same processor. The code also runs correctly on an Alpha machine.
I would like to solve this problem because some parts of my code require 128-bit precision (not in the test case) and ifort is the only compiler (to my knowledge) that supports such a precision.
I guess an experienced user will immediately see what is wrong (just 90 lines to read).
Thank you for your help.
Jean Johner
[plain]
c HELIOS Program : Thermal equilibrium of a thermonuclear plasma
*deck main
      external fipralpha0
c
c Prints the values of function fipralpha0 to be integrated
      rhoin=0.
      rhofi=1.
      nbrho=11
      rhost=(rhofi-rhoin)/(nbrho-1)
      do norho=1,nbrho
        rho=rhoin+(norho-1)*rhost
        ripralpha0=fipralpha0(rho)
        print*,"rho=",rho," ripralpha0=",ripralpha0
      enddo ! norho
c Computes the integral
      gauss0=gauss(fipralpha0,0.,1.,1.e-6)
      print*,"gauss0=",gauss0
      stop
      end
*deck fipralpha0
      function fipralpha0(rho)
      data fhe/3.e-2/,tped/5./
      fipralpha0=fcalpha(fhe)*tped**0.25
c     fipralpha0=fcalpha(fhe)*sqrt(sqrt(tped))
c exponent 0.25 gives a zero integral
c sqrt(sqrt(tped)) gives a correct integral
c exponent 1.75 gives a zero integral
c exponent 0.75 gives a zero integral
c exponent 0.5 gives a correct integral
c exponent 1.5 gives a correct integral
c suppressing "fcalpha(fhe)" gives a correct result
c replacing "fcalpha(fhe)" by "(1.-2.*fhe)**2" gives a correct result
      end
*deck fcalpha
      function fcalpha(x)
      fcalpha=(1.-2.*x)**2
      end
*deck gauss
      FUNCTION GAUSS(F,A,B,EPS)
* Revision 1.1.1.1 1996/04/01 15:02:13 mclareni
* Mathlib gen
      DIMENSION W(12),X(12)
      data cst/0.005/
      DATA X( 1) /9.6028985649753623D-1/, W( 1) /1.0122853629037626D-1/
      DATA X( 2) /7.9666647741362674D-1/, W( 2) /2.2238103445337447D-1/
      DATA X( 3) /5.2553240991632899D-1/, W( 3) /3.1370664587788729D-1/
      DATA X( 4) /1.8343464249564980D-1/, W( 4) /3.6268378337836198D-1/
      DATA X( 5) /9.8940093499164993D-1/, W( 5) /2.7152459411754095D-2/
      DATA X( 6) /9.4457502307323258D-1/, W( 6) /6.2253523938647893D-2/
      DATA X( 7) /8.6563120238783174D-1/, W( 7) /9.5158511682492785D-2/
      DATA X( 8) /7.5540440835500303D-1/, W( 8) /1.2462897125553387D-1/
      DATA X( 9) /6.1787624440264375D-1/, W( 9) /1.4959598881657673D-1/
      DATA X(10) /4.5801677765722739D-1/, W(10) /1.6915651939500254D-1/
      DATA X(11) /2.8160355077925891D-1/, W(11) /1.8260341504492359D-1/
      DATA X(12) /9.5012509837637440D-2/, W(12) /1.8945061045506850D-1/
      H=0.
      IF(B .EQ. A) GO TO 99
      CONST=CST/ABS(B-A)
      BB=A
    1 AA=BB
      BB=B
    2 C1=0.5*(BB+AA)
c     print*,"aa=",aa," bb=",bb
      C2=0.5*(BB-AA)
      S8=0
      DO I = 1,4
        U=C2*X(I)
        S8=S8+W(I)*(F(C1+U)+F(C1-U))
      enddo
c     print*," c2*s8=",c2*s8
      S16=0
      DO I = 5,12
        U=C2*X(I)
        S16=S16+W(I)*(F(C1+U)+F(C1-U))
      enddo
      S16=C2*S16
c     print*," s16=",s16
      IF(ABS(S16-C2*S8) .LE. EPS*(1.+ABS(S16))) THEN
        H=H+S16
        IF(BB .NE. B) GO TO 1
      ELSE
        BB=C1
        IF(1.+CONST*ABS(C2) .NE. 1.) GO TO 2
        H=0
        PRINT*,"FUNCTION GAUSS, TOO HIGH ACCURACY REQUIRED"
        GOTO 99
      ENDIF
   99 GAUSS=H
      RETURN
      END
[/plain]
I don't see an error with the 11.1 compiler:
[rwgreen@dpd22 71856]$ ifort -O2 -fp-model precise -o repro repro.F -real-size 64
[rwgreen@dpd22 71856]$ ./repro
rho= 0.000000000000000E+000 ripralpha0= 1.32129018308707
rho= 0.100000000000000 ripralpha0= 1.32129018308707
rho= 0.200000000000000 ripralpha0= 1.32129018308707
rho= 0.300000000000000 ripralpha0= 1.32129018308707
rho= 0.400000000000000 ripralpha0= 1.32129018308707
rho= 0.500000000000000 ripralpha0= 1.32129018308707
rho= 0.600000000000000 ripralpha0= 1.32129018308707
rho= 0.700000000000000 ripralpha0= 1.32129018308707
rho= 0.800000000000000 ripralpha0= 1.32129018308707
rho= 0.900000000000000 ripralpha0= 1.32129018308707
rho= 1.00000000000000 ripralpha0= 1.32129018308707
gauss0= 1.32129018308707
I assumed you would want -fp-model precise, since you are clearly more concerned with accuracy than with absolute performance.
The reason that -g changes the behavior is that it changes the default optimization level from -O2 to -O0. The problem seems related to the inlining of function GAUSS into the main program at -O2. After the full sequence of inlining, fcalpha into fipralpha0 into gauss into main, the compiler decides to vectorize the reduction loop "DO I=5,12", which it is unable to vectorize without the full inlining sequence, or with -real-size 32.
We will investigate further how the error comes about in the inlined, vectorized loop, and let you know what we find. I can't see anything obviously wrong with your code. In the meantime, there are several ways to work around the problem. One would be to insert a directive
!DIR$ ATTRIBUTES NOINLINE :: GAUSS
into the main program (and into any other routine that calls the function GAUSS).
Another would be to insert compiler directives
!DIR$ NOVECTOR
immediately before the two loops in GAUSS at
DO I = 1,4 and DO I = 5,12
Or, finally, to compile at a reduced optimization level, for example by adding the compiler switches
-fno-inline -no-ip.
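As a rough illustration of where the two directives would go, here is a minimal fixed-form sketch. It is not the CERN routine: `gauss8`, `fconst` and `demo` are hypothetical names, and the integrator is trimmed to the single 8-point loop using the first four abscissas and weights from the test case. Compilers other than ifort should simply treat the `!DIR$` lines as comments.

```fortran
c Sketch only: a trimmed 8-point rule standing in for GAUSS, to show
c where the two workaround directives would be placed (fixed form).
      program demo
      external fconst
c Workaround 1: ask the compiler not to inline the integrator here
!DIR$ ATTRIBUTES NOINLINE :: GAUSS8
      print*,"gauss8=",gauss8(fconst,0.,1.)
      end

      function fconst(x)
c Constant integrand, as in the test case (hypothetical name)
      fconst=1.
      end

      function gauss8(f,a,b)
c Hypothetical helper: one-interval 8-point Gauss-Legendre rule
      dimension w(4),x(4)
      DATA X( 1) /9.6028985649753623D-1/, W( 1) /1.0122853629037626D-1/
      DATA X( 2) /7.9666647741362674D-1/, W( 2) /2.2238103445337447D-1/
      DATA X( 3) /5.2553240991632899D-1/, W( 3) /3.1370664587788729D-1/
      DATA X( 4) /1.8343464249564980D-1/, W( 4) /3.6268378337836198D-1/
      c1=0.5*(b+a)
      c2=0.5*(b-a)
      s8=0
c Workaround 2: ask the compiler not to vectorize this reduction loop
!DIR$ NOVECTOR
      do i = 1,4
        u=c2*x(i)
        s8=s8+w(i)*(f(c1+u)+f(c1-u))
      enddo
      gauss8=c2*s8
      end
```

Either directive alone should be enough in the real code; the sketch shows both only to make the placement clear. In the full GAUSS routine the NOVECTOR line would be repeated before the second loop as well.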
Thank you very much Martyn,
I did not understand all the details, but adding !DIR$ NOVECTOR before the loops in GAUSS clearly does fix the problem.
Perhaps you have noticed in my comments that a small change in how the expression is written (sqrt(sqrt(tped)) instead of tped**0.25) also cures the error.
It would be nice to have the problem corrected, since the same faulty pattern could occur elsewhere in the code.
Please keep me informed.
Jean Johner
Dear Ronald,
Thank you for your interest.
With ifort -O2 -fp-model precise -real-size 64, the error does not occur.
It seems that Martyn Corden has been able to reproduce the problem using ifort -real-size 64 alone.
Best regards.
Jean Johner
Some additional information.
Adding
implicit real*8 (a-h,o-z)
in the main program and the functions listed above, changing the constants to double precision (e.g. 3.e-2 -> 3.d-2) everywhere, and compiling with bare "ifort" results in the same error.
This shows that the problem is not linked to the implementation of the -real-size 64 option but really to 64-bit precision.
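A minimal sketch of that change, applied only to the integrand so it stays short (the `demo` driver is hypothetical, and GAUSS would need the same `implicit` line and `d` constants):

```fortran
c Sketch only: the integrand with explicit 64-bit typing, so that it
c can be built with bare "ifort" (no -real-size 64).
      program demo
      implicit real*8 (a-h,o-z)
      print*,"fipralpha0=",fipralpha0(0.d0)
      end

      function fipralpha0(rho)
      implicit real*8 (a-h,o-z)
      data fhe/3.d-2/,tped/5.d0/
c Same expression as in the test case, with d-exponent constants
      fipralpha0=fcalpha(fhe)*tped**0.25d0
      end

      function fcalpha(x)
      implicit real*8 (a-h,o-z)
      fcalpha=(1.d0-2.d0*x)**2
      end
```

The printed value should be close to the 1.32129018308707 shown in the output earlier in this thread.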
Best regards.
Hello Tim,
On my Linux server, the ifort binary sits in the following folder:
/applications/intel/Compiler/11.1/056/bin/intel64
The ifort.cfg in this folder contains only a comment line.
Perhaps Martyn Corden (Intel) could give the version of the compiler he used to reproduce the problem.
Best regards.
I always put the standards compliance/compatibility options in ifort.cfg, but it made no difference with ifort 11.1/064.
-assume protect_parens,minus0,byterecl,buffered_io -prec-div -prec-sqrt
Dear Tim,
Do you mean that with version 11.1/064 and an empty ifort.cfg, compiling with "ifort -real-size 64" gives the correct result?
This would mean that the problem has been fixed since version 056. Good news!
What about the version and ifort.cfg used by Martyn Corden?
Yours sincerely.
Jean Johner
I was able to reproduce the problem in any 11.1 compiler, but not in older compilers.
It's not seen when you use -fp-model precise, because this prevents the use of fast, vectorizable math functions that are (very slightly) less accurate than the usual ones; hence, the loops with the problem don't get vectorized. The switch -no-fast-transcendentals would have the same effect.
We regret this problem. I expect it to be fixed in the next major compiler version; the developers will have to determine whether it can also be fixed in an 11.1 update. In the meantime, please keep using one of these workarounds.
As an aside (as a longtime user of CERN software), I was amused to see the old "*deck" control cards in the source. I can't believe that CERN still uses Patchy for source management, but its ghost lives on...
Good job Martyn.
I am also an old user of CERN sources. I kept this *deck separation between functions and subroutines because I find it more convenient to type /k name when searching a subroutine with vi rather than trying to remember if it is /e name (for a subroutine) or /n name (for a function).
Of course, Patchy itself is no longer in use.
Best regards.
Jean Johner
I have confirmed that this issue has been fixed in the 11.1 compiler update 6 (l_cprof_p_11.1.072). This update is available for download at https://registrationcenter.intel.com .
Regards,
Martyn
