vectorization of X**3 with -fpe0 and -fp-model precise; also thread safety

jespersen · 03-13-2012

Consider the following code in file sub3.f:
      SUBROUTINE SUB3 ( RHO,JD )
      IMPLICIT NONE
      INTEGER, INTENT (IN) :: JD
      REAL*8, DIMENSION(JD), INTENT (INOUT) :: RHO
      INTEGER :: J
C$OMP PARALLEL DO SHARED( RHO )
      DO J = 1,JD
         RHO(J) = RHO(J)**3
      ENDDO
      RETURN
      END

Using ifort Version 12.1 Build 20111011, compile as
(1) ifort -g -O3 -vec-report3 -c sub3.f
(2) ifort -g -O3 -vec-report3 -fpe0 -c sub3.f
(3) ifort -g -O3 -vec-report3 -fp-model precise -c sub3.f
(4) ifort -g -O3 -vec-report3 -fpe0 -fp-model precise -c sub3.f

I find that (1), (2), and (3) vectorize the loop but (4) fails to vectorize.
It looks to me like (4) inserts a function call in the loop (hence the
failure to vectorize), and I believe the function called is __powr8i4.
I'm unsure why the combination "-fpe0 -fp-model precise" results
in a function call; maybe someone can enlighten me.

Now suppose you compile via
(5) ifort -openmp -g -O3 -vec-report3 -fpe0 -fp-model precise -c sub3.f
This gives an OpenMP loop with a function call in it, which I believe is
a call to __powr8i4. Is this call thread-safe? I ask because in the full
code, of which this is a cartoon, running with OpenMP can produce
strange behavior (sometimes a segmentation violation, sometimes
floating invalid, sometimes integer divide by zero). When I get a core
file, the stack trace points to the line with the third power, and debuggers
report that the code died in __powr8i4.
So is the combination of "-fpe0 -fp-model precise" with X**3 thread-safe?

Steven_L_Intel1 · ‎03-14-2012

The math library routines to do exponentiation, etc., are thread-safe. I can't think of anything related to -fpe and -fp-model that would affect thread safety.
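
One way to check the cubing loop in isolation is a small standalone driver, sketched below under some assumptions: the program name TEST3, the file name test3.f, the array size, and the 1.0D-12 relative tolerance are all made up for illustration and are not from the original code. It runs the same OpenMP loop as SUB3 and compares the result against a serial reference computed with explicit multiplication.

```fortran
      PROGRAM TEST3
      IMPLICIT NONE
C     Array size is an arbitrary choice for this sketch.
      INTEGER, PARAMETER :: JD = 100000
      REAL*8, DIMENSION(JD) :: RHO, REF
      INTEGER :: J, NBAD
C     Fill the array and build a serial reference using explicit
C     multiplication instead of the ** operator.
      DO J = 1,JD
         RHO(J) = 1.0D0 + 1.0D-6*J
         REF(J) = RHO(J)*RHO(J)*RHO(J)
      ENDDO
C     Same loop as in SUB3, run under OpenMP.
C$OMP PARALLEL DO SHARED( RHO )
      DO J = 1,JD
         RHO(J) = RHO(J)**3
      ENDDO
C     Count elements outside a loose relative tolerance
C     (1.0D-12 is an assumed value, far above rounding error).
      NBAD = 0
      DO J = 1,JD
         IF (ABS(RHO(J)-REF(J)) .GT. 1.0D-12*ABS(REF(J))) THEN
            NBAD = NBAD + 1
         ENDIF
      ENDDO
      PRINT *, 'mismatches: ', NBAD
      END
```

Compiling this with the same flags as case (5), minus -c so that an executable is produced (for example, ifort -openmp -g -O3 -fpe0 -fp-model precise test3.f), and running it repeatedly with several threads would show whether the loop misbehaves on its own. A clean run does not prove thread safety, but a crash or a nonzero mismatch count here would give a small reproducer to report.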