Intel® Fortran Compiler

floating point overflow in __svml_powf4_h9

Lingzi_P_

The following code raises a floating-point overflow exception in __svml_powf4_h9 when compiled with -O2 or -O3:

  do i = 1, nkd2p1
     x2 = abs((i-1)*delk/kright)
     x2 = -apar*x2**bex
     wrkr(i*2-1) = x2
  enddo

With the FPE check disabled, the program runs fine and gives the correct result. We have seen this kind of FPE caused by vectorization from time to time. Explicitly adding "!DIR$ NOVECTOR" does the trick, but it hurts performance.

The loop looks perfectly suitable for vectorization, and the numerical values in the result are far from overflow. Why does the overflow happen?

The machine code at the point of the crash:

   0x00000000012349f0 <+784>:   vaddpd %xmm3,%xmm2,%xmm5
   0x00000000012349f4 <+788>:   vpaddq %xmm7,%xmm4,%xmm1
   0x00000000012349f8 <+792>:   vpaddq %xmm8,%xmm5,%xmm2
=> 0x00000000012349fd <+797>:   vcvtpd2ps %xmm1,%xmm3
   0x0000000001234a01 <+801>:   vcvtpd2ps %xmm2,%xmm4
   0x0000000001234a05 <+805>:   vmovlhps %xmm4,%xmm3,%xmm1
   0x0000000001234a09 <+809>:   test   %eax,%eax
   0x0000000001234a0b <+811>:   jne    0x1234a4f <__svml_powf4_h9+879>
   0x0000000001234a0d <+813>:   vmovups 0x30(%rsp),%xmm8
   0x0000000001234a13 <+819>:   vmovaps %xmm1,%xmm0

 

(gdb) p $xmm3
$1 = {v4_float = {1.42776291e+31, 0.708184719, -6.739982e+24, 0.690881014},
  v2_double = {0.00032494041370586407, 0.00025734792206721777},
  v16_int8 = {-126, 53, 52, 115, -104, 75, 53, 63, -27, 103, -78, -24, -108, -35, 48, 63},
  v8_int16 = {13698, 29492, 19352, 16181, 26597, -5966, -8812, 16176},
  v4_int32 = {1932801410, 1060457368, -390961179, 1060167060},
  v2_int64 = {4554629716295038338, 4553382854900475877},
  uint128 = 0x3f30dd94e8b267e53f354b9873343582}

 

 

 

mecej4

There are many missing particulars, but it strikes me that you may have a mix of floats and doubles. If an expression is evaluated as a double, and is then converted to float, with the vcvtpd2ps instruction that you flagged, the double precision value may exceed the largest representable value in single precision.

You may be able to reorganize the source code and check the types of variables involved to avoid having an intermediate result that causes overflow when the final result is well within the range of single precision.
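
The mechanism described above can be shown in isolation. The following is a minimal sketch, not the original poster's code: the program name and the variables `d` and `s` are made up for illustration. It uses only the standard `ieee_arithmetic` module and assumes the compiler and target support the IEEE flags (some compilers need a strict floating-point option before flag inspection is reliable). A value that is unremarkable in double precision is narrowed to single precision, and the overflow flag is inspected instead of trapping.

```fortran
program narrow_overflow
  use, intrinsic :: ieee_arithmetic
  implicit none
  ! volatile keeps the conversion at run time instead of being folded
  ! (and possibly rejected) at compile time
  real(kind(1.0d0)), volatile :: d
  real :: s
  logical :: raised

  ! Record the exception in a flag rather than halting, where supported
  if (ieee_support_halting(ieee_overflow)) then
     call ieee_set_halting_mode(ieee_overflow, .false.)
  end if
  call ieee_set_flag(ieee_overflow, .false.)

  d = 1.0d40      ! hypothetical value: fine as a double, above huge(1.0)
  s = real(d)     ! narrowing conversion, double -> default real

  call ieee_get_flag(ieee_overflow, raised)

  print *, 'huge(1.0)        =', huge(1.0)
  print *, 'overflow raised  =', raised
  print *, 'result is finite =', ieee_is_finite(s)
end program narrow_overflow
```

With overflow trapping enabled, the same conversion would stop the program instead of setting a flag. In the AT&T syntax used by the gdb listing, the first operand of vcvtpd2ps is the source, so $xmm1 rather than $xmm3 holds the doubles being converted at the flagged instruction.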

Lingzi_P_

Hi, thanks for the swift reply.

I thought I had checked everything and that all values were within the range of single precision, but that may not be the case. The code now runs fine with vectorization after the following change:

>      x2 = x2**bex
>      x2 = -apar*x2

So it looks like the compiler keeps intermediate results as doubles even when the local variables are declared as single precision?

mecej4

Lingzi P. wrote:

So it looks like the compiler will try to save intermediate results as doubles even the local variables are defined as single? 

Intermediate results may reside entirely in registers, and have no memory footprint at all.

The Fortran standard imposes some constraints on the precision of mixed-mode and mixed-precision expressions.

Lingzi_P_

Sorry, I made a mistake yesterday. The code below still doesn't work; it crashes with exactly the same error.

I guess the compiler is smart enough to see through the change and generate the same code. Any ideas?

 

Lingzi P. wrote:

>      x2 = x2**bex
>      x2          = -apar*x2

mecej4

Please provide a "reproducer": complete source code, data files (if needed) and instructions to compile, link and run in order to reproduce the error that you encountered.

Lingzi_P_

A reproducer with a makefile is attached.

I have tried both ifort 14.0.1 and 15.0.3 and get the same error with each.

Please note that we call feenableexcept (from within ut_fpmode) at the start of the program to enable all floating-point exceptions, because our system cannot afford to run with them disabled.

With the feenableexcept call commented out, the job runs fine and gives the correct result.

Thanks,
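
One possible middle ground between disabling every FPE trap and adding "!DIR$ NOVECTOR" is sketched below, under stated assumptions: suspend overflow halting only around the loop with the standard `ieee_arithmetic` module, then validate the stored results explicitly. The constants and array size are placeholders, not values from the attached reproducer, and this thread does not confirm that the approach interacts as intended with traps enabled through feenableexcept or with the SVML routine in question.

```fortran
program local_halting_sketch
  use, intrinsic :: ieee_arithmetic
  implicit none
  ! Placeholder values, NOT taken from the original program
  integer, parameter :: nkd2p1 = 1025
  real, parameter :: delk = 0.01, kright = 5.0, apar = 2.0, bex = 1.5
  real :: wrkr(2*nkd2p1), x2
  logical :: was_halting, flagged
  integer :: i

  wrkr = 0.0
  was_halting = .false.

  ! Remember the current overflow halting mode, then switch it off locally
  if (ieee_support_halting(ieee_overflow)) then
     call ieee_get_halting_mode(ieee_overflow, was_halting)
     call ieee_set_halting_mode(ieee_overflow, .false.)
  end if
  call ieee_set_flag(ieee_overflow, .false.)

  ! Loop from the original post, free to vectorize
  do i = 1, nkd2p1
     x2 = abs((i-1)*delk/kright)
     x2 = -apar*x2**bex
     wrkr(i*2-1) = x2
  enddo

  ! Read the flag, clear it, and only then restore the previous mode
  call ieee_get_flag(ieee_overflow, flagged)
  call ieee_set_flag(ieee_overflow, .false.)
  if (ieee_support_halting(ieee_overflow)) then
     call ieee_set_halting_mode(ieee_overflow, was_halting)
  end if

  ! A genuine overflow leaves a non-finite value in the stored results
  if (.not. all(ieee_is_finite(wrkr))) error stop 'non-finite value in wrkr'

  print *, 'overflow flag raised inside loop:', flagged
  print *, 'min =', minval(wrkr), ' max =', maxval(wrkr)
end program local_halting_sketch
```

The design choice here is to check the stored results rather than rely on the flag alone: a flag raised only by an intermediate step inside the math library would leave every element of `wrkr` finite, whereas a real overflow would not.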

 

 
