Intel® Fortran Compiler

SSE Instruction Generates Floating Point Exception

Allen_Barnett
Beginner

Hi: I've been trying to improve the robustness of our code. To that end, I enabled (maybe unmasked is the right word) floating point exceptions. However, when optimization is turned on, the code seems to generate spurious exceptions. Here is the simplified routine:

program example
  real :: a(3), b(3), c(3)
  integer :: d(3)
  a = (/ 1., 2., 3. /)
  b = (/ 0., 0., 0. /)
  c = (/ 1., 1., 1. /)
  call subr ( a, b, c, d )
end program example
subroutine subr ( a, b, c, d )
  real, intent(in) :: a(3), b(3), c(3)
  integer, intent(out) :: d(3)
  d = int( ( a - b ) / c )
end subroutine subr

If I compile this with "ifort -fpe0 example.f90" I get:

$ ./a.out
forrtl: error (65): floating invalid
... traceback ...

The nub seems to be that the compiler generates "divps" for the division in subr(), but it has loaded only two values into the low-order floats of the XMM registers; the high-order floats are zero. This leads to a division of 0 by 0 in the unused lanes, which produces NaN and raises the invalid-operation exception. But the high-order XMM floats are never stored, so the calculation ultimately produces the correct answer.
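
The effect described above can be reproduced at source level. The following is a minimal sketch, not the code the compiler generates: it assumes the compiler supports the standard ieee_exceptions intrinsic module, and the four-element arrays (num, den, q) are hypothetical stand-ins for the four lanes of an XMM register. Halting on invalid is switched off so that the flag can be inspected instead of trapping.

```fortran
program invalid_flag_demo
  use, intrinsic :: ieee_exceptions
  implicit none
  ! volatile keeps the compiler from folding the division at compile time
  real, volatile :: num(4), den(4)
  real :: q(4)
  logical :: raised

  ! Lanes 1-2 hold real data; lanes 3-4 mimic the zeroed high-order floats
  num = (/ 1., 2., 0., 0. /)
  den = (/ 1., 1., 0., 0. /)

  ! Do not halt on invalid; clear the flag so we only see this division
  call ieee_set_halting_mode(ieee_invalid, .false.)
  call ieee_set_flag(ieee_invalid, .false.)

  q = num / den          ! lanes 3-4 compute 0/0

  call ieee_get_flag(ieee_invalid, raised)
  print *, 'used lanes:   ', q(1:2)
  print *, 'invalid flag: ', raised
end program invalid_flag_demo
```

The two used lanes hold correct quotients, yet the invalid flag is expected to be set by the unused lanes. With the exception unmasked (as with -fpe0), that same 0/0 would trap instead. Some compilers may need a strict floating-point model option for the flag query to be reliable.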

This happens both with version 13.1.3 20130607 on linux and 14.0.3.202 Build 20140422 on windows.

Thanks,
Allen

jimdempseyatthecove
Honored Contributor III

This should be treated as a compiler bug. If the generated code is going to partially fill a vector, then whatever operations follow should not generate a fault (a QNaN in the unused lanes is OK). FWIW, the compiler should have pulled all 3 floats into one register, though it may have determined that your architecture was better suited to scalar operations.

Jim Dempsey

Steven_L_Intel1
Employee

Escalated as issue DPD200357743. I note that the compiler figures out that it can compute the third element at compile-time and just moves the value, but it does the subtract and divide for the other two. I will let you know of any progress.

Allen_Barnett
Beginner

In the real code, of course, the compiler doesn't know what arguments it receives and generates a complete sequence of operations; although it still does two elements at once and then computes the third element separately. I can supply that exact code, too, if you need it. But, I'm with Jim, it seems like it should be doing all three at once (and maybe loading a 1. into the top word of the XMM denominator). If only the universe was four dimensional :-)
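
The idea of putting a 1. in the top word of the denominator can be sketched at source level. This is only an illustration under stated assumptions: subr_padded and the temporaries a4, b4, c4, d4 are hypothetical names, and nothing here guarantees that a given compiler will emit a single four-wide divps for it. It is not a confirmed workaround for the issue in this thread.

```fortran
program padded_example
  implicit none
  real :: a(3), b(3), c(3)
  integer :: d(3)
  a = (/ 1., 2., 3. /)
  b = (/ 0., 0., 0. /)
  c = (/ 1., 1., 1. /)
  call subr_padded ( a, b, c, d )
  print *, d
contains
  subroutine subr_padded ( a, b, c, d )
    real, intent(in) :: a(3), b(3), c(3)
    integer, intent(out) :: d(3)
    real :: a4(4), b4(4), c4(4)
    integer :: d4(4)
    ! Copy into four-element temporaries so every lane holds defined data
    a4(1:3) = a
    a4(4) = 0.
    b4(1:3) = b
    b4(4) = 0.
    c4(1:3) = c
    c4(4) = 1.            ! pad the denominator with 1 so lane 4 computes 0/1
    d4 = int( ( a4 - b4 ) / c4 )
    d = d4(1:3)           ! discard the padding lane
  end subroutine subr_padded
end program padded_example
```

With the inputs from the original example, d comes out as 1, 2, 3. The padding lane evaluates 0/1 rather than 0/0, so it cannot raise the invalid exception on its own.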

As ever, thanks for the help.
Allen

Steven_L_Intel1
Employee

This problem has been fixed - I expect the fix to appear in Update 1 to the version 15.0 compiler. This update is scheduled for sometime in October.

Allen_Barnett
Beginner

That's great! Thanks!
 
