Intel® Fortran Compiler

Why are arithmetic operations with NaN slow?

Vishnu
Novice

I wanted to fill parts of a matrix with quiet NaNs because:

i) If some result is a NaN, I know those parts are being used.

ii) It simplifies the code, since I can do vector/elemental operations on the whole matrix. I assumed that arithmetic operations on NaNs would take very little CPU time, so the cost of simplifying the code would be small. It seems I assumed wrong.
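
The idea in i) and ii) can be sketched as follows. This is only a minimal illustration, not the code from the actual application: the matrix size, the 2x2 "valid" block, and the names m, r and qnan are assumptions made up for the example.

```fortran
program nan_padding_sketch
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan, ieee_is_nan
    implicit none

    integer, parameter :: n = 4      ! illustrative size, not from the post
    real :: m(n, n), r(n, n)
    real :: qnan

    qnan = ieee_value(0.0, ieee_quiet_nan)

    ! Fill everything with quiet NaN, then put real data in the leading 2x2 block
    m = qnan
    m(1:2, 1:2) = 1.5

    ! Whole-array (elemental) arithmetic: NaN entries simply stay NaN
    r = 2.0 * m + 1.0

    ! Any NaN inside the block that should be valid means padding leaked into it
    print *, "NaN entries in result:  ", count(ieee_is_nan(r))
    print *, "valid block is NaN-free:", .not. any(ieee_is_nan(r(1:2, 1:2)))
end program nan_padding_sketch
```

Note that ieee_is_nan can stop working as expected if the compiler is allowed to assume that NaNs never occur (aggressive fast-math style options), so a check like this assumes such options are not in effect.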

This is a sample code I wrote to test:

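program check9

    use, intrinsic :: ieee_arithmetic
    use, intrinsic :: iso_fortran_env, only: int64

    implicit none

    integer(int64) :: i
    real :: NaN, a, start, finish, time_cpu

    a = 2.0

    ! Quiet NaN of default real kind
    NaN = ieee_value(0.0, ieee_quiet_nan)

    call cpu_time(start)

    ! After the first division a is NaN, so every later operation has NaN operands
    do i = 1, 1000000000_int64
        a = a / NaN
        a = a * NaN
    end do

    call cpu_time(finish)

    ! Printing a keeps the result live, so the loop is not trivially dead code
    print*, a, NaN

    time_cpu = finish - start
    print*, "CPU time =", time_cpu

end program check9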

This takes about 3 seconds with ifort. The same operations with an ordinary real value are so fast that the measured time shows no significant digits.

3 Replies
TimP
Honored Contributor III

NaN operations on Intel CPUs are likely to generate exceptions which invoke microcode, so the relative slowdown probably varies greatly with CPU model. It is also possible that the compiler is shortcutting your test loop.
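
One way to check the second point is to time both cases with an operand the compiler cannot fold away. The following is a rough sketch under assumptions of my own: the volatile attribute is used to force a reload of the operand on every iteration, and the names time_loop, n_iter and the iteration count are invented for the example. It is not a rigorous benchmark.

```fortran
program nan_timing_sketch
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan, ieee_is_nan
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none

    integer(int64), parameter :: n_iter = 100000000_int64   ! illustrative count
    real :: t_normal, t_nan, r_normal, r_nan

    ! Same loop, once with an ordinary operand and once with a quiet NaN
    call time_loop(2.0, t_normal, r_normal)
    call time_loop(ieee_value(0.0, ieee_quiet_nan), t_nan, r_nan)

    print *, "ordinary operand: result =", r_normal, " CPU time =", t_normal
    print *, "NaN operand:      result =", r_nan, " CPU time =", t_nan

contains

    subroutine time_loop(operand, seconds, a)
        real, intent(in)  :: operand
        real, intent(out) :: seconds, a
        real, volatile :: x      ! volatile: reloaded each iteration, not constant-folded
        real :: start, finish
        integer(int64) :: i

        x = operand
        a = 2.0
        call cpu_time(start)
        do i = 1, n_iter
            a = a / x
            a = a * x
        end do
        call cpu_time(finish)
        seconds = finish - start
    end subroutine time_loop

end program nan_timing_sketch
```

If the two times come out similar, the gap seen in the original test more likely came from the compiler simplifying the ordinary-value loop than from NaN handling; if the NaN case is still much slower, the hardware explanation becomes more plausible. Results will depend on the CPU model, the compiler, and the floating-point options used.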

Vishnu
Novice

Do you think my overhead would be lowest if I used REAL(0)s instead? The operations involved are additions, subtractions and multiplications.

jimdempseyatthecove
Honored Contributor III

>>Do you think my overhead will be the lowest if I use REAL(0)s instead?

I suggest inserting an unusual number.

You could insert a negative 0.0 or a subnormal number. You can precondition the SSE unit by setting FTZ or DAZ (bits in the MXCSR control register) to ensure that, if such a value is written, it will be overwritten with +0.0.

The only downside is that this will not indicate whether the placeholder value was used in a computation. If you need that, you would want to insert an SNaN (signaling NaN).

https://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz
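
As a rough sketch of the trade-off described above: the array size, the data values, and the names pad_zero, pad_nan and is_sentinel below are made up for illustration. A quiet NaN stands in for the SNaN here, since whether a signaling NaN actually traps depends on compiler options and halting modes.

```fortran
program sentinel_sketch
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_negative_zero, &
                                             ieee_quiet_nan, ieee_is_negative, ieee_is_nan
    implicit none

    real :: pad_zero(4), pad_nan(4), data_vals(4)
    logical :: is_sentinel(4)

    data_vals = [1.0, 2.0, 3.0, 4.0]

    ! Padding with -0.0: a slot is still untouched if it is zero with the sign bit set
    pad_zero = ieee_value(0.0, ieee_negative_zero)
    pad_zero(1:2) = data_vals(1:2)
    is_sentinel = (pad_zero == 0.0) .and. ieee_is_negative(pad_zero)
    print *, "slots still holding -0.0:", is_sentinel

    ! -0.0 is neutral under addition, so the sum does not reveal that padding was used
    print *, "sum with -0.0 padding:", sum(pad_zero)

    ! NaN padding, in contrast, propagates into any result that touches it
    pad_nan = ieee_value(0.0, ieee_quiet_nan)
    pad_nan(1:2) = data_vals(1:2)
    print *, "sum with NaN padding is NaN:", ieee_is_nan(sum(pad_nan))
end program sentinel_sketch
```

The -0.0 sentinel only tells you whether a slot was overwritten, not whether it took part in a calculation, which is why a NaN is still the better choice when use of the padding must be detected.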

Jim Dempsey
