Overflow in NORM2 with moderately large values

Arjen_Markus · ‎02-13-2019

A recent discussion in the gfortran mailing list inspried me to look into the behaviour of Intel Fortran's NORM2. (The article by Hanson and Hopkins provides a lot of information on various implementations of the algorithm - https://www.researchgate.net/publication/298896236). The problem is that naïve implementations can easily lead to overflows or underflows corrupting the outcome. I used Intel Fortran 18 for the program below (checking for overflow only):

program chk_norm2
    implicit none

    real               :: expected
    real, dimension(2) :: vector
    integer            :: i

    !vector = [3.0/10.0 * sqrt(huge(1.0)), 4.0/10.0 * sqrt(huge(1.0))]
    vector   = [3.0, 4.0]
    expected = 5.0

    write(*,*) sqrt(huge(1.0)), huge(1.0)
    do i = 1,20
        vector   = vector * 10.0
        expected = expected * 10.0
        write(*,*) i, norm2(vector), sqrt(sum(vector**2)), expected
    enddo
end program chk_norm2

And got the following result

  1.8446743E+19  3.4028235E+38
           1   50.00000       50.00000       50.00000    
           2   500.0000       500.0000       500.0000    
           3   5000.000       5000.000       5000.000    
           4   50000.00       50000.00       50000.00    
           5   500000.0       500000.0       500000.0    
           6   5000000.       5000000.       5000000.    
           7  5.0000000E+07  5.0000000E+07  5.0000000E+07
           8  5.0000000E+08  5.0000000E+08  5.0000000E+08
           9  5.0000000E+09  5.0000000E+09  5.0000000E+09
          10  4.9999999E+10  4.9999999E+10  4.9999999E+10
          11  5.0000000E+11  5.0000000E+11  5.0000000E+11
          12  4.9999999E+12  4.9999999E+12  4.9999999E+12
          13  5.0000000E+13  5.0000000E+13  5.0000000E+13
          14  4.9999999E+14  4.9999999E+14  4.9999999E+14
          15  4.9999996E+15  4.9999996E+15  5.0000001E+15
          16  4.9999999E+16  4.9999999E+16  4.9999999E+16
          17  4.9999999E+17  4.9999999E+17  4.9999999E+17
          18  5.0000000E+18  5.0000000E+18  5.0000000E+18
          19       Infinity       Infinity  5.0000001E+19
          20       Infinity       Infinity  5.0000001E+20

As you can see, with only moderately large values of the elements in the vector you get overflow resulting in Infinity.

IIUIC, the standard only recommends but does not require, that the implementation take care of such situations where intermediate results may cause overflow/underflow. Still, one would expect a bit more care ;).

Johannes_Rieke · ‎02-14-2019

Very interesting.

ifort 19.0.2.190 delivers the same results, unfortunately. Switching to double makes no difference (slighly modified...):

program chk_norm2
    use iso_fortran_env, only : rk => real64
    !~ use iso_fortran_env, only : rk => real32
    implicit none

    real(rk)               :: expected, expected_ini
    real(rk), dimension(2) :: vector, vector_ini
    integer                :: i

    !vector = [3.0/10.0 * sqrt(huge(1.0)), 4.0/10.0 * sqrt(huge(1.0))]
    vector_ini   = [3.0_rk, 4.0_rk]
    expected_ini = 5.0_rk

    write(*,*) sqrt(huge(1.0_rk)), huge(1.0_rk)
    do i = 150,160
        vector   = vector_ini   * 10.0_rk**i
        expected = expected_ini * 10.0_rk**i
        write(*,'(i3,3ES20.2)') i, norm2(vector), sqrt(sum(vector**2)), expected
    enddo
end program chk_norm2

D:\...>ifort norm2_test_double.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.2.190 Build 20190117
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.16.27026.1
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:norm2_test_double.exe
-subsystem:console
norm2_test_double.obj

D:\....>norm2_test_double.exe
  1.340780792994260E+154  1.797693134862316E+308
150            5.00+150            5.00+150            5.00+150
151            5.00+151            5.00+151            5.00+151
152            5.00+152            5.00+152            5.00+152
153            5.00+153            5.00+153            5.00+153
154            Infinity            Infinity            5.00+154
155            Infinity            Infinity            5.00+155
156            Infinity            Infinity            5.00+156
157            Infinity            Infinity            5.00+157
158            Infinity            Infinity            5.00+158
159            Infinity            Infinity            5.00+159
160            Infinity            Infinity            5.00+160

Further I've seen in past situations where the use of norm2(vector) instead of sqrt(sum(vector**2)) in loops has a negative impact on runtime (-O2 and -O3). If someone of Intel looks on the implementation of norm2, maybe the performance can be anhanced along with a better overflow/underflow handling.

TimP · ‎02-14-2019

Algorithms which avoid squaring may be slower even where vectorization isn't possible. One such would be promotion to x87 "real(10)."

Arjen_Markus · ‎02-14-2019

The algorithms I have seen use scaling - they do not avoid squaring. And indeed vectorisation is an issue.

jimdempseyatthecove · ‎02-14-2019

This has nothing to do with vectorization. IEEE 754 for 32-bit floating point has an exponent decimal max 38.23 (8-bit exponent range of -126 to +127).. Posting #1 "failure" occurs at step 19, where the exponent of vector(1) is 19, and the mantissa is other than 1. Same for vector(2). The product of the square of each cell will have a decimal exponent of 38 (plus some fractional decimal due to the mantissa being other than 1). IOW the sum of the squares (xE19.nn**2 + xE19.nn**2) is exceeding decimal value of 38.23 (exceeding 8-bit exponent of +127).

Jim Dempsey

jimdempseyatthecove · ‎02-14-2019

If this is a problem for your application, I suggest you write your own NORM2 (e.g. my_NORM2) up-sizes real to real(8), and real(8) to real(16), I will let you work out real(16) to ....

Steve, if you read this, does the standards require for NORM2 that the squares be made using an up-sized temporary?

Jim Dempsey

Arjen_Markus · ‎02-14-2019

It is not a problem for my applications, but I simply came across the general problem (and the soliutions) and though I'd check the behaviour in Intel Fortran. Note that BLAS and LAPACK use more sophisticated implementations for thes reasons.

jimdempseyatthecove · ‎02-14-2019

FWIW

!  test_NORM2.f90 
!
!  FUNCTIONS:
!  test_NORM2 - Entry point of console application.
!

!****************************************************************************
!
!  PROGRAM: test_NORM2
!
!  PURPOSE:  Entry point for the console application.
!
!****************************************************************************
module my_module
    interface my_NORM2
        module procedure real_my_NORM2, real8_my_NORM2
    end interface my_NORM2
    contains
        function real_my_NORM2(x) result(ret)
            real :: ret
            real :: x(:)
            real(8) :: temp
            integer :: i
            temp = 0.0_8
            do i=1, size(x)
                temp = temp + real(x(i),kind(temp))**2
            end do
            ret = real(sqrt(temp),kind(ret))
        end function real_my_NORM2
        function real8_my_NORM2(x) result(ret)
            real(8) :: ret
            real(8) :: x(:)
            real(16) :: temp
            integer :: i
            temp = 0.0_8
            do i=1, size(x)
                temp = temp + real(x(i),kind(temp))**2
            end do
            ret = real(sqrt(temp),kind(ret))
        end function real8_my_NORM2
end module my_module
    program test_NORM2
    use my_module
    implicit none

    real               :: expected
    real, dimension(2) :: vector
    real(8)               :: expected8
    real(8), dimension(2) :: vector8
    integer            :: i

    !vector = [3.0/10.0 * sqrt(huge(1.0)), 4.0/10.0 * sqrt(huge(1.0))]
    vector   = [3.0, 4.0]
    expected = 5.0
    vector8   = [3.0_8, 4.0_8]
    expected8 = 5.0_8

    write(*,*) sqrt(huge(1.0)), huge(1.0)
    write(*,*) sqrt(huge(1.0_8)), huge(1.0_8)
    do i = 1,40
        vector   = vector * 10.0
        expected = expected * 10.0
        write(*,'("real  ",i3, X, 3E20.6)') i, my_norm2(vector), sqrt(sum(vector**2)), expected
        vector8   = vector8 * 10.0_8
        expected8 = expected8 * 10.0_8
        write(*,'("real8 ",i3, X, 3E20.6)') i, my_norm2(vector8), sqrt(sum(vector8**2)), expected8
    enddo

    end program test_NORM2

Output:

  1.8446743E+19  3.4028235E+38
  1.340780792994260E+154  1.797693134862316E+308
real    1         0.500000E+02        0.500000E+02        0.500000E+02
real8   1         0.500000E+02        0.500000E+02        0.500000E+02
real    2         0.500000E+03        0.500000E+03        0.500000E+03
real8   2         0.500000E+03        0.500000E+03        0.500000E+03
real    3         0.500000E+04        0.500000E+04        0.500000E+04
real8   3         0.500000E+04        0.500000E+04        0.500000E+04
real    4         0.500000E+05        0.500000E+05        0.500000E+05
real8   4         0.500000E+05        0.500000E+05        0.500000E+05
real    5         0.500000E+06        0.500000E+06        0.500000E+06
real8   5         0.500000E+06        0.500000E+06        0.500000E+06
real    6         0.500000E+07        0.500000E+07        0.500000E+07
real8   6         0.500000E+07        0.500000E+07        0.500000E+07
real    7         0.500000E+08        0.500000E+08        0.500000E+08
real8   7         0.500000E+08        0.500000E+08        0.500000E+08
real    8         0.500000E+09        0.500000E+09        0.500000E+09
real8   8         0.500000E+09        0.500000E+09        0.500000E+09
real    9         0.500000E+10        0.500000E+10        0.500000E+10
real8   9         0.500000E+10        0.500000E+10        0.500000E+10
real   10         0.500000E+11        0.500000E+11        0.500000E+11
real8  10         0.500000E+11        0.500000E+11        0.500000E+11
real   11         0.500000E+12        0.500000E+12        0.500000E+12
real8  11         0.500000E+12        0.500000E+12        0.500000E+12
real   12         0.500000E+13        0.500000E+13        0.500000E+13
real8  12         0.500000E+13        0.500000E+13        0.500000E+13
real   13         0.500000E+14        0.500000E+14        0.500000E+14
real8  13         0.500000E+14        0.500000E+14        0.500000E+14
real   14         0.500000E+15        0.500000E+15        0.500000E+15
real8  14         0.500000E+15        0.500000E+15        0.500000E+15
real   15         0.500000E+16        0.500000E+16        0.500000E+16
real8  15         0.500000E+16        0.500000E+16        0.500000E+16
real   16         0.500000E+17        0.500000E+17        0.500000E+17
real8  16         0.500000E+17        0.500000E+17        0.500000E+17
real   17         0.500000E+18        0.500000E+18        0.500000E+18
real8  17         0.500000E+18        0.500000E+18        0.500000E+18
real   18         0.500000E+19        0.500000E+19        0.500000E+19
real8  18         0.500000E+19        0.500000E+19        0.500000E+19
real   19         0.500000E+20            Infinity        0.500000E+20
real8  19         0.500000E+20        0.500000E+20        0.500000E+20
real   20         0.500000E+21            Infinity        0.500000E+21
real8  20         0.500000E+21        0.500000E+21        0.500000E+21
real   21         0.500000E+22            Infinity        0.500000E+22
real8  21         0.500000E+22        0.500000E+22        0.500000E+22
real   22         0.500000E+23            Infinity        0.500000E+23
real8  22         0.500000E+23        0.500000E+23        0.500000E+23
real   23         0.500000E+24            Infinity        0.500000E+24
real8  23         0.500000E+24        0.500000E+24        0.500000E+24
real   24         0.500000E+25            Infinity        0.500000E+25
real8  24         0.500000E+25        0.500000E+25        0.500000E+25
real   25         0.500000E+26            Infinity        0.500000E+26
real8  25         0.500000E+26        0.500000E+26        0.500000E+26
real   26         0.500000E+27            Infinity        0.500000E+27
real8  26         0.500000E+27        0.500000E+27        0.500000E+27
real   27         0.500000E+28            Infinity        0.500000E+28
real8  27         0.500000E+28        0.500000E+28        0.500000E+28
real   28         0.500000E+29            Infinity        0.500000E+29
real8  28         0.500000E+29        0.500000E+29        0.500000E+29
real   29         0.500000E+30            Infinity        0.500000E+30
real8  29         0.500000E+30        0.500000E+30        0.500000E+30
real   30         0.500000E+31            Infinity        0.500000E+31
real8  30         0.500000E+31        0.500000E+31        0.500000E+31
real   31         0.500000E+32            Infinity        0.500000E+32
real8  31         0.500000E+32        0.500000E+32        0.500000E+32
real   32         0.500000E+33            Infinity        0.500000E+33
real8  32         0.500000E+33        0.500000E+33        0.500000E+33
real   33         0.500000E+34            Infinity        0.500000E+34
real8  33         0.500000E+34        0.500000E+34        0.500000E+34
real   34         0.500000E+35            Infinity        0.500000E+35
real8  34         0.500000E+35        0.500000E+35        0.500000E+35
real   35         0.500000E+36            Infinity        0.500000E+36
real8  35         0.500000E+36        0.500000E+36        0.500000E+36
real   36         0.500000E+37            Infinity        0.500000E+37
real8  36         0.500000E+37        0.500000E+37        0.500000E+37
real   37         0.500000E+38            Infinity        0.500000E+38
real8  37         0.500000E+38        0.500000E+38        0.500000E+38
real   38             Infinity            Infinity            Infinity
real8  38         0.500000E+39        0.500000E+39        0.500000E+39
real   39             Infinity            Infinity            Infinity
real8  39         0.500000E+40        0.500000E+40        0.500000E+40
real   40             Infinity            Infinity            Infinity
real8  40         0.500000E+41        0.500000E+41        0.500000E+41
Press any key to continue . . .

Jim Dempsey

avinashs · ‎02-14-2019

I had raised the same issue regarding the HYPOT function. A very simple implementation of the Euclidian norm function (MYNORM2) as shown in the output below does not overflow until HUGE(1.0d0) = 1.797693134862316E+308 is exceeded in double precision (KIND = 8). However, the intrinsic NORM2 function overflows at approximately SQRT(HUGE(1.0d0)) as shown in the posts above.

program main
  
  implicit none
  integer(kind = 4) :: n, i
  real(kind = 8), dimension(:), allocatable :: x

  write(*,*) huge(1.0d0)
  read *
  
  n = 5
  allocate(x(n))
  x = 1.0d0
  i = 0
  do
     i = i + 1
     if (i == 330) exit
     write(*,'(1x,a,i5,3(2x,a,es13.6))') 'i = ', i, 'x = ', maxval(x), 'IVF NORM2 = ', norm2(x), 'MYNORM2 = ', mynorm2(x)
     x = x*10.0d0
  end do
  deallocate(x)
  read *

contains

  function mynorm2(x) result(r)
    real(kind = 8) :: r
    real(kind = 8) :: x(:)
    real(kind = 8) :: temp
    temp = maxval(abs(x))
    r = temp*sqrt(sum((x/temp)**2))
  end function mynorm2
  
end program main

Selected output

 i =     1  	x =  1.000000E+00       IVF NORM2 =  2.236068E+00       MYNORM2 =  2.236068E+00
 ...
 i =   153  	x =  1.000000+152       IVF NORM2 =  2.236068+152       MYNORM2 =  2.236068+152
 i =   154  	x =  1.000000+153       IVF NORM2 =  2.236068+153       MYNORM2 =  2.236068+153
 i =   155  	x =  1.000000+154       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+154
 i =   156  	x =  1.000000+155       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+155
 ...
 i =   201  	x =  1.000000+200       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+200
 ...
 i =   251  	x =  1.000000+250       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+250
 ...
 i =   301  	x =  1.000000+300       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+300
 i =   302  	x =  1.000000+301       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+301
 i =   303  	x =  1.000000+302       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+302
 i =   304  	x =  1.000000+303       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+303
 i =   305  	x =  1.000000+304       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+304
 i =   306  	x =  1.000000+305       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+305
 i =   307  	x =  1.000000+306       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+306
 i =   308  	x =  1.000000+307       IVF NORM2 =      Infinity       MYNORM2 =  2.236068+307
 i =   309  	x =  1.000000+308       IVF NORM2 =      Infinity       MYNORM2 =      Infinity
 i =   310  	x =      Infinity       IVF NORM2 =      Infinity       MYNORM2 =           NaN
 i =   311  	x =      Infinity       IVF NORM2 =      Infinity       MYNORM2 =           NaN

jimdempseyatthecove · ‎02-15-2019

While the above protects the exponent from overflowing, a problem you have with the above solution is the lesser magnitude contributors (values of x) may get excluded during the additions. One has to match the algorithm with the expected input data values.

Jim Dempsey

John_Campbell · ‎02-17-2019

I thought NORM2 was provided to overcome the problem being reported, without the need of writing our own NORM2.
The documentation does specify:

"It is recommended that the processor compute the result without undue overflow or underflow."

jimdempseyatthecove · ‎02-18-2019

That would be my understanding (but not observation) too, with the additional requirement of using the highest precision sqrt method.

Jim Dempsey