Solved: scale a real number until mantissa is denormalized

codorniz · 08-21-2009

Hi,
I am a student of mathematics. While trying out something like an arithmetic implementation, I believe I found a scaling problem.

The Fortran code below stores the smallest normalized number in the variable x. Since x already uses the smallest exponent, 'scale(x, -1)' is supposed to shift the mantissa of x to the right, giving a denormalized number. But with optimization enabled (-O1, -O2 or -O3), scale(x, -1) returns 0.0E+0, which is wrong.
If I switch off optimization with -O0, I get the expected result: scale(x, -1) = 1.112536929253601E-308

In addition, I ran a test combining scalar multiplication and scaling; see the results below.

[cpp]program p_scale
  implicit none
  integer, parameter :: pr = KIND(0.0D0)
  real(KIND=pr)      :: x

  write(*,*) 'radix(x)             = ', radix(x)
  x= tiny(x)       ! smallest normalized number
  write(*,*) '(tiny)              x= ', x

  write(*,*) 'scale tiny ...'
  write(*,*) 'scale(x,-1)          = ', scale(x,-1)
  write(*,*) 'scale(x*0.5,-1)      = ', scale(x*0.5_pr,-1)
  write(*,*) 'x*0.25_pr            = ', x*0.25_pr
  write(*,*) 'scale(x,+1)          = ', scale(x,+1)
  write(*,*) 'scale(scale(x,+1),-1)= ', scale(scale(x,+1),-1)
  write(*,*) 'scale(scale(x,-1),+1)= ', scale(scale(x,-1),+1)
end program p_scale
[/cpp]

The output with optimization enabled is wrong; please see for yourself.
ifort -O3 p_scale.f90 -o ps_ifoO3.out

[cpp] radix(x)             =            2
 (tiny)              x=   2.225073858507201E-308
 scale tiny ...
 scale(x,-1)          =   0.000000000000000E+000
 scale(x*0.5,-1)      =   1.112536929253601E-308
 x*0.25_pr            =   5.562684646268003E-309
 scale(x,+1)          =   4.450147717014403E-308
 scale(scale(x,+1),-1)=   2.225073858507201E-308
 scale(scale(x,-1),+1)=   0.000000000000000E+000
[/cpp]

Now the output without optimization. This is actually what I expected. But I do not want to compile the arithmetic with optimization turned off.
ifort -O0 p_scale.f90 -o ps_ifoO0.out

[plain] radix(x)             =            2
 (tiny)              x=   2.225073858507201E-308
 scale tiny ...
 scale(x,-1)          =   1.112536929253601E-308
 scale(x*0.5,-1)      =   5.562684646268003E-309
 x*0.25_pr            =   5.562684646268003E-309
 scale(x,+1)          =   4.450147717014403E-308
 scale(scale(x,+1),-1)=   2.225073858507201E-308
 scale(scale(x,-1),+1)=   2.225073858507201E-308[/plain]

If that is helpful:
Compiler Version : 11.0.084
My processor : Intel Core 2 Duo
Operating System : Ubuntu 8.10, 64 bit

So far, that is it. What can I do?

Thanks in advance for your answer.

TimP · 08-21-2009

If you want IEEE compliant behavior, including gradual underflow rather than fast abrupt underflow, you must set -no-ftz, or an option which implies -no-ftz, such as -fp-model source (for compilation of the main program). CPUs which support good performance with gradual underflow are still in the future, thus the coupling of abrupt underflow setting with optimization.
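To see which underflow mode a given build actually uses, a small run-time check can help. The following is only a minimal sketch, not part of the original program: the program name `check_underflow` and the variable `half` are made up for illustration, and `volatile` is used on the assumption that it keeps the compiler from evaluating the product at compile time. Building it once with plain `-O3` and once with `-O3 -no-ftz` should show the difference.

```fortran
program check_underflow
  implicit none
  integer, parameter :: pr = kind(0.0d0)
  ! volatile (assumption): forces the multiplication to happen at run time,
  ! so the result reflects the underflow mode of the running program
  real(kind=pr), volatile :: x
  real(kind=pr)           :: half

  x    = tiny(x)        ! smallest normalized number, 2**(-1022)
  half = x * 0.5_pr     ! exact result 2**(-1023) is denormalized

  if (half == 0.0_pr) then
    write(*,*) 'abrupt underflow: denormalized results are flushed to zero'
  else
    write(*,*) 'gradual underflow: tiny/2 = ', half
  end if
end program check_underflow
```

The comparison with zero also reports abrupt underflow when denormalized operands are treated as zero, which is the same situation from the caller's point of view.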

codorniz · 08-22-2009

Quoting - tim18

If you want IEEE compliant behavior, including gradual underflow rather than fast abrupt underflow, you must set -no-ftz, or an option which implies -no-ftz, such as -fp-model source (for compilation of the main program). CPUs which support good performance with gradual underflow are still in the future, thus the coupling of abrupt underflow setting with optimization.

Thanks, using either of the two options solved my problem. And thanks a lot for the background information. :-)

TimP · 08-23-2009

Quoting - codorniz

Thanks, using either of the two options solved my problem. And thanks a lot for the background information. :-)

If you want gradual underflow regardless of the compile options, add this to your program:
use ieee_arithmetic
call ieee_set_underflow_mode(gradual=.true.)
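Put into a complete program, the two lines above might look like the sketch below. It reuses the `scale` tests from the first post; the program name `p_gradual` and the variable `gradual` are illustrative only. The inquiry `ieee_support_underflow_control` and the subroutine `ieee_get_underflow_mode` come from the same standard `ieee_arithmetic` module and are used here to guard and confirm the mode switch, on the assumption that the compiler implements them.

```fortran
program p_gradual
  use, intrinsic :: ieee_arithmetic
  implicit none
  integer, parameter :: pr = kind(0.0d0)
  real(kind=pr)      :: x
  logical            :: gradual

  x = tiny(x)       ! smallest normalized number

  ! Only switch the mode if the processor allows control for this kind
  if (ieee_support_underflow_control(x)) then
    call ieee_set_underflow_mode(gradual=.true.)
    call ieee_get_underflow_mode(gradual)
    write(*,*) 'gradual underflow active: ', gradual
  else
    write(*,*) 'underflow mode cannot be controlled for this kind'
  end if

  ! Same checks as in the original test program
  write(*,*) 'scale(x,-1)          = ', scale(x,-1)
  write(*,*) 'scale(scale(x,-1),+1)= ', scale(scale(x,-1),+1)
end program p_gradual
```

With gradual underflow in effect, the last line should give back tiny(x) instead of zero, as in the -O0 output above.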