sqrt for complex -- which root?

bfelipe · ‎09-17-2009

Hello:

In computing the sqrt of a complex number I found that under certain conditions (with or without -O0) a program may return the positive of the negative square root. The second time it is computed, the program below returns the positive or negative square root depending on the compile option:

user% ifort -o sqrt-O0 sqrt.f90 -O0
user% ifort -o sqrt sqrt.f90
user% ./sqrt
mu = (0.000000000000000E+000,1.12948209069891)
mu = (0.000000000000000E+000,-1.12948225383955)
user% ./sqrt-O0
mu = (0.000000000000000E+000,1.12948209069891)
mu = (0.000000000000000E+000,1.12948225383955)

How can I avoid this and get just one root? Shall I just check whether I get a positive of negative root? The behaviour seems whimsical to me. Of course, if I replace the calls to cmplx(x,y) to cmplx(x,y,kind(0.0d0)) the problem goes away, but then the imaginary part of the sqrt is 1.12948263418274. Is this last one the most "exact" value?

Thanks much.

Felipe-.
--------------
program testsqrt
implicit none

double complex k, kpc
double precision kp

k = cmplx(4.23966631715597d0,0.0d0)
kp = 4.38753934475205d0
kpc = cmplx(kp,0.0d0)
write(*,*) ' mu = ', sqrt( k*k - kp*kp )
write(*,*) ' mu = ', sqrt( k*k - kpc*kpc )

end program testsqrt

Steven_L_Intel1 · ‎09-17-2009

Prior to Fortran 2003, the standard said:

A result of type complex is the principal value with the real part greater than or equal to zero. When the real part of the result is zero, the imaginary part is greater than or equal to zero.

Fortrtan 2003 says:

A result of type complex is the principal value with the real part greater than or equal to zero. When the real part of the result is zero, the imaginary part has the same sign as the imaginary part of X.

We had to change the compiler to accomodate this. Which compiler version are you using? With 11.1, I see:

mu = (0.000000000000000E+000,1.12948209069891)
mu = (0.000000000000000E+000,1.12948225383955)

with and without optimization. I note that the "imaginary part of X" is zero in both cases.

bfelipe · ‎09-17-2009

Quoting - Steve Lionel (Intel)

Prior to Fortran 2003, the standard said:

A result of type complex is the principal value with the real part greater than or equal to zero. When the real part of the result is zero, the imaginary part is greater than or equal to zero.

Fortrtan 2003 says:

A result of type complex is the principal value with the real part greater than or equal to zero. When the real part of the result is zero, the imaginary part has the same sign as the imaginary part of X.

We had to change the compiler to accomodate this. Which compiler version are you using? With 11.1, I see:

mu = (0.000000000000000E+000,1.12948209069891)
mu = (0.000000000000000E+000,1.12948225383955)

with and without optimization.

Thanks for your quick response, Steve. I tried using 10.1 and 11.0 on IA64, both give the behaviour I described. The system I use does not have 11.1

If possible I would like to avoid testing for the sign of the imaginary part each time I compute the sqrt, which I do very often, as this may impact the speed of my code. Other than swithching to 11.1, do you see other ways out?

*** So does this mean that the Fortran 2003 standard left undefined the case when the imaginary part of X is zero?

*** What concerns me most, regardless of what the standard says, is that the optimization changed the sign of the result. I think that is dangerous.

Steven_L_Intel1 · ‎09-17-2009

IA64, hmm? I tested on IA-32. Is it possible that the imaginary part of the argument is just slightly negative in some cases? Optimization might change how the expression was computed.

I don't know of a way to control this behavior. Note that the new behavior (which was in 10.1, I think), uses the sign of the imaginary part of the argument.

bfelipe · ‎09-17-2009

I think you are right, the last bit of the variable kpcc is different with and without optimization in the program below. In the optimized version the line:

id(4) = ibset(id(4),31) ! set the last bit to 1

has not effect because the last bit is already 1. However, in the unoptimized version the last bit of kpcc is 0, so that setting it to 1 (and making it equal to kpcc in the optimized version) results in sqrt returning the negative root!

Not that I know how this can help me in getting the positive root without testing every time I call sqrt, but at least I know sqrt does not seem to be the culprit, the real culprit is the argument passed that differs in the last bit.

Felipe-.
--------------------------------------
user% cat sqrt-bit.f90
program testsqrt
implicit none

integer i, j
integer id(4)
double complex k, kpc, kpcc, foo
double precision kp

k = cmplx(4.23966631715597d0,0.0d0)
kp = 4.38753934475205d0
kpc = cmplx(kp,0.0d0)
kpcc = k*k - kpc*kpc

write(*,*) ' sizes = ', sizeof(id), sizeof(kpcc)
! print out kpcc bit by bit
id = transfer(kpcc,id)
write(*,*) ((btest(id(j),i),i=0,31),j=1,4)
write(*,*) ' mu = ', sqrt( kpcc )

id(4) = ibset(id(4),31) ! set the last bit to 1

write(*,*) ((btest(id(j),i),i=0,31),j=1,4)
kpcc = transfer(id,kpcc)
write(*,*) ' mu = ', sqrt( kpcc )

end program testsqrt

user% ifort -o sqrt-bit-O0 sqrt-bit.f90 -O0
user% ifort -o sqrt-bit sqrt-bit.f90
user% ./sqrt-bit
sizes = 16 16
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)
user% ./sqrt-bit-O0
sizes = 16 16
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F F
mu = (0.000000000000000E+000,1.12948225383955)
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)

Olivier_C_ · ‎09-21-2009

Quoting - bfelipe

I think you are right, the last bit of the variable kpcc is different with and without optimization in the program below. In the optimized version the line:

id(4) = ibset(id(4),31) ! set the last bit to 1

has not effect because the last bit is already 1. However, in the unoptimized version the last bit of kpcc is 0, so that setting it to 1 (and making it equal to kpcc in the optimized version) results in sqrt returning the negative root!

Not that I know how this can help me in getting the positive root without testing every time I call sqrt, but at least I know sqrt does not seem to be the culprit, the real culprit is the argument passed that differs in the last bit.

Felipe-.
--------------------------------------
user% cat sqrt-bit.f90
program testsqrt
implicit none

integer i, j
integer id(4)
double complex k, kpc, kpcc, foo
double precision kp

k = cmplx(4.23966631715597d0,0.0d0)
kp = 4.38753934475205d0
kpc = cmplx(kp,0.0d0)
kpcc = k*k - kpc*kpc

write(*,*) ' sizes = ', sizeof(id), sizeof(kpcc)
! print out kpcc bit by bit
id = transfer(kpcc,id)
write(*,*) ((btest(id(j),i),i=0,31),j=1,4)
write(*,*) ' mu = ', sqrt( kpcc )

id(4) = ibset(id(4),31) ! set the last bit to 1

write(*,*) ((btest(id(j),i),i=0,31),j=1,4)
kpcc = transfer(id,kpcc)
write(*,*) ' mu = ', sqrt( kpcc )

end program testsqrt

user% ifort -o sqrt-bit-O0 sqrt-bit.f90 -O0
user% ifort -o sqrt-bit sqrt-bit.f90
user% ./sqrt-bit
sizes = 16 16
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)
user% ./sqrt-bit-O0
sizes = 16 16
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F F
mu = (0.000000000000000E+000,1.12948225383955)
F F F F F F F F F F F F F F F F T T F F T T F T T T T F F F F F F F T F F T T F
T F F T F T T F F F T F T T T T T T T T T T F T F F F F F F F F F F F F F F F F
F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F F T
mu = (0.000000000000000E+000,-1.12948225383955)

If you do not mind performance too much, or if you only want to make sure your code is not buggy when you change platform -compiler release, processor family such as IA64 versus x86_64 - I advise you the MPC library based on MPFR, itself based on GMP for a clean normalized by the IEEE standards implementation of the round-offs of floating-point arithmetics.
Follow the link
http://www.multiprecision.org/index.php?prog=mpc
In my business, I statically overload all the intrinsics functions (such as sqrt, exp...) with calls to the MPFR library. Thereby, I can build a code that behaves EXACTLY the same on all platforms I used so far (alpha, Itaniums, Xeon...). I unplug all that for the production release for performance reasons, but there it depends on the relative cost of the intrinsics function in your code.
Sincerely,
Olivier.

Olivier_C_ · ‎09-22-2009

Quoting - cessenat@free.fr

If you do not mind performance too much, or if you only want to make sure your code is not buggy when you change platform -compiler release, processor family such as IA64 versus x86_64 - I advise you the MPC library based on MPFR, itself based on GMP for a clean normalized by the IEEE standards implementation of the round-offs of floating-point arithmetics.
Follow the link
http://www.multiprecision.org/index.php?prog=mpc
In my business, I statically overload all the intrinsics functions (such as sqrt, exp...) with calls to the MPFR library. Thereby, I can build a code that behaves EXACTLY the same on all platforms I used so far (alpha, Itaniums, Xeon...). I unplug all that for the production release for performance reasons, but there it depends on the relative cost of the intrinsics function in your code.
Sincerely,
Olivier.

If it helps, I send you a program that implements the complex square root on the command line.
Detailed explanations are provided in the program.
Olivier.