- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
That FP arithmetic can be tricky in multi-threaded environment is well known. However, I am getting strange results depending on whether I use -O2 optimization or not, even though the code itself is serial. In particular, in the enclosed sample code, four complex arrays, z1, z2, z3, and z4, are initialized with random numbers so that their value ranges are pairwise at different scales. Then the operation z1*z2 + z3*z4 performed elementwise and stored in an array z. Finally sum(z) is computed. If I compile with -O2 flag, the result depends on whether the operands are in the above order, or like: z2*z1 + z4*z3 (controlled at runtime by a command line argument). If I compile with -g, or with -fp-model precise, the two results are identical. Can anyone explain why this could be happening?
Thanks,
George
Thanks,
George
Link Copied
25 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
A one-liner answer is: "Ahem, random numbers are random!"
According to the Fortran standard, the sequence of random numbers you generate is processor dependent.
The Intel Fortran Reference manual states in the entry for RANDOM_NUMBER:
"If RANDOM_SEED is not used, the processor sets the seed for RANDOM_NUMBER to a processor-dependent value."
Since you are computing SUM( z1.z2 + z3.z4) and SUM(z2.z1 + z4.z3) in separate runs, you may be using random sequences generated from different seeds. Thus, you end up comparing two numbers (the sums) which are also random numbers, albeit with much smaller standard deviations.
If you want to make a valid comparison test for commutativity, you should explicitly do CALL RANDOM_SEED, and with the same initial set of seeds.
Or, to avoid having to set the seeds, in your program declare za and zb instead of z, and compute and print as follows.
za = z1*z2 + z3*z4
zb = z2*z1 + z4*z3
print *, sum(za)
print *, sum(zb)
According to the Fortran standard, the sequence of random numbers you generate is processor dependent.
The Intel Fortran Reference manual states in the entry for RANDOM_NUMBER:
"If RANDOM_SEED is not used, the processor sets the seed for RANDOM_NUMBER to a processor-dependent value."
Since you are computing SUM( z1.z2 + z3.z4) and SUM(z2.z1 + z4.z3) in separate runs, you may be using random sequences generated from different seeds. Thus, you end up comparing two numbers (the sums) which are also random numbers, albeit with much smaller standard deviations.
If you want to make a valid comparison test for commutativity, you should explicitly do CALL RANDOM_SEED, and with the same initial set of seeds.
Or, to avoid having to set the seeds, in your program declare za and zb instead of z, and compute and print as follows.
za = z1*z2 + z3*z4
zb = z2*z1 + z4*z3
print *, sum(za)
print *, sum(zb)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I disagree about the random numbers. If you do not call RANDOM_SEED, the seed is set at compile time and will always generate the same random numbers in the same order every time you call the program. If you recompile, then the random numbers change. For example:
And then I run it:
This is discussed here:
http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/
Tim
[fortran]PROGRAM RandTest IMPLICIT NONE REAL :: randNum INTEGER :: I DO I = 1, 10 CALL RANDOM_NUMBER(randNum) PRINT *, randNum END DO END PROGRAM RandTest [/fortran]
And then I run it:
[bash]|tgallagher@harpy:test|> ./a.out 3.9208680E-07 2.5480442E-02 0.3525161 0.6669145 0.9630555 0.8382882 0.3353550 0.9153272 0.7958636 0.8326931 |tgallagher@harpy:test|> ./a.out 3.9208680E-07 2.5480442E-02 0.3525161 0.6669145 0.9630555 0.8382882 0.3353550 0.9153272 0.7958636 0.8326931 |tgallagher@harpy:test|> ./a.out 3.9208680E-07 2.5480442E-02 0.3525161 0.6669145 0.9630555 0.8382882 0.3353550 0.9153272 0.7958636 0.8326931 [/bash]As you can see, all the numbers are the same and in the same order because the seed is set at compile time. So, provided the code isn't recompiled, the differences in the sum's is due to floating point math consistency and not due to different random numbers.
This is discussed here:
http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/
Tim
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Well, I have to see the magnitudes of the differences that the OP alluded to. I stand by my position that one cannot count on how a program will behave when the Fortran Standard says it is processor-dependent.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The way our implementation works is that if you do not call RANDOM_SEED, the seed chosen is a fixed value. If you call RANDOM_SEED with no arguments, a seed based on the time-of-day clock is used.
That said, the order of operations, and vectorization, can slightly change values. Floating point arithmetic operations are often not computationally communtative.
That said, the order of operations, and vectorization, can slightly change values. Floating point arithmetic operations are often not computationally communtative.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You may be interested in checking whether both loops are vectorized, and whether all arrays are aligned, and then treated by the compiler as aligned. In principle, the recent compilers should take care to produce same results regardless of alignment on this kind of loop.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Here are the results from one run, in which the same z1 and z2 arrays were used to compute za and zb using vector assignments rather than loops, as in #1:
za: (1.786555654004733E-007,1.688695763736762E-007)
zb: (1.786555654004733E-007,1.688695763736758E-007)
There is a discrepancy of about 2 m in the imaginary part, and none in the real part.
Considering that 2.6 million complex numbers were added, the evaluation of each of which involved 2 X ( 4 multiplications + 3 additions ), I think that the discrepancy is negligibly small, and that fact is not surprising either; because the random numbers used are uniformly generated between (-a,a) and (-b, b), one need not fear accumulation of round-off error.
I have deliberately omitted mention of the compiler version, options, etc. I think that I need to be convinced that there is a problem to be solved.
I also ask whether we should not be more worried about the value itself (about 2.10-7) being unacceptably large, given that the expected value is zero and the computed value is over 1000 times the range of magnitudes of za, zb.
za: (1.786555654004733E-007,1.688695763736762E-007)
zb: (1.786555654004733E-007,1.688695763736758E-007)
There is a discrepancy of about 2 m in the imaginary part, and none in the real part.
Considering that 2.6 million complex numbers were added, the evaluation of each of which involved 2 X ( 4 multiplications + 3 additions ), I think that the discrepancy is negligibly small, and that fact is not surprising either; because the random numbers used are uniformly generated between (-a,a) and (-b, b), one need not fear accumulation of round-off error.
I have deliberately omitted mention of the compiler version, options, etc. I think that I need to be convinced that there is a problem to be solved.
I also ask whether we should not be more worried about the value itself (about 2.10-7) being unacceptably large, given that the expected value is zero and the computed value is over 1000 times the range of magnitudes of za, zb.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Steve, I have two related comments/questions. One is that the documentation of RANDOM_SEED at
http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/fortran/lin/lref_for/source_files/rfrandms.htm
says, in the entry for RANDOM_SEED that "...If no argument is specified, a random number based on the date and time is assigned to the seed." This does not quite agree with the first line of output from the program
http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/fortran/lin/lref_for/source_files/rfrandms.htm
says, in the entry for RANDOM_SEED that "...If no argument is specified, a random number based on the date and time is assigned to the seed." This does not quite agree with the first line of output from the program
[fortran]program getseedThe first line did not change even after I changed the PC clock to a different date.
integer :: n,seed(10)
call random_seed(size=n)
if(n.gt.10)stop 'seed array too small'
call random_seed(get=seed)
write(*,10)'Before PUT : ',seed(1:n)
seed(1:n)=seed(1:n)*2
call random_seed(put=seed)
call random_seed(get=seed)
write(*,10)'After PUT : ',seed(1:n)
10 format(A,10(2x,Z11))
end program getseed
[/fortran]
[bash]Before PUT : 7FFFFFAA 7FFFFF06The second comment is that I did not expect the second line to be the same as the first. With other compilers, I do see that the second line has values twice those of the first (with possible overflow ignored)
After PUT : 7FFFFFAA 7FFFFF06
[/bash]
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So I can confirm that your example gives the same seeds/random numbers. But we cooked up an example that generates a difference:
Ironically, the random numbers/seeds are different on Intel but when we compile and run with gfortran, the seeds and random numbers are always the same before and after the second call to IntializeRandomSeed.
Tim
[fortran]MODULE Random_m IMPLICIT NONE PRIVATE PUBLIC :: InitializeRandomSeed CONTAINS SUBROUTINE InitializeRandomSeed(seed_input) INTEGER, INTENT(IN) :: seed_input INTEGER :: i, n, clock INTEGER, DIMENSION(:), ALLOCATABLE :: seed CALL RANDOM_SEED(SIZE=n) ALLOCATE(seed(n)) IF (seed_input > 0) THEN seed = seed_input ELSE CALL SYSTEM_CLOCK(COUNT=clock) seed = clock + 37 * (/ (i - 1, i = 1, n) /) END IF CALL RANDOM_SEED(PUT=seed) CALL RANDOM_SEED(GET=seed) PRINT *, seed DEALLOCATE(seed) END SUBROUTINE InitializeRandomSeed END MODULE Random_m PROGRAM TestRandom USE Random_m INTEGER :: i, n, rsInput REAL :: x n = 10 rsInput = -1 CALL InitializeRandomSeed(rsInput) DO i = 1, n CALL RANDOM_NUMBER(x) WRITE(*,*) 'x = ', x END DO WRITE(*,*) 'AGAIN' CALL InitializeRandomSeed(rsInput) DO i = 1, n CALL RANDOM_NUMBER(x) WRITE(*,*) 'x = ', x END DO END PROGRAM TestRandom [/fortran]
Ironically, the random numbers/seeds are different on Intel but when we compile and run with gfortran, the seeds and random numbers are always the same before and after the second call to IntializeRandomSeed.
Tim
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
This is a shot in the dark: perhaps the Intel RNG needs to be primed by having it generate a few output numbers before it actually uses a new seed given to it with a PUT request.
GFortran (I have V 4.5) uses an 8-element seed vector. Your example in #8 sends it a single value, which is spread to the whole vector by the statement
It would not be surprising if GFortran ignored a PUT request with an incomplete/degenerate seed vector.
GFortran (I have V 4.5) uses an 8-element seed vector. Your example in #8 sends it a single value, which is spread to the whole vector by the statement
[fortran] IF (seed_input > 0) THEN seed = seed_input [/fortran]
It would not be surprising if GFortran ignored a PUT request with an incomplete/degenerate seed vector.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
mecej4, I don't see that your program called RANDOM_SEED with no arguments. But I agree that the behavior is unexpected and I will look into that. I modified the program to call RANDOM_SEED with no arguments and was puzzled to see that I got the same seed back each time - didn't expect that either. I'll have a word or three with the library developers tomorrow about that.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Steve,
> I don't see that your program called RANDOM_SEED with no arguments.
Quite so. I assumed that if RANDOM_NUMBER was called with no preceding call to RANDOM_SEED in the current process, the state of the RNG would be the same as if an implicit call to RANDOM_NUMBER had been made with no arguments.
Thanks for looking into this.
> I don't see that your program called RANDOM_SEED with no arguments.
Quite so. I assumed that if RANDOM_NUMBER was called with no preceding call to RANDOM_SEED in the current process, the state of the RNG would be the same as if an implicit call to RANDOM_NUMBER had been made with no arguments.
Thanks for looking into this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So now I'm more confused.
The example calls the Initialize routine with -1 always, so it goes into the ELSE clause that uses SYSTEM_CLOCK.
So I modified it so that the seed is no longer degenerate by:
Tim
The example calls the Initialize routine with -1 always, so it goes into the ELSE clause that uses SYSTEM_CLOCK.
So I modified it so that the seed is no longer degenerate by:
[fortran] IF (seed_input > 0) THEN seed = seed_input ELSE DO I = 1, n CALL SYSTEM_CLOCK(COUNT=clock) seed(I) = clock*(I-1) END DO END IF[/fortran]With that change, gfortran generates different random numbers but now this example with Intel generates the same random numbers and the same seed after the PUT.
Tim
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
George,
The problem you are observing in apparent non-commutativity is your additions are saturating the mantissa of the floating point numbers. This means you are experiencing rounding errors at different points in the calculations dependent on the sequence of operations.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>>If I compile with -g, or with -fp-model precise, the two results are identical
With those options, the generated code uses FPU instructions as opposed to SSE instructions. FPU internal TEMP (FPU stack) are 80-bit (greater precision than double). Therefore you will see less round off error (no apparent error in many more cases). Rounding errors are a fact of life with some sequences of arithmatic.
Jim Dempsey
With those options, the generated code uses FPU instructions as opposed to SSE instructions. FPU internal TEMP (FPU stack) are 80-bit (greater precision than double). Therefore you will see less round off error (no apparent error in many more cases). Rounding errors are a fact of life with some sequences of arithmatic.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The "feature" of changing from SSE to X87 at -O0 (implied by -g, when no -O level is set) is present only in 11.1 and earlier compilers. As I understand it, the 12.0 compilers generate X87 code only when explicitly requested (by the 32-bit -mia32 option).
-fp-model precise (same as -fp-model source, for ifort, but not for icc) by itself doesn't change from SSE to X87 instructions. It does restore gradual underflow and strict compliance in the C as well as Fortran sense with the order of operations implied in your source code.
In the 12.0 compilers, the new option -standard-semantics is available to comply more fully with Fortran standards, apparently without setting gradual underflow. I haven't checked whether it disables optimization of sum and dot product reductions, as -fp-model precise|source do.
-fp-model precise (same as -fp-model source, for ifort, but not for icc) by itself doesn't change from SSE to X87 instructions. It does restore gradual underflow and strict compliance in the C as well as Fortran sense with the order of operations implied in your source code.
In the 12.0 compilers, the new option -standard-semantics is available to comply more fully with Fortran standards, apparently without setting gradual underflow. I haven't checked whether it disables optimization of sum and dot product reductions, as -fp-model precise|source do.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I've got some interesting info regarding RANDOM_SEED. It seems that if you do a PUT with a seed that has negative values, it ignores the PUT. At least this is the way it looks to me. More investigation is needed.
Calling RANDOM_SEED with no arguments does use a time-based seed and PUT does work otherwise. I have attached a program which demonstrates this.
Calling RANDOM_SEED with no arguments does use a time-based seed and PUT does work otherwise. I have attached a program which demonstrates this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Ok, mystery solved. I was clued in by the following text in the Fortran standard:
The value returned by GET
need not be the same as the value specied by PUT in an immediately preceding reference to RANDOM SEED.
For example, following execution of the statements
CALL RANDOM_SEED (PUT=SEED1)
CALL RANDOM_SEED (GET=SEED2)
SEED2 need not equal SEED1. When the values dier, the use of either value as the PUT argument in a
subsequent call to RANDOM SEED shall result in the same sequence of pseudorandom numbers being generated.
For example, after execution of the statements
CALL RANDOM_SEED (PUT=SEED1)
CALL RANDOM_SEED (GET=SEED2)
CALL RANDOM_NUMBER (X1)
CALL RANDOM_SEED (PUT=SEED2)
CALL RANDOM_NUMBER (X2)
X2 equals X1.
In the test programs here, the static initial seed was retrieved, multiplied by 2, and then PUT, but when GET was done, the "old value" was retrieved. I found that BOTH values in fact delivered the same RANDOM_NUMBER harvest. So RANDOM_SEED is doing some "filtering" of the incoming seed value to make sure it is within proper limits for the algorithm. No bug.
The value returned by GET
need not be the same as the value specied by PUT in an immediately preceding reference to RANDOM SEED.
For example, following execution of the statements
CALL RANDOM_SEED (PUT=SEED1)
CALL RANDOM_SEED (GET=SEED2)
SEED2 need not equal SEED1. When the values dier, the use of either value as the PUT argument in a
subsequent call to RANDOM SEED shall result in the same sequence of pseudorandom numbers being generated.
For example, after execution of the statements
CALL RANDOM_SEED (PUT=SEED1)
CALL RANDOM_SEED (GET=SEED2)
CALL RANDOM_NUMBER (X1)
CALL RANDOM_SEED (PUT=SEED2)
CALL RANDOM_NUMBER (X2)
X2 equals X1.
In the test programs here, the static initial seed was retrieved, multiplied by 2, and then PUT, but when GET was done, the "old value" was retrieved. I found that BOTH values in fact delivered the same RANDOM_NUMBER harvest. So RANDOM_SEED is doing some "filtering" of the incoming seed value to make sure it is within proper limits for the algorithm. No bug.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks to Jim Dempsey and TimP for explaining the effect of fp-model and -O2 compiler options on the swtich to/from SSE instruction set. It looks like the only thing that makes a difference in this case is the fact that the FPU uses an 80-bit register for intermediate results. What is still puzlling is why, as it turns out, the difference appears only in the imaginary part?
The discussion on the random number generation is very interesting, but irrelevant for this problem (perhaps move to a new thread?). One can see the result of the non-commutative operations by using, for instance, the following (very deterministic) constants:
a = dcmplx(1.102323407845943d-10, -2.132323407845943d-10)
b = dcmplx(-2.392085985049895d-3, 3.3989085985049895d-3)
c = dcmplx(6.293823948579384d-10, 8.291823948579384d-10)
d = dcmplx(8.129834587968582d-3, 1.1298342879685822d-3)
The code output for a*b + c*d is:
(4.641008061166196E-012,8.336953268281271E-012)
while the output for b*a + d*c is
(4.641008061166196E-012,8.336953268281273E-012)
The reason I used randomly generated arrays was to save myself the trouble of finding specific quadrupules of complex numbers with this non-commutative property (it's not that easy to do manually). The sum of all elements is simply used as a scalar diagnostic, although it most likely contributes further to the discrepancy because of differences in the order of addition between SSE and FPU. In a modified version of the test code, I've done pairwise comparison of the elements of the two z arrays and it turns out that about 2/3 are different.
--George
The discussion on the random number generation is very interesting, but irrelevant for this problem (perhaps move to a new thread?). One can see the result of the non-commutative operations by using, for instance, the following (very deterministic) constants:
a = dcmplx(1.102323407845943d-10, -2.132323407845943d-10)
b = dcmplx(-2.392085985049895d-3, 3.3989085985049895d-3)
c = dcmplx(6.293823948579384d-10, 8.291823948579384d-10)
d = dcmplx(8.129834587968582d-3, 1.1298342879685822d-3)
The code output for a*b + c*d is:
(4.641008061166196E-012,8.336953268281271E-012)
while the output for b*a + d*c is
(4.641008061166196E-012,8.336953268281273E-012)
The reason I used randomly generated arrays was to save myself the trouble of finding specific quadrupules of complex numbers with this non-commutative property (it's not that easy to do manually). The sum of all elements is simply used as a scalar diagnostic, although it most likely contributes further to the discrepancy because of differences in the order of addition between SSE and FPU. In a modified version of the test code, I've done pairwise comparison of the elements of the two z arrays and it turns out that about 2/3 are different.
--George
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
> What is still puzlling is why, as it turns out, the difference appears only in the imaginary part.
It is easy to see that the perceived difference is an artifact of binary->decimal conversion, at least for the scalar example that you showed in #18.
If you are going to compare 64-bit reals to the last bit, please print using hexadecimal notation.
You may wish to explore whether the same explanation applies to your original test with big arrays of random elements.
Note, added subsequently:
The above program was in error, and I offer my apologies. Here is a corrected version.
It is easy to see that the perceived difference is an artifact of binary->decimal conversion, at least for the scalar example that you showed in #18.
If you are going to compare 64-bit reals to the last bit, please print using hexadecimal notation.
[fxfortran] program cmgives the result
double complex a,b,c,d,z1,z2
integer iz1(2 4),iz2(2 4)
equivalence (iz1,z1),(iz2,z2)
a = dcmplx(1.102323407845943d-10, -2.132323407845943d-10)
b = dcmplx(-2.392085985049895d-3, 3.3989085985049895d-3)
c = dcmplx(6.293823948579384d-10, 8.291823948579384d-10)
d = dcmplx(8.129834587968582d-3, 1.1298342879685822d-3)
z1=a*b+c*d
z2=b*a+d*c
write(*,*)z1
write(*,*)z2
write(*,20)iz1(2),iz1(1),iz1(4),iz1(3) (underlined part added later)
write(*,20)iz2(2),iz2(1),iz2(4),iz2(3)
20 format(1x,2Z8,2x,2Z8)
end
[/fxfortran]
[bash] (4.641008061166196E-012,8.336953268281273E-012)GFortran seems to do the formatting slightly better:
(4.641008061166196E-012,8.336953268281271E-012)
3D94694F7FCB1DC7 (this is the real part; see below)
3D94694F7FCB1DC7
[/bash]
[bash] ( 4.64100806116619663E-012, 8.33695326828127092E-012)
( 4.64100806116619663E-012, 8.33695326828127092E-012)
3D94694F7FCB1DC8
3D94694F7FCB1DC8
[/bash]
You may wish to explore whether the same explanation applies to your original test with big arrays of random elements.
Note, added subsequently:
The above program was in error, and I offer my apologies. Here is a corrected version.
[fxfortran] program cmThe results from the corrected program, compiled using the 12.0.2 compiler, with /Od, show that all bits of z1 and z2 agree. With /O2, there is a difference in the least significant bit of the imaginary part.
double complex a,b,c,d,z1,z2
integer iz1(4),iz2(4)
equivalence (iz1,z1),(iz2,z2)
a = dcmplx(1.102323407845943d-10, -2.132323407845943d-10)
b = dcmplx(-2.392085985049895d-3, 3.3989085985049895d-3)
c = dcmplx(6.293823948579384d-10, 8.291823948579384d-10)
d = dcmplx(8.129834587968582d-3, 1.1298342879685822d-3)
z1=a*b+c*d
z2=b*a+d*c
write(*,*)z1
write(*,*)z2
write(*,20)iz1(2),iz1(1),iz1(4),iz1(3)
write(*,20)iz2(2),iz2(1),iz2(4),iz2(3)
20 format(1x,2Z8,2x,2Z8)
end[/fxfortran]
[bash] (4.641008061166196E-012,8.336953268281273E-012) (4.641008061166196E-012,8.336953268281271E-012) 3D94694F7FCB1DC7 3DA255499696C399 3D94694F7FCB1DC7 3DA255499696C398Since a*b is about a tenth of c*d, it is not surprising that adding them causes a loss of 1 bit of precision or that the compiler optimization level has a small but noticeable effect.
[/bash]
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The strictest criteria offered by IEEE754 for WRITE conversions apply only down to 1e-10, but it's a little surprising to see this non-repeatability come in at this point, appearing to indicate an off-by-1-ULP in the first case produced by ifort.
![](/skins/images/3344F5B3B76C91485ED0E980FD0CA95E/responsive_peak/images/icon_anonymous_message.png)
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page