IFX with /O2 processes some bit operations incorrectly

mecej4 · ‎01-15-2024

I ran the current version of IFX (Version 2024.0.2 Build 20231213) on Marsaglia's 2009 SuperKiss32 RNG test program. With other compilers and with ifx /Od or ifort /O2 or ifort /Od, the output is

S:\ALGO\packmol\src>superkiss32
 Does x = 1809478889 ?
      x = 1809478889

With ifx /O2, the test fails, with the output

S:\ALGO\packmol\src>superkiss32
 Does x = 1809478889 ?
      x = 1410402086

The test program generates a billion random integers before printing the expected and computed values of the billionth one, so the error is not easy to investigate with Marsaglia's original program. Here is a short reproducer that I extracted from it.

program ifxbits
   implicit none
   integer i, q(2), carry, h, z

   q = [1713231091, 1290478956]
   carry =       362
   do i = 1,2
      h = iand(carry, 1)
      z = ishft(ishft(q(i),9), -1) + &
          ishft(ishft(q(i),7), -1) + &
          ishft(carry,         -1)         ! Integer overflow possible, to be overlooked
      print *,' Before updating carry, q(',i,'), z = ',q(i),z
      carry = ishft(q(i), -23) + ishft(q(i), -25) + ishft(z, -31)
      q(i)  = not(ishft(z, 1) + h)
      print '(4x,i2,4i12)',i,q(i),h,z,carry
   end do
end program

The expected output is

S:\ALGO\packmol\src>ifxbits
  Before updating carry, q(           1 ), z =   1713231091   625619061
     1 -1251238123           0   625619061         255
  Before updating carry, q(           2 ), z =   1290478956 -1511078017
     2 -1272811264           1 -1511078017         192

and IFX with /Od gives the same output. With /O2, however, IFX gives

  Before updating carry, q(           1 ), z =   1713231091   625619061
     1 -1251238123           0   625619061         255
  Before updating carry, q(           2 ), z =   1290478956 -1511078017
     2 -1272811264           1 -1511078017         191

Note "191" is output instead of "192". From then on, as you may expect in an RNG, the errors cascade.
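The difference of exactly 1 can be traced to a single term. As a small check (a sketch, not part of the original report; the program name carry_terms is made up here, and default integers are assumed to be 32-bit), the three terms that make up carry in the second iteration can be printed separately, using the q(2) and z values from the output above:

```fortran
program carry_terms
   implicit none
   integer :: q, z

   q = 1290478956        ! q(2) on entry to the second iteration
   z = -1511078017       ! wrapped sum computed in the second iteration
   ! The three terms of: carry = ishft(q,-23) + ishft(q,-25) + ishft(z,-31)
   print '(3i6)', ishft(q, -23), ishft(q, -25), ishft(z, -31)
end program carry_terms
```

This prints 153, 38 and 1, which sum to 192. The value 191 is what results if the last term is 0, that is, if z is treated as non-negative. One plausible reading, and it is only an assumption here, is that the optimizer reasons that a sum of three non-negative terms cannot be negative unless it overflows, and then assumes that overflow does not happen.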

For the convenience of anyone who wishes to run Marsaglia's program, I attach the source file.

JohnNichols · ‎01-15-2024

I finally found a good Gaussian random number generator.

Take a supersensitive accelerometer, say an ST.com MKI model, which can be read over a serial port from Fortran. Place a small timber board in a quiet location, record the thermal vibration (it ranges up to 3 milli-g, measured with an accuracy of 60 micro-g), and then use the output to generate a set of random numbers. The only minor issue is that thermal noise is not truly Gaussian: it is close, but after several tens of millions of measurements it is not perfectly Gaussian. Cost is about 200 USD.

One simple way is to determine the Z score of the output. Very useful data.

mecej4 · ‎01-15-2024

It appears upon further examination that with /O2 IFX is able to precompute the results at compile time, so we do not see any shrl instructions in the OBJ file. This implies, unfortunately, that fixing the optimization bug in the trivial reproducer is probably not going to indicate the fixes needed for the related bugs that IFX puts into the Marsaglia test EXE.
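One way to keep a small reproducer from being evaluated at compile time is to hide the input values from the optimizer. A minimal sketch, assuming the standard VOLATILE attribute is enough to stop the constant folding (the program name ifxbits_volatile is made up here, and this thread does not establish whether the discrepancy still shows up in this form):

```fortran
program ifxbits_volatile
   implicit none
   integer :: i, h, z
   ! volatile is an addition made for this sketch, not part of the original
   ! reproducer; it forces q and carry to be loaded from memory at run time
   integer, volatile :: q(2), carry

   q = [1713231091, 1290478956]
   carry = 362
   do i = 1, 2
      h = iand(carry, 1)
      z = ishft(ishft(q(i),9), -1) + &
          ishft(ishft(q(i),7), -1) + &
          ishft(carry,         -1)         ! may still overflow, as in the original
      carry = ishft(q(i), -23) + ishft(q(i), -25) + ishft(z, -31)
      q(i)  = not(ishft(z, 1) + h)
      print '(4x,i2,4i12)', i, q(i), h, z, carry
   end do
end program ifxbits_volatile
```

Reading q and carry from standard input would serve the same purpose, at the cost of needing an input file.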

hakostra1 · ‎01-16-2024

I ran the example with the NAG compiler, and it seems that it printed the expected "192" for every optimization level O0 to O4.

I ran the example with the GFortran compiler, and it seems that it printed "192" for optimization level O0, and "191" for levels O1 to O3.

JohnNichols · ‎01-16-2024

I do not think this is going to be an easy fix for IFX. There is something fundamentally wrong in their algorithm, and it is too basic to the Fortran methods to be left unfixed.

hakostra1 · ‎01-16-2024

Also, 'ifort' seems to print "192" for all optimization levels.

JohnNichols · ‎01-16-2024

I broke the equation down to fundamental operators, one at a time, and it still gives an error on dp(1), although it fixed dp(3).

We all know the true answer to all questions is 42.

Barbara_P_Intel · ‎01-16-2024

Thanks, @mecej4, for reporting this. I filed a bug report on the reproducer, CMPLRLLVM-55293.

I understand that it may not be a "true" reproducer, but it still prints the wrong answers. It prints the wrong answer on Linux, too.

Now to untangle superkiss32.f90 a bit.

Barbara_P_Intel · ‎01-16-2024

I filed a bug report, CMPLRLLVM-55306, against superkiss32.f90, too. Let's leave it to the compiler internals experts to decide if the two issues are related or not.

Barbara_P_Intel · ‎04-09-2024

Look for ifx 2024.2.0 for the fix for ifxbits.f90. But you'll need to use a new compiler option to get the expected answers when compiling with -O1 and above.

$ ifx -what -O1 -fno-strict-overflow ifxbits.f90
 Intel(R) Fortran 24.0-1662
$ a.out
  Before updating carry, q(           1 ), z =   1713231091   625619061
     1 -1251238123           0   625619061         255
  Before updating carry, q(           2 ), z =   1290478956 -1511078017
     2 -1272811264           1 -1511078017         192

Barbara_P_Intel · ‎04-09-2024

And for the superkiss32.f90 reproducer use that same new compiler option, -fno-strict-overflow.

This will be available in ifx 2024.2, which is due in mid-2024.

$ ifx -what -O1 -fno-strict-overflow superkiss32.f90
 Intel(R) Fortran 24.0-1662
$ a.out
 Does x = 1809478889 ?
      x = 1809478889

Barbara_P_Intel · ‎04-10-2024

There's a description of -fstrict-overflow in the ifort to ifx Porting Guide that explains how the behavior of ifx differs from ifort regarding integer overflow.
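For code that has to give the same answer without depending on a compiler option, one alternative is to keep the sum from overflowing in the first place. A minimal sketch, assuming a 64-bit integer kind is available through iso_fortran_env (the program name ifxbits_wide and the variable wide are made up for illustration; this is not the fix Intel shipped):

```fortran
program ifxbits_wide
   use, intrinsic :: iso_fortran_env, only: int32, int64
   implicit none
   integer(int32) :: i, q(2), carry, h, z
   integer(int64) :: wide                  ! hypothetical 64-bit scratch variable

   q = [1713231091, 1290478956]
   carry = 362
   do i = 1, 2
      h = iand(carry, 1)
      ! Each term lies in 0 .. 2**31-1, so the 64-bit sum cannot overflow
      wide = int(ishft(ishft(q(i), 9), -1), int64) + &
             int(ishft(ishft(q(i), 7), -1), int64) + &
             int(ishft(carry,        -1), int64)
      ! Reduce explicitly to the 32-bit two's-complement value
      if (wide >= 2_int64**31) wide = wide - 2_int64**32
      z = int(wide, int32)
      carry = ishft(q(i), -23) + ishft(q(i), -25) + ishft(z, -31)
      q(i)  = not(ishft(z, 1) + h)
      print '(4x,i2,4i12)', i, q(i), h, z, carry
   end do
end program ifxbits_wide
```

Worked by hand, the two result lines match the expected output of the original reproducer, with carry equal to 255 and then 192. The explicit wraparound makes the intent visible in the source, at the cost of a wider intermediate.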