Solved: Re: Issue with IFX Conditional Evaluation Under Specific Conditions

ShinyaHattori · ‎06-10-2024

I have encountered an issue within a subroutine where, after performing operations that exceed the maximum value of an integer variable received as an argument, the subsequent if-statement to determine the sign of that variable returns incorrect results.

This phenomenon occurs in IFX’s Release mode (with Optimization set to /O1, /O2, /O3) but does not occur in Debug mode (with Optimization set to /Od).

Here are the details of my environment:

Intel® Fortran Compiler 2024.1.0 [Intel® 64]

Intel® Fortran Compiler for applications running on Intel® 64, version 2024.1.0 Package ID: w_oneAPI_2024.1.0.964

OS: Windows 11

Command Line (All Options): /nologo /O2 /module:"x64\Release\" /object:"x64\Release\" /libs:dll /threads /c

Below is the code for verification purposes. The Check3 subroutine produces a different result from the others.

      program NumberSignChecker 
      implicit none
      integer :: ix
      
      ix = 228546496
      ix = IX*48828125
      write(*,*) "ix",ix
      if(ix<0)then
          write(*,*) ix ,"is a negative number.(main)"
      else
          write(*,*) ix , "is a positive number or 0.(main)"
      endif
      
      call Check1
      call Check2
      ix = 228546496
      call Check3(ix)

      end program NumberSignChecker
!--------------------------------------------------
      SUBROUTINE  Check1
      implicit none
      integer :: ix
      
      ix = 228546496
      ix = ix*48828125
      write(*,*) "ix",ix
      if(ix<0)then
         write(*,*) ix ,"is a negative number.(Check 1)"
      else
          write(*,*) ix , "is a positive number or 0.(Check 1)"
      endif
      RETURN
      END
!--------------------------------------------------
      SUBROUTINE  Check2
      implicit none
      integer :: ix
      ix = -686079808
      write(*,*) "ix",ix
      if(ix<0)then
          write(*,*) ix ,"is a negative number.(Check 2)"
      else
          write(*,*) ix , "is a positive number or 0.(Check 2)"
      endif
      RETURN
      END
!--------------------------------------------------
      SUBROUTINE  Check3(ix)
      implicit none
      integer,intent(inout) :: ix
      
      ix = ix*48828125
      write(*,*) "ix",ix
      if(ix<0)then
         write(*,*) ix ,"is a negative number.(Check 3)"
      else
          write(*,*) ix , "is a positive number or 0.(Check 3)"
      endif
      RETURN
      END

Result:

ix -686079808
-686079808 is a negative number.(main)
ix -686079808
-686079808 is a negative number.(Check 1)
ix -686079808
-686079808 is a negative number.(Check 2)
ix -686079808
-686079808 is a positive number or 0.(Check 3)

Ron_Green · ‎06-11-2024

@andrew_4619 is correct: the Standard does not allow integers to overflow. Your code is non-conformant and hence the behavior is undefined. That ifx behavior is different than ifort's is not a bug. That said, you are the 2nd customer with old code that makes a bad assumption about what a compiler "should do" with bad code. That aside, we understand you want to keep using with ifx the sort of code that worked with ifort, and we have a compiler option planned for you in the next Update release.

For the upcoming 2024.2.0 update we added a new option to accommodate these sorts of codes: -fno-strict-overflow. Again, you can't use it in 2024.1.x or older, only in the future 2024.2.0 and beyond. It's in our Porting Guide already, and here is the description:

-fstrict-overflow (Linux) /Qstrict-overflow (Windows)

Integer overflow in arithmetic expressions is not permitted by the Fortran standard. However, some legacy programs rely on integer overflow. When overflow occurs, as much of the value as can fit into the result is assigned, and may also result in a change in the sign bit.

By default, the ifx compiler assumes integer arithmetic does not overflow. ifx defaults to -fstrict-overflow (Linux) or /Qstrict-overflow (Windows). When strict-overflow is enabled the compiler assumes that integer operations can never overflow, which allows for better optimizations. But, if overflow does occur, the resulting behavior is undefined and this behavior may not be compatible with the default ifort behavior.

ifort allowed integer overflow. Therefore, programs that rely on the ifort behavior for integer overflow should use -fno-strict-overflow (Linux) or /Qstrict-overflow- (Windows) with ifx. The -fno-strict-overflow (Linux) or /Qstrict-overflow- (Windows) option for ifx tells the compiler to assume integers may overflow and are allowed to overflow. Allowing overflow in ifx may result in less optimized code.

And the Developer Guide and Reference doc will get an update in 2024.2.0 for this option:

Indicates the compiler can assume integer arithmetic does not overflow. This feature is only available with ifx.

Syntax

Linux:

-fstrict-overflow

-fno-strict-overflow

Windows:

/Qstrict-overflow

/Qstrict-overflow-

Default

strict-overflow

The compiler assumes integer arithmetic does not overflow.

Description

Integer overflow is a condition in which an arithmetic operation on two integer values results in a value that is too large to be represented in an integer of the size specified, e.g. two 32-bit integers added together result in a value that does not fit into 32 bits. When overflow occurs, as much of the value as can fit into the result is assigned, usually resulting in a negative number. Performing such arithmetic is not permitted by the Fortran standard, though some programs rely on this behavior. When strict-overflow is enabled the compiler assumes that integer operations can never overflow, which allows for better optimizations, but if overflow does occur, the resulting behavior is undefined. To make programs that rely on integer overflow work, the -fno-strict-overflow (/Qstrict-overflow-) option must be specified, which permits overflow but may result in less optimized code.

IDE Equivalent

None

Parent topic: Optimization Options

andrew_4619 · ‎06-11-2024

Integer overflow is not detected. If there is a critical risk of overflow in production code, it is down to the programmer to code robustly against it.

There was a good discussion in this thread: https://community.intel.com/t5/Intel-Fortran-Compiler/How-best-to-handle-integer-overflow-situations/td-p/1088872

jimdempseyatthecove · ‎06-11-2024

@andrew_4619

(expected code) The product of the same two large numbers is expected to overflow and return a negative number.

All three tests (main, Check1, Check3), (are supposed to) use the same two integer numbers and produce the same product.

Check three correctly shows/writes the negative result in ix...

However, in check 3 the IF(IX<0) ... fails to use the new value of IX (as the branch for >=0 is taken).

This is a bug in the compiler.

@ShinyaHattori

In the main code, after the call to Check3, add

write(*,*) ix,"Return value from Check3"

The optimizer may see that you do not use the returned value of ix and therefore elide the store (remove it as unnecessary code), so that the original value of ix from before the call remains untouched. In the process, however, a bug was introduced.

If the additional write statement in main corrects the problem, then my above assumption is likely correct.

Jim Dempsey

andrew_4619 · ‎06-11-2024

@jimdempseyatthecove "The product of the same two large numbers is expected to overflow and return a negative number."

My understanding is "The Fortran standard does not specify the behaviour of program during integer overflow, so it depends on compiler implementation. Intel Fortran compiler does not have integer overflow detection. "

I am not claiming deep knowledge, but if the standards do not define something, one's expectations can be disappointed. I will now wait to be shot down in flames..... That's Life!

jimdempseyatthecove · ‎06-11-2024

It is not unreasonable (or unusual) to use integer math to produce hash codes and/or random numbers where the operations are intended to produce overflow (or rather, expected to produce wraparound), and where the resultant value, or some bit section of it, is used as the resulting key/rnd.

! produce next seed (with possible overflow/wrap around)
Seed = ior(Seed,1) * BigPrime ! Assure seed is odd and non-zero
rnd = IBITS(Seed, 8, 16) ! next 16-bit rnd

Code like that is not unusual.

Jim Dempsey
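A minimal sketch of how a wraparound step like the one above could be written without relying on 32-bit overflow, assuming a 64-bit integer kind is available. The product is formed in 64-bit arithmetic, where it cannot overflow, and MODULO then keeps the low 32 bits. The module and function names are hypothetical, and the multiplier is left to the caller because the thread does not give a value for BigPrime.

```fortran
module wrap_lcg
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  private
  public :: wrap_mul32, next_seed

contains

  ! Multiply two 32-bit integers and keep only the low 32 bits,
  ! read as a two's-complement value. The product is formed in
  ! 64-bit arithmetic, where |a*b| <= 2**62 and cannot overflow.
  pure function wrap_mul32(a, b) result(r)
    integer(int32), intent(in) :: a, b
    integer(int32) :: r
    integer(int64), parameter :: two32 = 2_int64**32
    integer(int64) :: low

    ! MODULO with a positive divisor returns a value in 0 .. 2**32-1.
    low = modulo(int(a, int64) * int(b, int64), two32)
    ! Map the upper half of that range onto the negative 32-bit values.
    if (low >= two32 / 2) low = low - two32
    r = int(low, int32)
  end function wrap_mul32

  ! One generator step in the style of the snippet above: force the
  ! seed odd, then multiply with wraparound. The multiplier is a
  ! caller-supplied (hypothetical) parameter.
  pure function next_seed(seed, multiplier) result(new_seed)
    integer(int32), intent(in) :: seed, multiplier
    integer(int32) :: new_seed
    new_seed = wrap_mul32(ior(seed, 1_int32), multiplier)
  end function next_seed

end module wrap_lcg
```

With the operands from the original report, wrap_mul32(228546496, 48828125) gives -686079808, the value every WRITE in the thread printed, and it does so without depending on an overflow option.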

ShinyaHattori · ‎06-11-2024

@Ron_Green

Thank you for your detailed explanation.

I learned that the Standard does not allow integers to overflow. In the legacy code, integer overflow was used in the random number generation. I will replace it with another function. I will also make sure to go through the porting guide. I am looking forward to the updates after version 2024.2.0.

jimdempseyatthecove · ‎06-12-2024

FWIW, what was the result after you inserted the

write(*,*) ix,"Return value from Check3"

@Ron_Green

If the use of ix after the return from the call to Check3 causes Check3 to take the "correct" branch (Fortran standards about integer overflow notwithstanding), then this is a clear indication that the code optimization is in error and should be looked at.

Note, the integer multiply as used by this program will (should) use the IMUL instruction and place the expected (overflowed) value in EAX (or the destination register). IMUL computes the product in an internal (CPU) temporary register; the upper half is non-zero in this case, so the sign-extended 32-bit value in the destination register does not match the contents of the 64-bit temporary register, and this sets the CF and OF flags (Carry and Overflow). However, unless the programmer supplied /Qstrict-overflow, the compiler optimization should have ignored the overflow and the CF and OF flags, used the result from the destination register, and placed that into ix (or its temporary registered copy of ix).

That said, had the user included /Qstrict-overflow, he would expect all integer multiplies (and add/sub) to insert code to test the CF/OF flags and take appropriate action (report an error, or set the result to +Max, -Max or -0), whatever the compiler implementation decides, with the added computational overhead.

Saying that the standard has this as undefined behavior, in this case, is not cause to ignore investigating this report.

Jim Dempsey

ShinyaHattori · ‎06-12-2024

@jimdempseyatthecove

I inserted the following code and obtained the results below.

call Check3(ix)
write(*,*) "ix",ix
if(ix<0)then
   write(*,*) ix ,"is a negative number.(main2)"
else
    write(*,*) ix , "is a positive number or 0.(main2)"
endif

ix -686079808
-686079808 is a negative number.(main)
ix -686079808
-686079808 is a negative number.(Check 1)
ix -686079808
-686079808 is a negative number.(Check 2)
ix -686079808
-686079808 is a positive number or 0.(Check 3)
ix -686079808
-686079808 is a negative number.(main2)

jimdempseyatthecove · ‎06-13-2024

This is a strong indication of a bug in the code generation.

The correct (expected) value is returned to the caller, while the code in Check3 tests a memory location or register that it thinks holds the current value of the dummy ix but that is in fact the wrong one.

@Ron_Green This is a short reproducer and should be looked into by your developers.

If in your opinion, this is expected behavior, please explain?

Jim

Ron_Green · ‎06-14-2024

Jim,

I believe this is in the optimization passes. Specifically an LLVM optimization pass, not some opt or code gen from Intel. Now, the pass may cause the backend to codegen bad code, but using the LLVM opt-bisect-limit to find the phase can show you the pass causing the error. I have not tried this with the Windows ifx yet, but on Linux I see the error on this pass:

ifx -O2 intoverboard.f90 -mllvm -opt-bisect-limit=591 ; ./a.out
ifx -O2 intoverboard.f90 -mllvm -opt-bisect-limit=592 ; ./a.out

Opt 592 is where ix is called 'positive'. Before this, it's negative. Hopefully the Windows passes are numbered the same with the 2024.1.0 compiler. Here is the pass I find at fault:

BISECT: running pass (592) InstCombinePass on check3_

InstCombinePass comes from LLVM. Nothing Intel. The pass number may differ on Windows, or any older versions of the compiler. Look for the number for InstCombinePass in the opt-bisect-limit output.

My team creates our ifx front-end. Anything in there we can get fixed. This is not such a case. The test is not conformant and has an LLVM-blessed compiler option to allow the old behavior with the next update. Frankly, I would like to help here but I think my time is better spent on those things I have control over and can get fixed.
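A sketch of the kind of simplification an optimizer is permitted to make once it assumes that ix*48828125 cannot overflow. This is an illustration only: the thread does not confirm which rewrite InstCombinePass actually applied to check3_, and the module and function names below are hypothetical. The point is that, with a positive constant and no overflow, the product has the same sign as ix, so the sign test can legally be done on the incoming ix instead of on the product.

```fortran
module sign_fold_sketch
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  private
  public :: negative_as_written, negative_as_folded

  ! The positive constant multiplier used in Check3.
  integer(int32), parameter :: c = 48828125_int32

contains

  ! What the source does on wrapping hardware: form ix*c, keep the
  ! low 32 bits, test the sign bit. Computed in 64-bit arithmetic so
  ! that this sketch itself stays standard-conforming.
  pure logical function negative_as_written(ix) result(neg)
    integer(int32), intent(in) :: ix
    integer(int64) :: low
    low = modulo(int(ix, int64) * int(c, int64), 2_int64**32)
    neg = low >= 2_int64**31      ! sign bit of the 32-bit result
  end function negative_as_written

  ! If ix*c is assumed never to overflow and c > 0, the product is
  ! negative exactly when ix is, so the test reduces to ix < 0.
  pure logical function negative_as_folded(ix) result(neg)
    integer(int32), intent(in) :: ix
    neg = ix < 0
  end function negative_as_folded

end module sign_fold_sketch
```

For ix = 228546496 the first function returns .true. and the second .false., which matches the mismatch reported for Check3. For any ix whose product fits in 32 bits, the two functions agree.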

andrew_4619 · ‎06-14-2024

If my code does not give answers I like and relies on undefined behaviour of non-conforming code then the answer is to fix my code. There are still plenty of pukka compiler bugs to fix!

jimdempseyatthecove · ‎06-15-2024

@andrew_4619

While I agree with you regarding fixing non-conforming code....

From my experience, for 70 years or so, FORTRAN implementations did not support detecting/handling integer overflow. Now it apparently does as an option.

As to if this change in behavior is an oversight or intended behavior in the InstCombinePass from LLVM when integer overflow detection is .NOT. enabled, I cannot say. I do believe that this should be brought to the attention of the parties responsible.

I imagine that some codes expect that 70 years of behavior is continued, else their code will unexpectedly or unwittingly fail.

Ships running aground, spacecraft burning up, etc... that kind of thing.

Jim Dempsey

andrew_4619 · ‎06-17-2024

"I imagine that some codes expect that 70 years of behavior is continued". Imagine having a safety critical application and rebuilding with a new compiler on new hardware and not rigorously testing it! Also consider given you can test for integer overflows in the testing environment..... If you then found your old code was non-compliant but gave the correct answer do you fix it?

jimdempseyatthecove · ‎06-17-2024

If you can choose to test for integer overflow...

You can also choose to not test for integer overflow and accept the residual data. This is not bad code design when this is what you want.

If you are/must be aware of possible overflow in your design, then you explicitly test for it (either compiler option or defensive code).

I know of no CPU integer multiply or add or subtract that tests for overflow/underflow before the operation is made.

Mishmash goes in, mash comes out, and flags are set to indicate overflow/underflow/carry/zero/minus...

Flags are not tested in the code when overflow detection is not enabled.

Code that instructs the CPU to take mishmash in and produce mash out, and then does not return the mash out, is problematic.

Jim Dempsey
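A minimal sketch of the "defensive code" route mentioned above, for the case where overflow must be detected rather than accepted. It assumes a 64-bit integer kind is available and widens before multiplying, so the check itself never overflows. The module and procedure names are hypothetical.

```fortran
module checked_arith
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  private
  public :: checked_mul32

contains

  ! Multiply a*b. When the exact result does not fit in a 32-bit
  ! integer, set overflow to .true. and return prod = 0.
  pure subroutine checked_mul32(a, b, prod, overflow)
    integer(int32), intent(in)  :: a, b
    integer(int32), intent(out) :: prod
    logical,        intent(out) :: overflow
    integer(int64) :: exact

    ! Exact product: |a*b| <= 2**62, so 64 bits always suffice.
    exact = int(a, int64) * int(b, int64)

    ! Compare against the representable 32-bit range.
    overflow = exact > int(huge(0_int32), int64) .or. &
               exact < -int(huge(0_int32), int64) - 1_int64

    if (overflow) then
      prod = 0_int32
    else
      prod = int(exact, int32)
    end if
  end subroutine checked_mul32

end module checked_arith
```

Calling checked_mul32 with 228546496 and 48828125 sets overflow to .true., since the exact product 11159496875000000 is far above huge(0_int32). What to do at that point (stop, saturate, or wrap deliberately) is a design decision for the caller.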

andrew_4619 · ‎06-17-2024

It clearly is bad design IMO because it is specifically at odds with the Fortran standards and produces unreliable results as a consequence! I have understanding and some sympathy for your views on this but I don't think we will agree.

Steve_Lionel · ‎06-17-2024

I had the faint hope that the switch to LLVM would allow for the return of integer overflow detection, which was lost when Compaq/DEC Fortran became Intel Fortran. (The DEC GEM code generator offered integer overflow detection, Intel's IL0 did not.) Sadly, it seems that it's still lost.

jimdempseyatthecove · ‎06-17-2024

Do you have a comment as to whether the OP's situation is a compiler bug? (IMHO "compiler" is the frontend + backend/LLVM.)

Jim Dempsey

Steve_Lionel · ‎06-18-2024

Ron's response covers it - the optimizer makes certain assumptions, in particular that your program doesn't violate rules of the language - dummy argument aliasing is another thing the optimizer makes assumptions about, and there's an option (-assume dummy_alias) telling it that the program might violate that rule. It is not a compiler bug.

jimdempseyatthecove · ‎06-18-2024

But this isn't a case of aliasing. The expected (overflowed) result is returned, the tested value in the IF statement (following the source statement performing the multiplication with overflowed result), is not testing the sign of this variable.

Had the user coded Check3 with:

      ix = ix*48828125
      if(ix<0)then

where the flags of the result of the multiplication are used, this would be processor dependent and compiler dependent.

However, for the Intel CPUs IMUL does not touch the SF (Sign Flag), and therefore the contents of the resultant product register (eax) should have been tested (TEST instruction) as opposed to using the SF. If this be the case (no intervening write), the code is wrong.

In the OP's case, he had an intervening write statement, which exacerbates the condition.

      ix = ix*48828125
      write(*,*) "ix",ix
      if(ix<0)then

In this case, prior to the call to write, the result register should have been flushed to the dummy argument (and it was).

Then, after the write, the dummy should have been read/tested.

*** However, the program behaved as if the register holding the product (now corrupted by the call to write) was used in the test.

This is clearly wrong.

Meaning, even if the user had a non-overflowing result (ix=ix*1), the compiler would have incorrectly used the (now corrupted) registered value of the product in the case of the intervening write.

Jim Dempsey

Steve_Lionel · ‎06-18-2024

I didn't say it was a case of aliasing - I was giving another example of where the compiler makes assumptions that may not be valid in nonconforming programs. I have not studied this particular case.