Intel® Fortran Compiler

ifx 2023.2.0: integer lowering leads to strange bugs

foxtran
New Contributor I

Merry Christmas everyone!

In some codebase, there are a lot of integer-kind conversions between integer(2), integer(4) and integer(8). So, code like:

 

integer(8) :: ARR(3,N)
integer(2) :: ii, jj, kk
do i = 1,N
  ii = ARR(1,i)
  jj = ARR(2,i)
  kk = ARR(3,i)
end do

 

can be found. Here, integer(8) is implicitly narrowed to integer(2).
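A minimal sketch of how such a narrowing could be guarded, assuming a range check is wanted before the assignment. The helper `narrow_to_int16` and the sample values are hypothetical and do not come from the codebase discussed here.

```fortran
program checked_narrowing
  use, intrinsic :: iso_fortran_env, only: int16, int64
  implicit none
  integer(int64) :: wide(3)
  integer(int16) :: narrow
  logical :: ok
  integer :: k

  wide = [100_int64, 32767_int64, 40000_int64]
  do k = 1, size(wide)
    call narrow_to_int16(wide(k), narrow, ok)
    if (ok) then
      print '(a,i0,a,i0)', 'value ', wide(k), ' fits, stored as ', narrow
    else
      print '(a,i0,a)', 'value ', wide(k), ' does not fit in integer(2)'
    end if
  end do

contains

  ! Hypothetical helper: copy src into dst only if it is representable.
  subroutine narrow_to_int16(src, dst, fits)
    integer(int64), intent(in)  :: src
    integer(int16), intent(out) :: dst
    logical,        intent(out) :: fits
    fits = src >= -int(huge(dst), int64) - 1_int64 .and. src <= int(huge(dst), int64)
    if (fits) then
      dst = int(src, int16)
    else
      dst = 0_int16   ! placeholder value when the source is out of range
    end if
  end subroutine narrow_to_int16
end program checked_narrowing
```

The first two values fit in integer(2); 40000 is reported as out of range instead of being silently wrapped.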

Sometimes, there is widening instead (integer(4) -> integer(8); the code is compiled with the `-i8` flag and the ILP64 mode of MKL is used):

 

integer*4 err
integer n,erri
real*8 mat(n,n),scr(*)
call dsyev('V','L',n,mat,n,scr,scr(n+1),3*n**2,err)
erri=err

 

Here, an integer(4) variable is passed where integer(8) is expected.
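A small sketch that makes the size difference visible. The variable names are illustrative. The first printed value depends on how the program is compiled (for example, whether `-i8` is used), so no fixed output is claimed for it.

```fortran
program show_integer_sizes
  implicit none
  integer   :: n      ! default integer: its size follows the compiler's default-kind setting
  integer*4 :: err    ! explicit size: stays 32 bits
  integer*8 :: info   ! explicit size: stays 64 bits

  print '(a,i0)', 'default integer bits: ', storage_size(n)
  print '(a,i0)', 'integer*4 bits:       ', storage_size(err)
  print '(a,i0)', 'integer*8 bits:       ', storage_size(info)

  if (storage_size(n) /= storage_size(err)) then
    print '(a)', 'default integer and integer*4 differ in size'
  end if
end program show_integer_sizes
```

If the last message appears, a routine that expects default-size integers will not see an integer*4 argument the way the caller intended.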
 
With ifort and GCC, all these examples work perfectly.

Unfortunately, with ifx the second example, with some modifications, generates a runtime error at line 5 (erri=err) since it cannot properly read the address of err.

In the real application, I applied the following diff to fix this issue:

 

@@ -2041,7 +2041,7 @@ C
 ************************************************************************
       implicit none
       integer n,iout,i,j,erri,iflag,ii,info8
-      integer*4 err,info4
+      integer info4
       real*8 mat(n,n),scr(*),ss,tol,mr
       equivalence(info8,info4)
       integer imem,imem1,maxcor,memfree,memreq
@@ -2146,11 +2144,10 @@ c        memreq = idnint(mr)
 c       write(6,*) (mat(i,i),i=1,n)
 c       write(6,"(9f10.6)") mat
 C Diagonalize
-        call dsyev('V','L',n,mat,n,scr,scr(n+1),3*n**2,err)
+        call dsyev('V','L',n,mat,n,scr,scr(n+1),3*n**2,erri)
 c       write(6,*) 'eig'
 c       write(6,"(10000f20.12)") (scr(j),j=1,n)
-        erri=err
-         if(err.ne.0) then
+         if(erri.ne.0) then
           write(iout,*) 'Inverse square root: Fatal error at the ' //
      $'diagonalization of the matrix!'

 

Here, I just removed err (integer*4) and replaced it with erri. Then it started to work (again, the code is compiled with `-i8`).

So, could someone validate the lowering of integer types between the Intel Fortran front end and the LLVM middle end, to make it consistent with the old compiler?

Unfortunately, I could not produce a small reproducible example.

I used ifx 2023.2.0.

1 Solution
mecej4
Honored Contributor III

Here is what is going wrong.

The -i8 compiler option changes the default integer to 8-byte. This option does not change (promote or demote) integer variables declared with an explicit size (such as integer*2). Thus, when you call a LAPACK routine such as dsyev with an implicit interface, you may be passing some integer arguments as 8-byte integers, some as 4-byte integers, etc. The MKL ILP64 routines, however, expect all integer arguments to be 8-byte integers.

To solve the problem, you have these options for using MKL-ILP64:

  • all integer arguments being passed to MKL routines are declared as default integers, if you wish to use -i8
  • all integer arguments to MKL routines are declared explicitly as 8-byte integers, whether or not you use -i8

When your code has errors of this nature, the resulting behavior is compiler-dependent. It may, occasionally, work despite the errors and give you the expected results. You should not conclude that one compiler is right and the other is wrong. Any request to change a compiler to make it behave similarly to another compiler when given incorrect code is a request that will probably be ignored.
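A minimal sketch of the second option, assuming an explicit interface is available for the routine being called. `fake_ilp64_lib` and `fake_solver` are hypothetical stand-ins, not MKL names; the point is only that, with an explicit interface and explicitly sized integers, a mismatched kind is rejected at compile time instead of misbehaving at run time.

```fortran
module fake_ilp64_lib
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
contains
  ! Hypothetical stand-in for an ILP64 routine: every integer is 8 bytes.
  subroutine fake_solver(n, info)
    integer(int64), intent(in)  :: n
    integer(int64), intent(out) :: info
    info = merge(0_int64, -1_int64, n > 0_int64)
  end subroutine fake_solver
end module fake_ilp64_lib

program call_with_matching_kinds
  use, intrinsic :: iso_fortran_env, only: int64
  use fake_ilp64_lib, only: fake_solver
  implicit none
  integer(int64) :: n, info      ! explicit 8-byte integers, with or without -i8
  ! integer*4 :: err             ! passing this as info would be a compile-time error

  n = 3_int64
  call fake_solver(n, info)
  if (info /= 0_int64) error stop 'fake_solver failed'
  print '(a,i0)', 'info = ', info
end program call_with_matching_kinds
```

With an implicit interface, as in the dsyev call discussed above, the compiler has no declaration to check the arguments against, which is why the mismatch goes unnoticed.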

Here is a short program that illustrates the points that I made.

program buggy
   integer i
   integer*2 i2
   integer*4 i4
   integer*8 i8
   i = 32767
   i2 = i
   i4 = i2
   i8 = i4
   call sub(i,i2,i4,i8)
   print *,i,i2,i4,i8
end program

subroutine sub(i1, i2, i3, i4)
  i1 = 2*i1
  i2 = 2*i2
  i3 = 2*i3
  i4 = 2*i4
  return
end subroutine

Compare these results:

Compiler        i     i2     i4     i8
ifort       65534     -2  65534  65534
ifort -i8   65534     -2  65534  65534
ifx         65534  32767  65534  65534
ifx -i8     65534  32767  65534  65534
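One reading that is consistent with the ifort rows, offered as an assumption rather than a statement about either compiler's internals: sub uses implicit typing, so it treats every dummy argument as a default integer. Doubling 32767 gives 65534, and if only the low 16 bits end up back in the integer*2 variable, they read as -2. The sketch below reproduces just that arithmetic; the names `doubled` and `wrapped` are illustrative.

```fortran
program wrap_to_16_bits
  use, intrinsic :: iso_fortran_env, only: int16, int32
  implicit none
  integer(int32) :: doubled, wrapped
  integer(int16) :: i2

  i2 = 32767_int16
  doubled = 2_int32 * int(i2, int32)   ! 65534, which does not fit in 16 bits
  ! Keep only the low 16 bits and reinterpret them as a signed value.
  wrapped = modulo(doubled + 32768_int32, 65536_int32) - 32768_int32
  print '(a,i0)', 'doubled = ', doubled
  print '(a,i0)', 'low 16 bits as signed = ', wrapped
  if (wrapped /= -2_int32) error stop 'unexpected wrap result'
end program wrap_to_16_bits
```

The ifx rows show that nothing guarantees this outcome: the call itself is incorrect, so either result is permitted.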


8 Replies
JohnNichols
Valued Contributor III
program buggy
   implicit none
   integer*2 i
   integer*2 i2
   integer*4 i4
   integer*8 i8
   i = 32767
   i2 = i
   i4 = i2
   i8 = i4
   call sub(i,i2,i4,i8)
   print *,3,i,i2,i4,i8
end program

subroutine sub(i1, i2, i3, i4)
   implicit none

   integer*2 i1
   integer*2 i2
   integer*4 i3
   integer*8 i4

   print *,1,i1,i2,i3,i4
   i1 = 2*i1
   i2 = 2*i2
   print *,2,i1,i2,i3,i4
   i3 = 2*i3
   i4 = 2*i4
   return
end subroutine

A slightly different buggy. In the end, the reality is that you must always consider the set of actual integers that fit in i2, i4 and i8 when doing the math. The compiler does not care if you make a mistake. It does not make a mistake itself; it follows the rules built into it. As @mecej4 shows, do not assume that the rules are the same from Excel to R to any compiler. They are not. Do not assume that the people who coded the stuff actually care about all of the real math rules. Of course, if we were still in the age of 640k, one could consider using i2 instead of i8, but not now.

 


JohnNichols
Valued Contributor III

@mecej4 

 

Setting aside all the Fortran issues, is the -2 an artifact of the strange symmetry of the integer number line, where there are 1 and -1 etc., but only one "zero"? The binary numbers are not split evenly, so you are mashing a permanently odd-numbered set into an even space.

  

 

I could be wrong, but I know you will know. 

John

mecej4
Honored Contributor III

In binary integer arithmetic, the convention is to fuse +0 and -0 to '0'. In floating point arithmetic, however, IEEE-754 mandates a distinction between +0.0 and -0.0. See this Wikipedia article.

When you set y = 1.0/x, with x and y real, if you don't distinguish between x = +0.0 and x = -0.0, you are confronted with having to accept that +∞ and -∞ should be the same.
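A short sketch of that behaviour, assuming IEEE arithmetic with floating-point traps left disabled (the usual default). The variable names are illustrative, and `volatile` is used only so that the divisions are done at run time rather than folded by the compiler.

```fortran
program signed_zero
  use, intrinsic :: ieee_arithmetic, only: ieee_copy_sign, ieee_is_finite
  implicit none
  real, volatile :: pz, nz   ! volatile so the divisions happen at run time

  pz = 0.0
  nz = ieee_copy_sign(0.0, -1.0)   ! negative zero

  print *, 'pz == nz ?', pz == nz
  print *, '1/pz =', 1.0 / pz
  print *, '1/nz =', 1.0 / nz

  ! The two zeros compare equal, yet their reciprocals have opposite signs.
  if (.not. (pz == nz)) error stop 'zeros should compare equal'
  if (ieee_is_finite(1.0 / pz) .or. ieee_is_finite(1.0 / nz)) error stop 'expected infinities'
  if (.not. (1.0 / pz > 0.0 .and. 1.0 / nz < 0.0)) error stop 'expected opposite signs'
end program signed_zero
```

None of this applies to Fortran integers, which have a single zero.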

JohnNichols
Valued Contributor III

Using a short example 

Bits   Value   Row
0000      0      1
0001      1      2
0010      2      3
0011      3      4
0100      4      5
0101      5      6
0110      6      7
0111      7      8
1000      0      9
1001     -1     10
1010     -2     11
1011     -3     12
1100     -4     13
1101     -5     14
1110     -6     15
1111     -7     16

 

The ninth number row is your wonky zero.  

[Screenshot: figure taken from the Wikipedia IEEE 754 article]

Stolen from the Wikipedia IEEE 754 article; bit 31 there is the problem bit.

There is nothing wrong with buggy 1 or buggy 1.0001. As far as I can see, according to the IEEE rules the -2 should signal an overflow. Imagine if this were a rocketry program and someone made this mistake: we do not want -2. It tells us nothing other than that the number 9 exists in binary but not in reality. We want to tell the programmer "you have a mistake", and -2 does not signal a mistake.

I have not read the full standard, but the Wikipedia IEEE 754 article would seem to indicate that overflow is the correct and required answer.

Your thoughts appreciated to show me the error of my thoughts. 

 

 

JohnNichols
Valued Contributor III

Thinking about this little matter, the original buggy gives two warnings at compile time, as follows:

[Screenshot: the two compiler warnings]

If we consider the sets of values representable in I2, I4 and I8, none of them is closed under addition or multiplication. So, in particular, an implicit conversion from I8 to I4 should trigger an error. The chance that a result falls outside the set of numbers in I2, I4 or I8 that can be added or multiplied is, in the multiplied sense, only 1/n for each element of the set. It is a reasonable risk for I8, perhaps for I4, but not for I2.

Example results from buggy:

[Screenshot: output of buggy]

The compiler should be getting past the stage of assuming that the programmer even notices these things in a hectic day.

I realize we should use implicit none, but there is a lot of old code still in use and not all have the skills of this august group, myself excluded of course.  

 

foxtran
New Contributor I

Happy New Year everyone!

After some experiments, I noticed that old ifort and GCC allocate some additional space on the stack before the call, so this problem does not arise. At the same time, modern LLVM assumes that all routines are called properly and, therefore, it does not allocate extra stack memory. As a result, the improper last argument in the dsyev call leads to stack corruption when the code is compiled with ifx, but not with ifort.

@mecej4 and @JohnNichols, you are right that the code is wrong. 

@mecej4, note that in your example all variables are passed via registers, while my problem arises when arguments are passed via the stack.

@Barbara_P_Intel , is it possible to adjust stack allocation in ifx for such cases to avoid stack corruption?


Barbara_P_Intel
Employee

>> is it possible to adjust stack allocation in ifx for such cases to avoid stack corruption?

There are a couple of solutions to try. One is to put the arrays on the heap. The other is to increase the stack size. See this reference in the Fortran DGR (Developer Guide and Reference); both solutions are described there.
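A minimal source-level sketch of the first suggestion, assuming the workspace can be declared allocatable instead of being an automatic or fixed-size local array. The name `scr` and the size n + 3*n**2 mirror the dsyev call earlier in the thread; the value of n is arbitrary.

```fortran
program heap_workspace
  implicit none
  integer, parameter :: n = 200
  real(kind(1.0d0)), allocatable :: scr(:)   ! allocatable arrays are placed on the heap
  integer :: stat

  ! n eigenvalues followed by 3*n**2 elements of work space
  allocate(scr(n + 3*n**2), stat=stat)
  if (stat /= 0) error stop 'allocation failed'
  scr = 0.0d0
  print '(a,i0,a)', 'allocated ', size(scr), ' workspace elements on the heap'
  deallocate(scr)
end program heap_workspace
```

This only moves large arrays off the stack. It does not change the size of an integer argument that a callee writes through.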

 
