Solved: Is this a bug in ifort?

Alexandre_P_ · ‎05-05-2015

I think I'm observing a bug in ifort.

$> ifort test.f90 -O1 -g && ./a.out
    6 0 0 0 0 0 0
    1 0
$> ifort test.f90 -O0 -g && ./a.out
    6 0 0 0 0 0 0
    6 0 0 0 0 0 0

The second result is the correct one, and I see no reason for the difference.

File test.f90:

   module useless_module
    ! this module is useless
    ! remove it and the bug disappear
      implicit none
    ! those variables are useless
    ! they will never be touched
    ! remove one of them and the bug disappear
    ! rename one of them and the bug disappear
      integer,allocatable,dimension(:) :: num_dr , &
                                          num_cf , &
                                          num_cfi, &
                                          num_num, &
                                          num_typ
    end module useless_module
    
    program test_program
      implicit none
    ! those variables are useless
    ! they will never be touched
    ! remove one of them and the bug disappear
      integer,allocatable,dimension(:) :: a1, b1, c1, d1, &
                                          e1, g1, f1, h1, &
                                          i1, j1, k1
    
      call routine_1(a1,b1,c1,d1,e1,f1,g1,h1,i1,j1,k1)
    contains
    
      subroutine routine_1(a3,b3,c3,d3,e3,f3,num_cf,num_dr, &
                           num_typ,num_num,num_cfi)
        implicit none
    ! those arguments are useless
    ! they will never be touched
    ! remove one of them and the bug disappear
          integer,allocatable,dimension(:)     :: a3,b3,c3,d3,e3,f3
    ! those arguments are useless
    ! they will never be touched
    ! remove one of them and the bug disappear
    ! rename one of them and the bug disappear
          integer,allocatable,dimension(:)     :: num_dr , &
                                                  num_cf , &
                                                  num_cfi, &
                                                  num_num, &
                                                  num_typ
    ! this variable is useless
    ! it will never be touched
    ! remove it and the bug disappear
          integer,allocatable,dimension(:)     :: g3
    ! those variables are actually used !
          integer,allocatable,dimension(:,:,:) :: h3,i3,j3
    
          allocate(h3(1,1,1),i3(1,1,1),j3(1,1,1))
    
          call routine_2(g3,i3,j3)
    
    ! here, normally, size(i3)=6 and i3= 0 0 0 0 0 0
    ! But that is not what is printed : BUG ?
    ! printing size(i3) AND i3 is mandatory to make the bug happen
          write(*,'(7i2)') size(i3),i3
          deallocate(h3,i3,j3)
      end subroutine routine_1
    
      subroutine routine_2(a2,b2,c2)
        use useless_module
        implicit none
          integer,allocatable,dimension(:)     :: a2,d2,e2,f2,g2
          integer,allocatable,dimension(:,:,:) :: b2, c2
          integer                              :: j2
    
    ! j2 has to be a variable
          j2=1
    ! allocate and deallocate some arrays
    ! not doing that will make the bug disappear
          allocate  (d2(j2),e2(1),f2(1),g2(1))
          deallocate(d2   ,e2    ,f2   ,g2)
    
          call reallocate(  c2,3,2,1)
          call reallocate(  b2,3,2,1) ;   b2=0
    
    ! here, we have size(b2)=6 and b2= 0 0 0 0 0 0
    ! printing size(b2) AND b2 is mandatory to make the bug happen
          write(*,'(7i2)') size(b2),b2
      end subroutine routine_2
    
      subroutine reallocate(a4,b4,c4,d4)
        implicit none
        integer,allocatable,dimension(:,:,:) :: a4
        integer                              :: b4,c4,d4
    
        deallocate(a4) ; allocate(a4(b4,c4,d4))
        
      end subroutine reallocate
    
    end program test_program

As you can see, I'm doing nothing fancy.
I tried to reduce the code as much as I could.
I tried on three computers running Linux (Ubuntu and Arch Linux) with three versions of ifort (15.0.0 20140723, 15.0.2 20150121 and 14.0.0 20130728).
I always see the same thing.

I don't see it with gfortran (4.8.2 or 5.1.0).

It seems big, and I'm sure I'm making a mistake, but I don't see it.

Any help will be appreciated.

Remark: I have also posted this question here.

Steven_L_Intel1 · ‎05-05-2015

Yes, this is indeed a bug in ifort. Escalated as issue DPD200369981. Thanks for the example and for pointing out what can make the problem disappear. I will let you know what we find.

Alexandre_P_ · ‎05-05-2015

Thank you,

I'm looking forward to a quick fix, because while the bug is very easy to work around in the test case,
in my real-life code I haven't found a workaround yet.

Steven_L_Intel1 · ‎05-05-2015

Does not setting -g avoid the problem for you? (I understand that it might introduce other issues!) I doubt there will be a "quick" fix - I am hoping that it will get fixed for the 16.0 release, but these things sometimes take time.

Alexandre_P_ · ‎05-05-2015

Does not setting -g avoid the problem for you?

Unfortunately no. In fact, I tried everything I could think of to avoid the issue:

playing with compile flags (with and without each of these: -O2 -O3 -g -traceback -static -xHost)
using the "only" keyword on all "use" statements in order to limit scope issues
putting the content of the program in a separate subroutine, again to avoid scope issues

But without success so far. I can continue to blindly modify random things in my code (9,000 lines), but it may take a long time.

I doubt there will be a "quick" fix - I am hoping that it will get fixed for the 16.0 release, but these things sometimes take time.

I understand that. What I meant was that I'm hoping for help on how to work around this.

The fact is that I got time on a supercomputer to benchmark my code on many cores, and it would be very sad if I were limited to -O0.

FortranFan · ‎05-05-2015

Does the structure in your reproducer and the name(s) and comments therein really reflect your actual code? If yes, what happens if you remove your "useless_module" and any references to it? Also, what happens if you replace "contained" procedures from your main program with module procedures? If you really have "useless" stuff in your code, perhaps your best bet will be to "clean up" code and retry, especially since you're making a significant investment already to procure supercomputer time.
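To make the module-procedure suggestion concrete, here is a minimal sketch of the layout I have in mind. It is not a verified workaround: the module name routines_mod is hypothetical, the argument lists are trimmed compared to the reproducer, and I have added intent attributes and an allocated() guard that the original does not have. It only shows where the procedures would live, not that the bug goes away.

```fortran
! Sketch only: routines_mod is a hypothetical name, and the unused
! arguments and variables of the original reproducer are left out.
module routines_mod
  implicit none
  private
  public :: routine_1
contains

  subroutine routine_1()
    integer, allocatable, dimension(:,:,:) :: i3, j3

    allocate(i3(1,1,1), j3(1,1,1))
    call routine_2(i3, j3)
    ! Expected here: size(i3) = 6 followed by six zeros
    write(*,'(7i2)') size(i3), i3
    deallocate(i3, j3)
  end subroutine routine_1

  subroutine routine_2(b2, c2)
    integer, allocatable, dimension(:,:,:), intent(inout) :: b2, c2

    call reallocate(c2, 3, 2, 1)
    call reallocate(b2, 3, 2, 1)
    b2 = 0
    write(*,'(7i2)') size(b2), b2
  end subroutine routine_2

  subroutine reallocate(a4, b4, c4, d4)
    integer, allocatable, dimension(:,:,:), intent(inout) :: a4
    integer, intent(in) :: b4, c4, d4

    ! Guard added for this sketch; the original deallocates unconditionally
    if (allocated(a4)) deallocate(a4)
    allocate(a4(b4, c4, d4))
  end subroutine reallocate

end module routines_mod

program test_program
  use routines_mod, only: routine_1
  implicit none

  call routine_1()
end program test_program
```

The main program then contains nothing but the call, and every procedure gets its explicit interface from the module instead of from host association with the program.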

Alexandre_P_ · ‎05-05-2015

Does the structure in your reproducer and the name(s) and comments therein really reflect your actual code?

Not really. As I said, I tried to make the test case as small and clear as possible.
Posting my 9,000-line code seemed like a bad idea.

If yes, what happens if you remove your "useless_module" and any references to it?

If you really have "useless" stuff in your code, perhaps your best bet will be to "clean up" code and retry, especially since you're making a significant investment already to procure supercomputer time.

In the actual code there are no useless things (at least none that I'm aware of). Things became useless when I tried to reduce it to the smallest test case.

Also, what happens if you replace "contained" procedures from your main program with module procedures?

That's what I was trying before ending my day of work. I will tell you tomorrow.

FortranFan · ‎05-05-2015

FWIW, the code in the original post works as expected on Windows with Intel Fortran 15.0, update 2 as well as 16.0 beta.

FortranFan · ‎05-05-2015

Alexandre P. wrote:

..

In the actual code there are no useless things (at least none that I'm aware of). Things became useless when I tried to reduce it to the smallest test case.

..

Well, in that case you can't be sure the posted code is indeed a true reproducer of the problem in your actual code.

You may want to look into using Intel Premier Support with its confidential facility for resolution with your actual code. Or try 16.0 beta or if possible, try the Windows compiler.

TimP · ‎05-05-2015

If you have 9000 lines of source code, you should consider building parts of it with various levels of optimization.

Alexandre_P_ · ‎05-05-2015

Well, in that case you can't be sure the posted code is indeed a true reproducer of the problem in your actual code.

No, you are right. I know only two ways to solve this kind of problem: reducing it and isolating it. Right now I haven't succeeded in isolating it in my code or in finding a workaround.
Anyway, this test code seems to reveal a bug, and this bug might be the explanation for my problem.

You may want to look into using Intel Premier Support with its confidential facility for resolution with your actual code. Or try 16.0 beta or if possible, try the Windows compiler.

I will consider Intel Premier Support, but I can't try on Windows. I will try to test the 16.0 beta.

If you have 9000 lines of source code, you should consider building parts of it with various levels of optimization.

That's a good idea, I'll keep it in mind.

Steven_L_Intel1 · ‎05-05-2015

FortranFan wrote:

FWIW, the code in the original post works as expected on Windows with Intel Fortran 15.0, update 2 as well as 16.0 beta.

Indeed, I discovered this when I was trying it out. I am fairly certain that it is a bug in an optimization that occurs quite early in the process. It is also probably highly dependent on memory layout.

Steven_L_Intel1 · ‎05-13-2015

I expect the fix for this to be in the 16.0 release.

Alexandre_P_ · ‎05-13-2015

OK, thanks. I'm looking forward to it.

Steven_L_Intel1 · ‎06-12-2015

I am now told that this will also be fixed in 2015 Update 5 in August.

Alexandre_P_ · ‎06-12-2015

OK, thanks

I will check this out.

Is this a bug in ifort ?