Intel® Fortran Compiler

Compiler bug (?), code works on ifort 11.1 and 12.0, broken on 12.1.3 and 13.1.0

Lev
Beginner
1,082 Views

Hi all, 

The following trivial program works fine with gfortran, ifort 11.1 20100806 and ifort 12.1.0 20110811, but crashes with a segmentation fault if compiled with ifort 12.1.3 20120212 or ifort 13.1.0 20130121:

[fortran]
program bad
  implicit none
  type x
    integer(kind=4), dimension(128) :: a
  end type x
  integer, parameter :: s = 126
  integer :: istat
  type(x), dimension(:,:,:), allocatable :: N
  allocate(N(0:s+1,0:s+1,0:s+1), stat=istat)
  if (istat /= 0) then
    print *, "allocation failed"
    stop
  end if
  N(0:s+1,0:s+1,  0) = N(0:s+1,0:s+1,s)
  N(0:s+1,0:s+1,s+1) = N(0:s+1,0:s+1,1)
  deallocate(N)
end program bad
[/fortran]

If the internal array size in the type definition is changed to anything less than 128 elements, everything works fine. Looks like a magic number of 2^30 (128*128*128*128*4) is somehow involved in this. Please advise what to do. Yes, I can re-write the corresponding loops manually and then the code starts to work, but that completely misses the point...
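For reference, the manual loop rewrite mentioned above might look roughly like the sketch below. It copies the two boundary planes one element at a time, so each assignment moves a single scalar of type x rather than a whole slice. The program name and the loop indices i and j are made up for illustration, and whether a particular compiler version avoids the temporary this way is an assumption, not something verified here.

```fortran
program explicit_copy
  implicit none
  type x
    integer(kind=4), dimension(128) :: a
  end type x
  integer, parameter :: s = 126
  integer :: istat, i, j          ! i, j are illustrative loop indices
  type(x), dimension(:,:,:), allocatable :: N

  allocate(N(0:s+1,0:s+1,0:s+1), stat=istat)
  if (istat /= 0) then
    print *, "allocation failed"
    stop 1
  end if

  ! Copy the boundary planes element by element instead of using
  ! whole-slice assignments. Planes 0, 1, s and s+1 are all distinct,
  ! so doing both copies in one loop matches the original statements.
  do j = 0, s+1
    do i = 0, s+1
      N(i,j,0)   = N(i,j,s)
      N(i,j,s+1) = N(i,j,1)
    end do
  end do

  deallocate(N)
end program explicit_copy
```

The loop order keeps the first index innermost, which follows Fortran's column-major storage.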

0 Kudos
6 Replies
Steven_L_Intel1
Employee

Try compiling with -heap-arrays. One or both assignments are creating temporary copies. I see the same behavior in 12.1.

I have asked the developers to see if the compiler can be smarter about this and filed issue DPD200241574.

Lev
Beginner

You are right: apparently the system stack size limit, not the compiler version, was the cause of the pass/fail behaviour (the older compilers were installed on another system with no stack limit, the new ones on a system with a default stack limit of 8 MB). Resetting the stack limit with ulimit -s unlimited also mitigates the problem. But why are temporary arrays allocated in this case anyway? The array slices involved do not overlap, so this should be compiled without recourse to temporary arrays, no? And what is the size of the temporary arrays involved? In our production cases the N array may take more than 2/3 of the available system RAM.

Steven_L_Intel1
Employee

I agree that no temporary is required - but the compiler may need additional analysis to determine that. I have asked the developers to add this. The temporary would be the size of the array slice, which could be very large. There is also the added time in copying the data.

In the case you show, no temporary is required. In other cases, if the compiler cannot determine that there is no overlap, it will construct a temporary.
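To put a number on the slice size mentioned above, here is a minimal sketch that computes it for the derived type from the first post. It assumes that integer(kind=4) occupies 4 bytes and that the compiler adds no padding to the type; the program and variable names are made up for illustration.

```fortran
program slice_size
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  type x
    integer(kind=4), dimension(128) :: a
  end type x
  integer, parameter :: s = 126
  type(x) :: elem                 ! only its type is inspected, never its value
  integer(int64) :: elem_bytes, slice_bytes, total_bytes

  ! storage_size (Fortran 2008) returns the size of one element in bits
  elem_bytes  = storage_size(elem, kind=int64) / 8_int64

  ! One plane N(0:s+1,0:s+1,k) holds (s+2)**2 elements,
  ! the whole array holds (s+2)**3 elements
  slice_bytes = int(s + 2, int64)**2 * elem_bytes
  total_bytes = int(s + 2, int64)**3 * elem_bytes

  print *, "bytes per element:      ", elem_bytes
  print *, "bytes per slice (temp): ", slice_bytes
  print *, "bytes in whole array:   ", total_bytes
end program slice_size
```

Under those assumptions one plane of N comes out at 128*128*512 = 8388608 bytes, i.e. exactly 8 MiB, and the whole array at 2^30 bytes. That would be consistent with the crash showing up on a system with an 8 MB stack limit only once the inner array reaches 128 elements.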

Lev
Beginner

I experimented with this a bit more and found that it is the derived type that causes inefficient code to be generated. This is a minimal test case that demonstrates the problem:

[fortran]
program test
  implicit none
  type x
    integer :: a
  end type
  integer, parameter :: s = 512
  type(x), dimension(s,s,s) :: bad
  integer, dimension(s,s,s) :: good
  bad(:,:,1) = bad(:,:,2)
  good(:,:,1) = good(:,:,2)
end program test
[/fortran]

If you check the assembler output, the "bad" array is copied using a temporary buffer, while the "good" array is copied in place, as expected.

Steven_L_Intel1
Employee

I didn't mean to suggest that the code was "improved" with -heap-arrays, only that you'd avoid the segfault. The developers are looking at the case now - you're right that the use of the derived type is important.

Steven_L_Intel1
Employee

We have improved the compiler's overlap detection to properly handle this case. The change will appear in a compiler version later this year.
