BUG: c_f_pointer and hidden copies on stack

jhgoebbert
Hello,

We came across strange behaviour in Intel Fortran 12.1 (which cannot be seen with gfortran)
while using c_f_pointer().

The program below gives a segmentation fault if "n" is large (e.g. 512),
because an unnecessary, user-hidden copy of the array is made on the stack in the line "var2=var1".
If "n" is small (e.g. 4), no segmentation fault happens because the stack size is sufficient.

c_f_pointer always returns a contiguous memory segment.
Therefore the compiler can be sure that the returned Fortran pointer is also contiguous.
A user-hidden copy is not necessary and should be avoided.

The segmentation fault can be avoided using the compiler flag "-heap-arrays".
But a copy also costs time and memory, so it slows down the code and increases the memory footprint. Performance-critical codes in particular are affected by these hidden copies.
So "-heap-arrays" is no solution.

Is this a bug? Can it be fixed?

Regards,
Jens Henrik

program cfpointer
  use, intrinsic :: iso_c_binding
  implicit none
  integer, parameter :: wp=8
  integer :: n=512
  real(wp), allocatable, target :: mem(:,:,:,:)
  real(wp), pointer :: var1(:,:,:), var2(:,:,:)
  type(c_ptr), target :: ptr

  allocate(mem(n,n,n,2))

  ptr=c_loc(mem(1,1,1,1))
  call c_f_pointer(ptr,var1,[n,n,n])
  ptr=c_loc(mem(1,1,1,2))
  call c_f_pointer(ptr,var2,[n,n,n])

  write(*,*) 'A'
  var1=5.0_wp
  write(*,*) 'B'
  var2=0.0d0
  write(*,*) 'C'
  var2=var1 ! segmentation fault
  write(*,*) 'D'

end program cfpointer
TimP
Evidently, the compiler is concerned that var1 and var2 may overlap, with unknown alignment. It must decide at compile time whether to allow for this, as it doesn't analyze the specific values you supply in allocate vs. size of arrays.
If interested, you might try variations, such as declaring SEQUENCE, and using !dir$ IVDEP, to see if you can eliminate the temporary and copies.
I attempted to run under Amplifier-XE in hopes of seeing which version is actually run when copying, and I reported the resulting failure of Amplifier.
jhgoebbert
Hi Tim,

thanks for your answer.

We understand your argument that the compiler is concerned that var1 and var2 may overlap.
This is really bad news :(
Fortran generally assumes that memory does not overlap, so shouldn't the Intel compiler assume this by default?

If we simplify the problem like this, Fortran pointers seem to be the problem in general:
program test
  implicit none
  real(kind=8),dimension(:,:,:),pointer :: a,b
  allocate(a(1024,1024,1024),b(1024,1024,1024))
  a=1.0
  b=2.0
  b=a ! segmentation fault
end program test

If you replace "pointer" by "allocatable" everything works fine.
Again Intel creates a hidden copy on the stack :(

Regards,
Jens Henrik







jhgoebbert
Hi,

to simplify the program even more, do this:

program test
  implicit none
  real(kind=8),dimension(:,:,:),pointer :: a,b
  real(kind=8),dimension(:,:,:),allocatable :: mem_a, mem_b
  allocate(mem_a(1024,1024,1024))
  allocate(mem_b(1024,1024,1024))
  a=1.0
  b=2.0
  b=a ! segmentation fault
end program test

For this program the compiler should be able to determine at compile time that the arrays do not overlap.
Or at least at run time. But it does not.

Our CFD code does nothing but work on large arrays, which are passed from function to function using pointers. All calculation time goes into constructs like ptr1=ptr2*ptr3 and similar.

The more I think about it, the more I come to the conclusion that this is a critical bug, especially for high-performance codes. The Intel compiler must have a flag to turn off these hidden copies.

Some more information I found here: http://software.intel.com/en-us/forums/showthread.php?t=85104
and a nice example with the same problem: http://fftw.org/doc/Allocating-aligned-memory-in-Fortran.html#Allocating-aligned-memory-in-Fortran

Regards,
Jens Henrik
TimP
In general, those pointers to 2 places in a single allocatable array could overlap in any way you choose.
In your new example, it does appear that the compiler should recognize separately allocated arrays as not overlapping, but the code assigns values to pointers without association, so it seems the first bug would be in not reporting that. Not that I care to pose as an expert in such things.
I have cases of long standing where ifort makes unnecessary temporaries but gfortran does not, making the use of f90 array assignments uncompetitive. Slow but significant progress has been made over the years. I was informed months ago there would be no more work on that issue, yet there has been additional progress since.
Anonymous66
The compiler catches assignment of values to non-associated pointers if the option "-check pointers" or "-check all" is used.
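
For reference, a minimal sketch of the simplified example with the pointers actually associated before use. It assumes the intent was for a and b to point at mem_a and mem_b; the TARGET attribute, the explicit associated() check and the small n are additions for illustration and are not in the posted code.

```fortran
program assoc_sketch
  implicit none
  integer, parameter :: wp = kind(1.0d0)
  integer, parameter :: n = 32   ! assumption: kept small so the sketch runs anywhere
  real(wp), allocatable, target :: mem_a(:,:,:), mem_b(:,:,:)
  real(wp), pointer :: a(:,:,:) => null(), b(:,:,:) => null()

  allocate(mem_a(n,n,n), mem_b(n,n,n))

  ! Associate each pointer with its target before assigning through it.
  a => mem_a
  b => mem_b
  if (.not. (associated(a) .and. associated(b))) error stop 'pointer not associated'

  a = 1.0_wp
  b = 2.0_wp
  b = a   ! a and b are still pointers, so a compiler may use a temporary here

  if (any(b /= 1.0_wp)) error stop 'unexpected values in b'
  print '(a)', 'ok'
end program assoc_sketch
```

This only removes the undefined behaviour of assigning through unassociated pointers; whether the temporary in "b = a" goes away still depends on the compiler.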
Steven_L_Intel1
Employee
In general, the language allows overlap between any two POINTER variables. We have seen several cases where gfortran did not consider this and delivered wrong results because it did not make a temp.
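
A small sketch of the kind of overlap meant here, with made-up names and sizes: two pointers into the same array, shifted by one element. Fortran semantics require the right-hand side to be fully evaluated before the left-hand side is modified, so a compiler that cannot rule out overlap has to copy through a temporary (or copy in a safe order).

```fortran
program overlap_sketch
  implicit none
  integer, target  :: x(5) = [1, 2, 3, 4, 5]
  integer, pointer :: p(:), q(:)

  p => x(1:4)
  q => x(2:5)   ! q overlaps p, shifted by one element

  ! Array assignment: all of p is evaluated before q is modified.
  q = p

  ! Correct result is x = [1, 1, 2, 3, 4].
  ! A naive forward element loop without a temporary would give [1, 1, 1, 1, 1].
  if (any(x /= [1, 1, 2, 3, 4])) error stop 'overlap handled incorrectly'
  print '(a)', 'ok'
end program overlap_sketch
```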
jhgoebbert
Hi Tim, Hi Steve,

I am happy to hear that Intel is aware of these unnecessary temporaries.
We figured that using pointers in high-performance codes is not a good idea when compiling with ifort.

If we have the pointers a, b, c to 3D arrays of shape (1024,1024,1024) and do the calculation:
a=a+b**2+c**2
this is _much slower_ (because of unnecessary temporaries) than using allocatables instead of pointers.

We were not aware of that.
Can we fix this with some more modern Fortran techniques?

So do we have to change code pieces to something like this?
-----------
do k=lbound(a,3), ubound(a,3)
  do j=lbound(a,2), ubound(a,2)
    do i=lbound(a,1), ubound(a,1)
      a(i,j,k) = a(i,j,k) + b(i,j,k)**2 + c(i,j,k)**2
    end do
  end do
end do
-----------
Would this avoid temp arrays?

I found this thread at http://compgroups.net/comp.lang.fortran/c_f_pointer-aliasing-and-performance/598845
in which you, Steve, also participated.

From there, the easiest solution seems to me to be what is posted as:
"Pass the data to a subroutine, and "forget" to tell the subroutine
that the data is a TARGET. Do all your computations in the subroutine."

Does that work?
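
A minimal sketch of how I read that suggestion (module, routine name and sizes are made up for illustration): the dummy arguments are plain assumed-shape arrays without POINTER or TARGET, so inside the routine the compiler is allowed to assume they do not alias. Whether ifort then really drops the temporary is an assumption that would have to be checked, e.g. in the optimization report.

```fortran
module kernels
  implicit none
  integer, parameter :: wp = kind(1.0d0)
contains
  ! Dummies are plain arrays (no POINTER, no TARGET attribute), so the
  ! compiler may assume that a, b and c do not overlap each other.
  subroutine add_squares(a, b, c)
    real(wp), intent(inout) :: a(:,:,:)
    real(wp), intent(in)    :: b(:,:,:), c(:,:,:)
    a = a + b**2 + c**2
  end subroutine add_squares
end module kernels

program no_target_sketch
  use kernels
  implicit none
  integer, parameter :: n = 32   ! assumption: small size so the sketch runs anywhere
  real(wp), pointer :: a(:,:,:), b(:,:,:), c(:,:,:)

  allocate(a(n,n,n), b(n,n,n), c(n,n,n))
  a = 1.0_wp
  b = 2.0_wp
  c = 3.0_wp

  ! Pointer actual arguments are passed to non-pointer dummies.
  call add_squares(a, b, c)

  ! 1 + 2**2 + 3**2 = 14 is exactly representable.
  if (any(a /= 14.0_wp)) error stop 'unexpected result'
  print '(a)', 'ok'
  deallocate(a, b, c)
end program no_target_sketch
```

The price is that the caller takes over the responsibility: passing overlapping arrays to add_squares while one of them is modified would be non-conforming.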

Another solution seems to be Cray pointers and the compile option "safe_cray_ptr".
But this makes the code non-portable.

IBM's xlf has a compiler option called "-qalias=noaryovrlp".
(http://publib.boulder.ibm.com/infocenter/comphelp/v101v121/index.jsp?topic=/com.ibm.xlf121.aix.doc/compiler_ref/alias.html)
I would like to see something similar for the Intel Fortran compiler.

Regards,
Jens Henrik
jhgoebbert
Hi Annalee,

the pointers in the example are associated (in the background) while allocating.

Regards,
Jens Henrik