Solved: Transpose causes stack overflow on allocated pointer

eos_pengwern · ‎06-02-2012

I've noticed that

[bash]A = transpose(B)[/bash] causes a stack overflow if A and B are defined via

[bash]real(kind(1d0)), pointer, dimension(:,:) :: A, B
allocate(A(700,700), B(700,700))[/bash]
and yet not if A and B are defined as allocatable arrays of the same size.

Is there a reason for this, or is it just a quirk of the way that the transpose intrinsic is implemented by the Intel compiler (I'm using 12.1.4.325)?

In my own application, it would be slightly more convenient for A and B to be pointers, so I'd have the option either of allocating them explicitly or of getting them to point to something already allocated somewhere else. Unfortunately this behaviour rules that out, so I've been forced to work around it.

mecej4 · ‎06-02-2012

In many situations where the right hand side of a Fortran assignment statement is an array expression, a temporary array may need to be allocated on the stack. If stack space is insufficient, the program may fail. The "/heap-arrays:n" compiler option may alleviate the problem somewhat.

TimP · ‎06-02-2012

In general, intrinsics such as transpose and matmul must allocate a temporary matrix to receive the result. Certain compilers suppress the temporary when there is assignment directly to a suitable array, but ifort does not. ifort does perform optimizations to elide transpose in a context such as matmul(a,transpose(a)).
As mecej4 mentioned, /heap-arrays will avoid using stack space. The :n option applies only where the size of the allocation is known at compile time.

IanH · ‎06-02-2012

Further, note when A and B do not have the pointer attribute the compiler knows that the storage areas for the two variables are distinct - they cannot overlap.

If the variables are pointers (or one is a pointer and the other has the target attribute) then it is very possible that the storage areas overlap (the variables are aliased in some way). The code generated by the compiler needs to be defensive to this in the general case. Creating a temporary copy is one way of doing this - another option is to test at runtime for overlap and then decide dynamically whether a temporary is required. For small arrays the execution penalty of the test might exceed the penalty of the temporary copy.

In simple specific cases the compiler may be able to follow the logic of the whole program and determine statically that the storage areas never overlap, but that sort of static analysis is not straightforward unless the code is trivial.

If the programmer knows that the storage areas will always be distinct, then they can code appropriately. For example, they could pass the arrays to a worker routine that takes them as non-pointer dummy arrays and then do the assignment in that worker routine.
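
A minimal sketch of the worker-routine idea, for illustration only. The names `transpose_worker` and `assign_transpose` are made up here, and whether a given compiler version actually avoids the stack temporary for this form is not something the sketch guarantees. Because the dummy arguments have neither the pointer nor the target attribute, the compiler is entitled to assume they do not overlap; calling the routine with overlapping arrays would make the program non-conforming, so the caller carries that responsibility.

```fortran
module transpose_worker
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Hypothetical worker routine: the dummy arguments are plain
  ! assumed-shape arrays (no POINTER, no TARGET), so the compiler
  ! may assume that dst and src occupy distinct storage.
  subroutine assign_transpose(dst, src)
    real(dp), intent(out) :: dst(:,:)
    real(dp), intent(in)  :: src(:,:)
    dst = transpose(src)
  end subroutine assign_transpose
end module transpose_worker

program demo
  use transpose_worker
  implicit none
  real(dp), pointer :: A(:,:), B(:,:)
  integer :: i, j

  allocate(A(700,700), B(700,700))

  ! Fill B with values that identify their own position.
  do j = 1, 700
    do i = 1, 700
      B(i,j) = real(i, dp) + 1000.0_dp*real(j, dp)
    end do
  end do

  ! The pointer arrays are passed as ordinary array arguments; the
  ! assignment itself happens inside the worker routine.
  call assign_transpose(A, B)

  ! Spot check one element rather than comparing whole arrays.
  if (A(3,5) /= B(5,3)) error stop "transpose mismatch"
  print *, "ok"

  deallocate(A, B)
end program demo
```

Putting the routine in a module gives it an explicit interface, which assumed-shape dummy arguments require.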

eos_pengwern · ‎06-04-2012

That sounds like the solution; allocatable arrays are obviously distinct from one another, whereas pointer arrays may not be. In my actual application there is another degree of indirection, since the pointer-defined arrays were components of derived types which were themselves addressed via pointers. The compiler is obviously able to see through this latter level, but plays safe when it encounters the arrays themselves.
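
For the layout described here, a rough sketch of how the same worker-routine approach might look when the arrays are pointer components of derived types that are themselves reached through pointers. The type `field_t`, its component `values`, and the routine `assign_transpose` are hypothetical names invented for this example; the actual application's types are not shown in the thread.

```fortran
module field_mod
  implicit none
  integer, parameter :: dp = kind(1d0)

  ! Hypothetical derived type with a pointer array component.
  type :: field_t
    real(dp), pointer :: values(:,:) => null()
  end type field_t

contains
  ! Non-pointer, non-target dummies: the caller promises no overlap.
  subroutine assign_transpose(dst, src)
    real(dp), intent(out) :: dst(:,:)
    real(dp), intent(in)  :: src(:,:)
    dst = transpose(src)
  end subroutine assign_transpose
end module field_mod

program demo_components
  use field_mod
  implicit none
  type(field_t), pointer :: fa, fb

  ! Two levels of indirection: pointers to objects whose
  ! components are themselves pointer arrays.
  allocate(fa, fb)
  allocate(fa%values(700,700), fb%values(700,700))
  call random_number(fb%values)

  ! Pass the components as ordinary array arguments.
  call assign_transpose(fa%values, fb%values)

  ! Spot check a single element.
  if (fa%values(2,7) /= fb%values(7,2)) error stop "transpose mismatch"
  print *, "ok"

  deallocate(fa%values, fb%values)
  deallocate(fa, fb)
end program demo_components
```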