Intel® Fortran Compiler

Bug report: segmentation fault

maidantal
Beginner
The code below demonstrates a segmentation fault at run time.
File tbug.for:
----------------------------------------------------------------------------------
      integer n,ierr
      parameter (n=1256)
c segmentation fault for pointers with n>1255
      double precision,pointer :: a(:,:),a2(:,:)
c no errors for allocatable arrays
c     double precision,allocatable :: a(:,:),a2(:,:)
      allocate (a(1:n,1:n),a2(1:n,1:n),stat=ierr)
      a=1.
      write(*,*) ierr
      a2=matmul(a,a)
      end
-------------------------------------------------------------------------------------------
Compiler and program output:
-------------------------------------------------------------------------------------------
[root@localhost axoscdist]# ifort -O0 -debug all tbug.for
[root@localhost axoscdist]# ./a.out
0
Segmentation fault
------------------------------------------------------------------------------------
System information:
---------------------------------------------------------------------------------------------------------
[root@localhost axoscdist]# uname -a
Linux localhost.localdomain 2.6.24.5-85.fc8 #1 SMP Sat Apr 19 11:18:09 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost axoscdist]# rpm -qa | grep glibc
glibc-devel-2.7-2
glibc-common-2.7-2
glibc-devel-2.7-2
glibc-2.7-2
glibc-headers-2.7-2
glibc-2.7-2
------------------------------------------------------------------------------------------------------------
Compiler information:
-------------------------------------------------------------------------------------------------------------------------
[root@localhost axoscdist]# ifort -V
Intel Fortran Compiler for applications running on Intel 64, Version 10.1 Build 20080312 Package ID: l_fc_p_10.1.015
-------------------------------------------------------------------------------------------------------------------------------------
I apologize if this forum is the wrong place for this post. If so, maybe somebody can recommend the proper way to submit a bug report.
TimP
Honored Contributor III
Did you check for stack overflow, either by running under a debugger, by raising your stack limit, or by using the -heap-arrays option? If matmul is aggravating the problem (in your real application) by allocating its own array, which is filled and then copied to a2, you could file a feature request asking whether the compiler could recognize cases where the matmul "temporary" can be avoided (not by recognizing unused code, as in your example).
It looks like my copy of ifort optimizes away the explicit and implicit allocate, so it's not strictly correct to call the extra array for the matmul result a temporary. Your example ends up spending a long time copying data, making 2 copies of your big matrix.
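For a sense of scale, here is a rough sketch that estimates the storage one n-by-n double precision array needs, which is what a matmul temporary for this example would occupy. It assumes 8 bytes per element and free-form source (for example a .f90 file); the program name temp_size is made up for illustration. The figure can be compared with the stack limit that `ulimit -s` reports.

```fortran
program temp_size
  implicit none
  integer, parameter :: n = 1256
  ! Assumption: one double precision element occupies 8 bytes,
  ! as is usual on x86_64.
  double precision, parameter :: bytes_per_elem = 8.0d0
  double precision :: bytes

  ! Storage for a single n x n array, e.g. the matmul result.
  bytes = bytes_per_elem * dble(n) * dble(n)

  ! Report the size in MiB (1 MiB = 1048576 bytes).
  write (*, '(a,f8.2,a)') 'one n x n array: ', bytes / 1048576.0d0, ' MiB'
end program temp_size
```

For n=1256 this comes to roughly 12 MiB per array, so a single extra copy on the stack is already a sizeable request.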
Bonnie_A_Intel
Employee
Regarding your concern about where to submit your technical questions, we welcome all technical questions regarding the Intel Fortran Compilers for Linux* and Mac OS* X on this Forum.
maidantal
Beginner
The heap-arrays option does resolve the problem, so it is a stack overflow. But why doesn't it appear, even without heap-arrays, if allocatable arrays a and a2 are used in place of pointers? Regarding optimization, I don't believe the time spent copying data is important in this case, since it involves order n**2 operations, far fewer than the order n**3 operations of the matrix multiplication.
TimP
Honored Contributor III
Perhaps you found a variant where the compiler didn't see that the allocation could be optimized away at compile time, and that had the side effect of eliminating the need for -heap-arrays.
You would have to test the performance of your intended usage to determine whether allocating and copying an additional array matters in your case. I agree that the additional cache misses cannot be worse than order n**2, but each of them is more expensive than one of the order n**3 vectorized in-cache operations.
I brought it up because it nearly doubled the demand for stack space in your example.
In my own tests, with quite different parameters from yours, the additional allocation and copy alone cost about 20% additional run time, largely due to cache misses. I compared performance of matmul in ifort and gfortran against Intel MKL, including the gfortran option to substitute the BLAS function for matmul, which tests MKL with the addition of the temporary.
Steven_L_Intel1
Employee
The compiler knows, because of Fortran language rules, that ALLOCATABLE arrays are contiguous and don't overlap. It doesn't know that about POINTER arrays, and thus creates the temporary result. To do otherwise would require generating two different code paths and doing a run-time test for overlap. Feasible, but not currently implemented and of dubious benefit.

My advice is to use ALLOCATABLE unless you absolutely need the additional features of POINTER.
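
To illustrate the overlap problem described above, here is a minimal sketch (free-form source, made-up program name alias_demo, tiny 2x2 arrays chosen only for illustration) in which two POINTER arrays refer to the same storage. The assignment is legal Fortran, and the right-hand side has to be fully evaluated before anything is stored into a2, which is why a compiler that cannot rule out this situation may fall back on a temporary.

```fortran
program alias_demo
  implicit none
  double precision, pointer :: a(:,:), a2(:,:)
  integer :: ierr

  allocate (a(2,2), stat=ierr)
  a = 1.0d0

  ! Pointer assignment: a2 now refers to the same storage as a.
  ! Two distinct ALLOCATABLE arrays can never share storage this way.
  a2 => a
  write (*,*) 'a2 aliases a: ', associated(a2, a)

  ! Source and destination overlap completely, so the product must be
  ! computed in full before any element of a2 is overwritten.
  a2 = matmul(a, a)

  ! Each element should be 2, the product of two 2x2 matrices of ones.
  write (*,*) a2

  deallocate (a)
end program alias_demo
```

Declaring a and a2 as ALLOCATABLE instead makes the pointer assignment impossible, so this kind of overlap cannot arise.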
TimP
Honored Contributor III
Good explanation Steve, thanks. So unnecessary use of POINTER can easily lead to stack overflow or significant time spent copying heap.