The code below reproduces a segmentation fault at run time.
file tfor.for:
-------------------------------------------------------------------------------------------
      integer n,ierr
      parameter (n=1256)
c segmentation fault for pointers with n>1255
      double precision,pointer :: a(:,:),a2(:,:)
c no errors for allocatable arrays
c     double precision,allocatable :: a(:,:),a2(:,:)
      allocate (a(1:n,1:n),a2(1:n,1:n),stat=ierr)
      a=1.
      write(*,*) ierr
      a2=matmul(a,a)
      end
-------------------------------------------------------------------------------------------
Compiler and program output:
-------------------------------------------------------------------------------------------
[root@localhost axoscdist]# ifort -O0 -debug all tbug.for
[root@localhost axoscdist]# ./a.out
0
Segmentation fault
------------------------------------------------------------------------------------
System information:
---------------------------------------------------------------------------------------------------------
[root@localhost axoscdist]# uname -a
Linux localhost.localdomain 2.6.24.5-85.fc8 #1 SMP Sat Apr 19 11:18:09 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost axoscdist]# rpm -qa | grep glibc
glibc-devel-2.7-2
glibc-common-2.7-2
glibc-devel-2.7-2
glibc-2.7-2
glibc-headers-2.7-2
glibc-2.7-2
------------------------------------------------------------------------------------------------------------
Compiler information:
-------------------------------------------------------------------------------------------------------------------------
[root@localhost axoscdist]# ifort -V
Intel Fortran Compiler for applications running on Intel 64, Version 10.1 Build 20080312 Package ID: l_fc_p_10.1.015
-------------------------------------------------------------------------------------------------------------------------------------
I apologize if this forum is the wrong place for this post. If so, could anybody recommend the proper way to submit a bug report?
6 Replies
Did you check for stack overflow, either by running under a debugger, by adjusting your stack limit, or by using the heap-arrays option? If matmul is aggravating the problem (in your real application) by allocating its own array, which is filled and then copied to a2, you could file a feature request asking whether the compiler could recognize cases where the matmul "temporary" can be avoided (not by recognizing unused code, as in your example).
It looks like my copy of ifort optimizes away the explicit and implicit allocate, so it's not strictly correct to call the extra array for the matmul result a temporary. Your example ends up spending a long time copying data, making 2 copies of your big all-ones matrix.
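To judge whether a stack overflow is plausible, it helps to know how large one array of the posted size is. The following is a minimal free-form sketch, not part of the original post; the program name and variable names are made up for illustration. It only computes the size of a single n x n double precision array, which is also the size an extra matmul result array would need. With 8-byte double precision this comes to roughly 12 MiB, which can then be compared with the shell's stack limit (in bash, `ulimit -s` reports it in KiB). The poster's actual limit is not given in the thread.

```fortran
program temp_size
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 1256              ! matrix order from the original post
  integer(int64) :: bytes_per_elem, bytes_per_matrix

  ! storage_size returns the size in bits of one element
  bytes_per_elem   = storage_size(1.0d0, kind=int64) / 8_int64
  bytes_per_matrix = int(n, int64) * int(n, int64) * bytes_per_elem

  print '(a,i0,a)', 'one n x n double precision array: ', bytes_per_matrix, ' bytes'
  print '(a,f0.2,a)', 'that is ', &
        real(bytes_per_matrix, kind(1.0d0)) / 1048576.0d0, ' MiB'
end program temp_size
```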
Regarding your concern about where to submit your technical questions, we welcome all technical questions regarding the Intel Fortran Compilers for Linux* and Mac OS* X on this Forum.
The heap-arrays option does resolve the problem, so it is a stack overflow. But why doesn't it appear, even without heap-arrays, when allocatable arrays a and a2 are used in place of pointers? Regarding optimization, I don't believe the time spent copying data matters in this case, since copying involves n**2 operations, far fewer than the n**3 operations of the matrix multiplication.
Perhaps you found a variant where the compiler didn't see that the allocation could be optimized away at compile time, and that had the side effect of eliminating the need for -heap-arrays.
You would have to test performance of your intended usage to determine whether allocation and copying of an additional array would matter in your performance. I agree that the additional cache misses cannot be worse than order n**2, but those are more expensive than the order n**3 vectorized in-cache operations.
I brought it up because it nearly doubled the demand for stack space in your example.
In my own tests, with quite different parameters from yours, the additional allocation and copy alone cost about 20% additional run time, largely due to cache misses. I compared performance of matmul in ifort and gfortran against Intel MKL, including the gfortran option to substitute the BLAS function for matmul, which tests MKL with the addition of the temporary.
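One way to check the cost of an extra copy for a given problem size is to time the product and the copy separately. The sketch below is only an illustration under stated assumptions: the size n = 500 and all names are hypothetical, the explicit array tmp stands in for a compiler-generated temporary, and a compiler may still introduce (or remove) temporaries of its own, so the numbers are indicative at best. It does not reproduce the MKL or gfortran comparison described above.

```fortran
program matmul_copy_timing
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 500               ! hypothetical size, chosen to keep the run short
  double precision, allocatable :: a(:,:), tmp(:,:), c(:,:)
  integer(int64) :: t0, t1, t2, rate

  allocate (a(n,n), tmp(n,n), c(n,n))
  a = 1.0d0

  call system_clock(t0, rate)
  tmp = matmul(a, a)                          ! product stored in an explicit temporary
  call system_clock(t1)
  c = tmp                                     ! the extra copy under discussion
  call system_clock(t2)

  ! Printing an element keeps the compiler from discarding the work.
  print *, 'c(1,1) =', c(1,1)
  print *, 'matmul seconds:', real(t1 - t0, kind(1.0d0)) / real(rate, kind(1.0d0))
  print *, 'copy seconds:  ', real(t2 - t1, kind(1.0d0)) / real(rate, kind(1.0d0))
end program matmul_copy_timing
```

Repeating the measurement several times and at the sizes of the real application gives a more reliable picture than a single run.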
The compiler knows, because of Fortran language rules, that ALLOCATABLE arrays are contiguous and don't overlap. It doesn't know that about POINTER arrays, and thus creates the temporary result. To do otherwise would require generating two different code paths and doing a run-time test for overlap. Feasible, but not currently implemented and of dubious benefit.
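The overlap case can be made concrete with a small free-form sketch. It is an illustration only, with hypothetical names and a tiny size: two POINTER arrays are associated with the same target, which the language permits, so the result of matmul cannot safely be written straight into the left-hand side while the operands are still being read.

```fortran
program pointer_alias_demo
  implicit none
  integer, parameter :: n = 3                 ! small hypothetical size for illustration
  double precision, target  :: storage(n, n)
  double precision, pointer :: a(:,:), a2(:,:)

  storage = 1.0d0
  a  => storage
  a2 => storage                               ! a2 now names the same memory as a

  ! The right-hand side must be evaluated completely before any element
  ! of a2 is stored, so the product cannot be formed in place in a2.
  a2 = matmul(a, a)

  print *, associated(a, a2)                  ! T: both pointers share one target
  print *, a2(1, 1)                           ! every element equals n
end program pointer_alias_demo
```

Two distinct ALLOCATABLE arrays can never be in this situation, which is why the compiler can skip the extra result array for them.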
My advice is to use ALLOCATABLE unless you absolutely need the additional features of POINTER.
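Applied to the program from the original post, that advice might look like the following free-form sketch. It assumes nothing beyond what the thread states (the poster reported no error with allocatable arrays); the program name, the allocation check, and the printed element are additions for illustration.

```fortran
program tfor_allocatable
  implicit none
  integer, parameter :: n = 1256              ! size that failed with POINTER arrays
  integer :: ierr
  double precision, allocatable :: a(:,:), a2(:,:)

  allocate (a(n,n), a2(n,n), stat=ierr)
  if (ierr /= 0) stop 'allocation failed'

  a  = 1.0d0
  a2 = matmul(a, a)                           ! a and a2 are distinct, contiguous arrays

  ! Each element of a2 is a sum of n products 1*1, so a2(1,1) equals n.
  print *, ierr, a2(1,1)
end program tfor_allocatable
```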
Good explanation, Steve, thanks. So unnecessary use of POINTER can easily lead to stack overflow or to significant time spent copying data on the heap.