Hi community,
the code below fails with a segfault when b is 2,000,000 instead of 200,000.
Module ModOne
  Type :: ClassOne
  contains
    Procedure, Pass :: One => SubOne
  End type ClassOne
  Private :: SubOne
contains
  Subroutine SubOne(this)
    Implicit None
    Class(ClassOne), Intent(InOut) :: this
    Integer(kind=8) :: c1,c2,c3,a,b
    Real(kind=8), Allocatable :: tmp(:)
    a=4
    b=2000000
    Allocate(tmp(b))
    !$OMP PARALLEL private(tmp) num_threads(2)
    !$OMP DO
    Do c1=1,a
      Do c2=1,a
        Do c3=1,b
          tmp(c3)=c3
        End Do
      End Do
    End Do
    !$OMP END DO
    !$OMP END PARALLEL
  end Subroutine SubOne
end Module ModOne

Program Test
  use ModOne
  Implicit None
  Type(ClassOne) :: a
  call a%One()
End Program Test
This does not happen with num_threads(1), nor when using gfortran.
I used the commercial ifort 15.0 and the academic 16.0; both produce the same result.
The compiler command was
ifort -heap-arrays -mkl -warn nounused -warn declarations -static -O3 -qopenmp
with the MKL flags
MKL= -L$(MKLPATH) -I$(MKLINCLUDE) -lmkl_blas95_lp64 -lmkl_lapack95_lp64 -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_intel_thread.a $(MKLPATH)/libmkl_core.a -Wl,--end-group -liomp5 -lpthread
I cannot see anything wrong with the example.
Any help???
Thanks a lot and Cheers
Karl
Your code should have worked. I will let someone from Intel comment on this. In the meantime, here is a workaround: allocate tmp inside the parallel region so that each thread allocates its own copy.
Module ModOne
  Type :: ClassOne
  contains
    Procedure, Pass :: One => SubOne
  End type ClassOne
  Private :: SubOne
contains
  Subroutine SubOne(this)
    Implicit None
    Class(ClassOne), Intent(InOut) :: this
    Integer(kind=8) :: c1,c2,c3,a,b
    Real(kind=8), Allocatable :: tmp(:)
    a=4
    b=2000000
    !$OMP PARALLEL firstprivate(tmp) num_threads(2)
    Allocate(tmp(b))
    !$OMP DO
    Do c1=1,a
      Do c2=1,a
        Do c3=1,b
          tmp(c3)=c3
        End Do
      End Do
    End Do
    !$OMP END DO
    deallocate(tmp)
    !$OMP END PARALLEL
  end Subroutine SubOne
end Module ModOne

Program Test
  use ModOne
  Implicit None
  Type(ClassOne) :: a
  call a%One()
End Program Test
Jim Dempsey
I should mention, though, that if you fully optimize the above code you may find that the parallel region (or at least the DO portion) gets elided (removed), because nothing ever reads tmp. Your actual code presumably uses the values it stores in tmp, so this would not happen there.
Jim Dempsey
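To make the elision point concrete, here is a minimal sketch (not from the thread) of a test case whose loop cannot be optimized away, because each iteration feeds a value from tmp into a reduction that is printed afterwards. The program name keep_work_alive and the variable total are hypothetical, and the loop nest is reduced to two levels for brevity.

```fortran
program keep_work_alive
  implicit none
  integer(kind=8) :: c1, c3, a, b
  real(kind=8), allocatable :: tmp(:)
  real(kind=8) :: total          ! hypothetical accumulator, not in the original code

  a = 4
  b = 2000000
  total = 0.0d0

  ! tmp is unallocated here, so each thread starts with an unallocated private copy
  !$omp parallel private(tmp) num_threads(2)
  allocate(tmp(b))               ! each thread allocates its own copy
  !$omp do reduction(+:total)
  do c1 = 1, a
    do c3 = 1, b
      tmp(c3) = real(c3, kind=8)
    end do
    total = total + tmp(b)       ! read a result so the stores to tmp are observable
  end do
  !$omp end do
  deallocate(tmp)
  !$omp end parallel

  print *, total                 ! equals a*b if every iteration ran
end program keep_work_alive
```

Because total is printed, the compiler has to keep the work that produces it. Whether a given compiler elides the original loop at all depends on its optimization level.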
It's running out of thread stack space. -heap-arrays has no effect on this. You're asking OpenMP to make a private copy of a huge array for each thread, and that requires stack. If you don't have enough stack (even an "unlimited" stack isn't unlimited, and the environment variable OMP_STACKSIZE controls the per-thread stack), you'll get a segfault.
Jim's workaround is really the best approach if you want tmp to be private within the region.
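The difference between the working and the failing case is easy to quantify. The sketch below computes the size of one private copy of tmp for both values of b, assuming Real(kind=8) is an 8-byte real (written here as real64, which is an assumption about the compiler's kind numbering). The program and parameter names are hypothetical.

```fortran
program private_copy_size
  use iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: n_small = 200000_int64    ! b that works
  integer(int64), parameter :: n_large = 2000000_int64   ! b that segfaults
  integer(int64) :: bytes_per_elem

  ! storage_size returns the size of one element in bits
  bytes_per_elem = storage_size(1.0_real64) / 8

  print *, 'bytes per private copy, b =  200000:', n_small * bytes_per_elem
  print *, 'bytes per private copy, b = 2000000:', n_large * bytes_per_elem
end program private_copy_size
```

With 8-byte elements this comes to 1,600,000 bytes versus 16,000,000 bytes per thread. The default per-thread stack size is implementation-defined and is commonly only a few megabytes, which would be consistent with the smaller array fitting and the larger one not. Raising OMP_STACKSIZE above the size of the private copy is the alternative to allocating inside the region.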
Thanks Steve and Lionel. That works.
However, I have another issue: I want to add to the elements of a shared matrix in parallel:
Module ModOne
  Type :: ClassOne
    Integer :: id
  contains
    Procedure, Pass :: One => SubOne
  End type ClassOne
  Private :: SubOne
contains
  Subroutine SubOne(this,tmp)
    Implicit None
    Class(ClassOne), Intent(InOut) :: this
    Real, Intent(InOut) :: tmp(:,:)
    tmp=tmp+this%id
  end Subroutine SubOne
end Module ModOne

Program Test
  use ModOne
  Implicit None
  Type(ClassOne), Allocatable :: a(:)
  Integer :: i=50,j=20000,k
  Real, Allocatable :: tmp(:,:)
  Allocate(a(2))
  a(1)%id=1
  a(2)%id=2
  Allocate(tmp(j,i));tmp=0
  !$OMP PARALLEL SHARED(tmp) num_threads(2)
  !$OMP DO
  Do k=1,2
    call a(k)%One(tmp)
  End Do
  !$OMP END DO
  !$OMP END PARALLEL
  write(50,*) tmp
End Program Test
While the program compiles, runs, and gives correct results, I am more or less sure that it contains data races, because both threads update the whole of tmp at the same time. To avoid that I thought about a reduction clause (REDUCTION(+:tmp)), but then the program crashes (probably because it runs out of stack). Searching for REDUCTION on arrays via Google yields results, but I could not find it defined for arrays in any OpenMP manual I could get hold of. Is there any OpenMP way to achieve that? I could make local arrays bound to each object and later sum them on a single core, but that seemed to be slower.
Any ideas??
Many Thanks
Karl
From the above code, it appears that you want to take multiple ClassOne objects in array a() and accumulate the scalar id of each into all elements of array tmp. A proper parallelization would be to have each thread handle a different section of array tmp, so that no two threads ever write to the same element:
Module ModOne
  Type :: ClassOne
    Integer :: id
  contains
    Procedure, Pass :: One => SubOne
  End type ClassOne
  Private :: SubOne
contains
  Subroutine SubOne(this,tmp)
    Implicit None
    Class(ClassOne), Intent(InOut) :: this
    Real, Intent(InOut) :: tmp(:,:)
    tmp=tmp+this%id
  end Subroutine SubOne
end Module ModOne

Program Test
  use ModOne
  Implicit None
  Type(ClassOne), Allocatable :: a(:)
  Integer :: i=50,j=20000,k
  Real, Allocatable :: tmp(:,:)
  Allocate(a(2))
  a(1)%id=1
  a(2)%id=2
  Allocate(tmp(j,i));tmp=0
  !$OMP PARALLEL SHARED(tmp) PRIVATE(k) num_threads(2)
  !$OMP SECTIONS
  !$OMP SECTION
  ! first half of the columns of tmp
  Do k=1,2
    call a(k)%One(tmp(:,1:ubound(tmp,DIM=2)/2))
  End Do
  !$OMP SECTION
  ! second half of the columns of tmp
  Do k=1,2
    call a(k)%One(tmp(:,ubound(tmp,DIM=2)/2+1:))
  End Do
  !$OMP END SECTIONS
  !$OMP END PARALLEL
  write(50,*) tmp
End Program Test
Jim Dempsey
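The two hand-written sections above tie the code to exactly two threads. A variant of the same idea, sketched here as one possible alternative rather than as what the thread settled on, is to distribute the columns of tmp with a parallel DO and let each thread apply every object to its own columns. The module is unchanged from the post above. The loop index c and the min/max check at the end are additions for illustration.

```fortran
Module ModOne
  Type :: ClassOne
    Integer :: id
  contains
    Procedure, Pass :: One => SubOne
  End type ClassOne
  Private :: SubOne
contains
  Subroutine SubOne(this,tmp)
    Implicit None
    Class(ClassOne), Intent(InOut) :: this
    Real, Intent(InOut) :: tmp(:,:)
    tmp=tmp+this%id
  end Subroutine SubOne
end Module ModOne

Program Test
  use ModOne
  Implicit None
  Type(ClassOne), Allocatable :: a(:)
  Integer :: i=50,j=20000,k
  Integer :: c                      ! column index (added for this sketch)
  Real, Allocatable :: tmp(:,:)
  Allocate(a(2))
  a(1)%id=1
  a(2)%id=2
  Allocate(tmp(j,i));tmp=0
  ! Each iteration touches only column c, so threads never write the same element.
  !$OMP PARALLEL DO SHARED(tmp,a) PRIVATE(c,k) num_threads(2)
  Do c=1,ubound(tmp,DIM=2)
    Do k=1,size(a)
      call a(k)%One(tmp(:,c:c))     ! one-column section keeps the rank-2 interface
    End Do
  End Do
  !$OMP END PARALLEL DO
  ! Every element should now hold the sum of all ids.
  print *, minval(tmp), maxval(tmp)
End Program Test
```

Because the work is split over tmp rather than over the objects, no reduction clause and no per-thread copy of tmp is needed, and the same code works for any thread count up to the number of columns.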