I have attached a bit of F90 code that I am using to try to speed up reading a number of files. It has 3 OpenMP parallelisation techniques:
(a) allocate array INPUTS, then enter PAR with PRIVATE(INPUTS), since each thread reads its own data into INPUTS from a different file
(b) within the PAR DO combo with PRIVATE(INPUTS), allocate-read-deallocate per iteration
(c) have a PAR fork with PRIVATE(INPUTS), then allocate INPUTS on each thread, then a distributed DO to share the reading of files over threads
Despite trying different machines and different versions of the Intel ifort compiler, (a) gives a sigsegv even with 2 threads, sometimes at entry to the PAR DO and sometimes on the 3rd iteration of the loop, whereas the other 2 approaches both work fine. As far as I can see the memory usage is no bigger for (a), so why the sigsegv? E.g.:
file read: chkSum: 5243196.
file closed successfully
file read: chkSum: 5243196.
file closed successfully
PAR-(c) reads took: 41.4187059402466
Program received signal SIGSEGV, Segmentation fault.
0x0000000000406c67 in L_MAIN___132__par_loop3_2_2 () at thrasher.f90:132
132 !$OMP PARALLEL DO DEFAULT(NONE) PRIVATE(filename, myStatus, stream, inputs)
Missing separate debuginfos, use: debuginfo-install libgcc-4.4.7-23.el6.x86_64
(gdb) l
127 ! read files in PARALLEL (a) single alloc pre-PAR
128 start=omp_get_wtime()
129 allocate(inputs(numReals), stat=myStatus)
130 if (myStatus /= 0) stop 'error allocating par-(a) inputs'
131
132 !$OMP PARALLEL DO DEFAULT(NONE) PRIVATE(filename, myStatus, stream, inputs)
133 do i=1, numFiles
134 stream=50+i
135 filename(1:5)="fort."
136 filename(6:8)=val2str(i)
(gdb) quit
! read files in PARALLEL (a) single alloc pre-PAR
start=omp_get_wtime()
!$OMP PARALLEL DEFAULT(NONE) PRIVATE(filename, myStatus, stream, inputs) SHARED(numReals)
allocate(inputs(numReals), stat=myStatus)
if (myStatus /= 0) stop 'error allocating par-(a) inputs'
!$OMP DO
do i=1, numFiles
   stream=50+i
   filename(1:5)="fort."
   filename(6:8)=val2str(i)
   write(*,*) 'opening file: ', filename
   open(stream, file=filename, status="old", form="formatted", action="read", iostat=myStatus)
   if (myStatus /= 0) stop 'error opening existing file'
   read(stream, *) inputs(1:numReals)
   write(*,*) ' file read: chkSum:', sum(inputs)
   close(stream, iostat=myStatus)
   if (myStatus /= 0) stop 'error closing file'
   write(*,*) ' file closed successfully'
end do
!$OMP END DO
deallocate(inputs)
!$OMP END PARALLEL
write(*,*) 'PAR-(a) reads took:', omp_get_wtime()-start
If that has issues, then change inputs from PRIVATE to FIRSTPRIVATE.
** The above requires that the array inputs is not allocated prior to entry into the parallel region.
Jim Dempsey
Hi Jim, many thanks for taking the time to reply. However, sorry, I am confused: the code snippet you posted to fix "PAR-(a)" seems to be one of the solutions I already had (specifically "PAR-(c)"), which I agree is a nice solution. (NB numReals is a PARAMETER so no data clause is required.)
My question is why "PAR-(a)" gives a sigsegv. I'm sure most people's first attempt would be to allocate memory and then set it to PRIVATE if it is indeed only a temp/local/scratch array. BUT I suspect my issue may not be to do with alloc/PRIVATE but rather to do with memory (stacksize?).
So on that note, I cloned the code, removed PAR-(b) and PAR-(c), replaced the allocatable array for PAR-(a) with a statically defined one, and still hit issues.
At which point I recalled OMP_STACKSIZE and - drum roll please - having set that (to 100M in this case), all options work nicely, it seems (*** testing ongoing).
Yrs, M
>>My question is as to why the "PAR-(a)" gives a sigsegv.
Private variables and arrays are usually small and are therefore allocated on the stack (of each non-master thread). This isn't an issue unless the private arrays are large, in which case you run the risk of stack overflow.
In the past (I haven't verified this on V19.nn) PRIVATE(array) had issues when the array was unallocated. This necessitated using FIRSTPRIVATE on the array, so that the empty array descriptor was copied (thus remaining empty, as opposed to containing junk data).
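To gauge whether a PRIVATE copy of an array could fit on a thread stack, a rough estimate can be sketched as below. This is only an illustration: the value of numReals is hypothetical (the thread does not state the real one), and the program simply reports the per-copy size next to whatever OMP_STACKSIZE is set to in the environment. With a 4-byte default real, 5,000,000 elements come to 20,000,000 bytes per copy, which is well above a stack of a few megabytes.

```fortran
program private_footprint
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  ! numReals is a hypothetical value chosen for illustration only
  integer, parameter :: numReals = 5000000
  real :: probe
  integer(int64) :: bytesPerCopy
  character(len=32) :: stackSetting
  integer :: envStatus

  ! storage_size returns the size of one element in bits
  bytesPerCopy = int(numReals, int64) * int(storage_size(probe) / 8, int64)
  write(*,'(a,i0,a)') 'one private copy of inputs needs about ', bytesPerCopy, ' bytes'

  ! report the OMP_STACKSIZE setting, if any, for comparison
  call get_environment_variable('OMP_STACKSIZE', stackSetting, status=envStatus)
  if (envStatus == 0) then
    write(*,'(2a)') 'OMP_STACKSIZE = ', trim(stackSetting)
  else
    write(*,'(a)') 'OMP_STACKSIZE is not set; the runtime default applies'
  end if
end program private_footprint
```

If the per-copy size is anywhere near the stack setting, allocating inside the parallel region is the safer choice, since where the runtime places a private copy of an already allocated array is up to the implementation.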
>>OMP_STACKSIZE and - drum roll please - having set that (to 100M in this case)
While this fixed the problem in this case, it might not be suitable in all cases (or in future cases of your application). Consider what happens on a manycore system, e.g. one with 256 hardware threads, where it is beneficial to use !$OMP PARALLEL DO num_threads(some_subset).
Or, in your specific case, suppose you are on a system with 256 hardware threads and numFiles == 32: every thread in the team gets the enlarged stack setting and its own private copy of the array, yet at most 32 of them have a file to read.
For these reasons, it is much better to add 4 lines of code and perform the allocations only when and where applicable.
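A minimal sketch of that idea, under stated assumptions: the team is capped at the number of files, and each thread allocates its own buffer inside the parallel region. The program name, the demo_in.NNN file names and the values of numFiles and numReals are all hypothetical, and the program writes its own small input files so that it is self-contained. It is not the original thrasher.f90.

```fortran
program bounded_team_read
  use omp_lib
  implicit none
  ! numFiles, numReals and the demo file names are hypothetical
  integer, parameter :: numFiles = 4, numReals = 1000
  real, allocatable :: inputs(:)
  character(len=12) :: filename
  integer :: i, unitNo, myStatus, teamSize

  ! write small demo files so the sketch needs no external data
  allocate(inputs(numReals))
  do i = 1, numFiles
    inputs = real(i)
    write(filename, '(a,i3.3)') 'demo_in.', i
    open(newunit=unitNo, file=filename, status='replace', &
         form='formatted', action='write')
    write(unitNo, *) inputs
    close(unitNo)
  end do
  deallocate(inputs)   ! inputs is unallocated on entry to the region

  ! never start more threads than there are files to read
  teamSize = min(numFiles, omp_get_max_threads())

  !$omp parallel default(none) num_threads(teamSize) &
  !$omp   private(filename, myStatus, unitNo, inputs)
  ! each thread allocates its own buffer with allocate()
  allocate(inputs(numReals), stat=myStatus)
  if (myStatus /= 0) stop 'error allocating inputs'
  !$omp do
  do i = 1, numFiles
    write(filename, '(a,i3.3)') 'demo_in.', i
    open(newunit=unitNo, file=filename, status='old', &
         form='formatted', action='read', iostat=myStatus)
    if (myStatus /= 0) stop 'error opening existing file'
    read(unitNo, *) inputs
    write(*,*) trim(filename), ' chkSum:', sum(inputs)
    close(unitNo, status='delete')   ! remove the demo file again
  end do
  !$omp end do
  deallocate(inputs)
  !$omp end parallel
end program bounded_team_read
```

Build with OpenMP enabled, for example `ifort -qopenmp` or `gfortran -fopenmp`. Because only teamSize threads are created and each one allocates exactly one buffer, the memory used tracks the work available rather than the hardware thread count, and no enlarged OMP_STACKSIZE should be needed for the buffer.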
Jim Dempsey