Hi,
I am in the middle of learning OpenMP and want to apply it to our existing software.
I have some deferred-shape arrays that are defined before the parallel region and will be used after the parallel region.
These arrays are also modified in the parallel portion of the code. However, I found that a deferred-shape array is not permitted in an OpenMP firstprivate, lastprivate or reduction clause. I really do not want to change the deferred-shape arrays. What's your advice? Thanks.
Why are you making the array private?
Usually the array stays shared and your parallelization has each thread write its own stripe of the output array.
Jim
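A minimal sketch of what striped writes to a shared array might look like. The array name, its sizes and the formula filling it are made up for illustration; the only point is that a deferred-shape (allocatable) array left shared does not need to appear in any firstprivate, lastprivate or reduction clause, as long as threads write disjoint elements.

```fortran
program striped_writes
  implicit none
  integer, parameter :: m = 1000, n = 8   ! illustrative sizes (assumption)
  real, allocatable :: a(:,:)             ! deferred-shape array, allocated before the region
  integer :: i, j

  allocate(a(m, n))
  a = 0.0

  ! a is shared; only the loop indices are private.
  ! Each thread is handed a disjoint set of columns j, so no two threads
  ! ever write the same element of a.
  !$omp parallel do default(shared) private(i, j)
  do j = 1, n
     do i = 1, m
        a(i, j) = real(i) + real(j)       ! placeholder computation
     end do
  end do
  !$omp end parallel do

  ! a is still allocated after the region and holds every thread's results.
  print *, a(1, 1), a(m, n)
  deallocate(a)
end program striped_writes
```

With Intel Fortran this would typically be built with the OpenMP option enabled; without it the directives are treated as comments and the program runs serially with the same result.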
Jim,
I have nested loops in my code, with about 8 do loops nested together. The arrays are initialized in the outermost loop, updated in the innermost loop, and then used again in the outermost loop. However, I can only put the parallel region at the 4th inner do loop, since the outer loop also reads the scratch files from the hard disk, which cannot be parallelized.
The structure of my loop can be described by the following simple example. I can only parallelize the inner loop, and array A is the deferred-shape array. What should I do?
do I = 1, N
  ! read some data from hard disk and assign to arrays B and C
  ! calculate ii, jj
  A(ii,jj) = B(ii,jj) + C(ii,jj)
  do k = 1, M
    ! calculate array Z
    A(ii,jj) = A(ii,jj) + Z(ii,jj,k)
  enddo
  ! write array A and some other array data to hard disk for later use
enddo
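For the simplified loop above, one possible approach is to accumulate into a scalar and leave A shared, so that the deferred-shape array never has to appear in a reduction clause. This is only a sketch under assumptions: ii and jj are taken to be scalar indices that stay fixed during the k loop, z_value is a hypothetical stand-in for "calculate array Z", and the array sizes are invented.

```fortran
program scalar_reduction
  implicit none
  integer, parameter :: m = 100           ! illustrative trip count (assumption)
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  real :: acc
  integer :: ii, jj, k

  allocate(a(4, 4), b(4, 4), c(4, 4))
  b = 1.0
  c = 2.0
  ii = 2                                  ! stand-ins for "calculate ii,jj"
  jj = 3

  ! Sum the Z contributions into a private scalar copy per thread;
  ! OpenMP combines the copies when the loop ends.
  acc = 0.0
  !$omp parallel do default(shared) private(k) reduction(+:acc)
  do k = 1, m
     acc = acc + z_value(ii, jj, k)
  end do
  !$omp end parallel do

  ! A single serial update of the shared deferred-shape array.
  a(ii, jj) = b(ii, jj) + c(ii, jj) + acc
  print *, a(ii, jj)

contains

  ! Hypothetical replacement for the real computation of Z(ii,jj,k).
  pure function z_value(i, j, k) result(z)
    integer, intent(in) :: i, j, k
    real :: z
    z = real(i + j) * real(k)
  end function z_value

end program scalar_reduction
```

The summation order differs from the serial loop, so results can differ in the last bits for general floating-point data.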
>>However, I can only do parallel region at 4th inner do loop, since the outer loop also reads the scratch files from the hard disk which cannot be parallelized.
This is not necessarily true. What you have here is a candidate for a parallel pipeline. parallel_pipeline is supported in TBB (www.threadingbuildingblocks.org and Intel's website somewhere). Also my product QuickThread (www.quickthreadprogramming.com) supports parallel_pipeline.
*** HOWEVER ***
Before investigating a conversion away from OpenMP (considerable effort), there are a few tricks you can use to improve the parallelization of your code with OpenMP. Try a state-driven parallel region.
I will outline this in incomplete pseudo code (you convert to Fortran, add data structures and tidy up)
bBegin = .false.
bEnd = .false.
!$omp parallel
if(omp_get_thread_num() == 0) then
  ! master thread
  bBegin = .true. ! activate other team members
  !$omp flush
  iIn = 1
  iOut = 1
  do while(.not. bEnd)
    if(availableInputBuffer(whichBuffer) .and. (iIn <= N)) then
      readIntoInputBufferAndMarkAsReady(whichBuffer)
      iIn = iIn + 1
    else if(haveOutputBuffer(whichBuffer)) then
      writeToOutputFile(whichBuffer)
      iOut = iOut + 1 ! assumes sequential writes
    else if(haveDataToProcess(whichBuffer)) then
      processBuffer(whichBuffer) ! also marks buffer as done
    else
      if(iOut > N) then
        bEnd = .true. ! all N records written (assumes sequential writes)
        !$omp flush
      else
        Sleep(0) ! or _mm_pause()
      endif
    endif
  end do
  ! end of master thread section
else
  ! thread not 0 (worker threads)
  do while(.not. bBegin)
    Sleep(0) ! or _mm_pause()
    !$omp flush ! make the master's update of bBegin visible
  end do
  do while(.not. bEnd)
    if(haveDataToProcess(whichBuffer)) then
      processBuffer(whichBuffer)
    else
      Sleep(0) ! or _mm_pause()
    endif
    !$omp flush ! make the master's update of bEnd visible
  enddo
endif
!$omp end parallel
The above can be modified such that team member thread 0 does reads (and processes buffers) and team member thread 1 does writes (and processes buffers), while all other threads only process buffers.
I assume you will figure out that you will need omp_get_num_threads() sets of buffers, but the buffers are not dedicated to specific threads. Buffers are acquired for processing in a thread-safe manner.
Jim Dempsey
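The thread-safe acquisition of buffers mentioned above could be sketched roughly as follows. None of these names come from the thread: the module buffer_pool, the state constants and the function claim are hypothetical, and a named critical section is just one simple way to make the claim atomic (an assumption, not necessarily what was intended). The demo starts with every buffer marked ready and checks that each one is processed exactly once.

```fortran
module buffer_pool
  implicit none
  ! Hypothetical buffer states for illustration.
  integer, parameter :: BUF_READY = 1, BUF_BUSY = 2, BUF_DONE = 3
  integer, allocatable :: state(:)        ! one entry per buffer, shared by all threads
contains

  ! Claim one buffer that is in from_state and move it to to_state.
  ! Returns the buffer index, or 0 if no such buffer exists right now.
  function claim(from_state, to_state) result(idx)
    integer, intent(in) :: from_state, to_state
    integer :: idx, i
    idx = 0
    !$omp critical (buffer_state)
    do i = 1, size(state)
       if (state(i) == from_state) then
          state(i) = to_state             ! no other thread can claim it now
          idx = i
          exit
       end if
    end do
    !$omp end critical (buffer_state)
  end function claim

end module buffer_pool

program claim_demo
  use buffer_pool
  implicit none
  integer :: idx, nclaimed

  allocate(state(16))                     ! illustrative pool size (assumption)
  state = BUF_READY
  nclaimed = 0

  !$omp parallel private(idx) reduction(+:nclaimed)
  do
     idx = claim(BUF_READY, BUF_BUSY)
     if (idx == 0) exit                   ! nothing left to process
     ! ... process buffer idx here ...
     !$omp critical (buffer_state)
     state(idx) = BUF_DONE
     !$omp end critical (buffer_state)
     nclaimed = nclaimed + 1
  end do
  !$omp end parallel

  ! Every buffer must have been claimed by exactly one thread.
  if (nclaimed /= size(state)) error stop "buffer claimed twice or missed"
  if (any(state /= BUF_DONE)) error stop "buffer left unprocessed"
  print *, "buffers processed:", nclaimed
end program claim_demo
```

In the state-driven loop, the same claim call could back helpers such as haveDataToProcess, with additional states for buffers that are empty or waiting to be written.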