Hello,
I am writing Fortran code that calls a subroutine inside a parallel loop. The last statement in the subroutine writes an output file (one file per call). Here is the pseudo code.
program omp
   use omp_lib
   implicit none
   integer :: i,n,a(1000)
   integer :: b(1000)
   integer :: fi
   n=1000
   fi=11
   open(UNIT=fi,file='inp.txt')
!$omp parallel default(private) shared(fi,a,b,n)
!$omp do
   do i=1,n
!$omp critical
      read (fi,*) a(i)
!$omp end critical
      call expo4(a(i),b(i))
   end do
!$omp end do
!$omp end parallel
   close(fi)
   open(12,file='out.txt',action='write')
   do i=1,n
      write(12,*) b(i)
   end do
   close(12)
end program

subroutine expo4(a,b)
   implicit none
   integer, intent(in) :: a
   integer, intent(out) :: b
   character*32 :: fname
   integer :: f
   b = a**4
   write(fname,'(A,I6.6)') 'temp.',a
   open(f,file=trim(fname),action='write')
   write(f,*) b
   close(f)
   return
end subroutine
There were two issues with it when I started. 1) I used the literal number 31 as the file unit in the subroutine, which the threads could not differentiate, so I changed it to a variable. 2) There is a mismatch between what is printed in the temp.xxx files and in the array b(:); b(:) is not updated in sequence as it should be. I am not sure what I am missing here and am looking for help.
Maybe it will help you to understand what is happening if you sketch the code you have written as a narrative. This will help you understand what is happening with the layout you presented above, and knowing that will help you reformulate your method. For simplicity, assume there are 2 threads available.
You have a file, inp.txt, that has a list of n numbers.
You partition the sequence 1:n into as many pieces as there are threads (2): the first thread gets 1:n/2, the second thread n/2+1:n.
The threads may enter your read section in any order, not necessarily with alternating reads.
Thus a(1) is not necessarily the first number in the input file (it could be any of the early numbers in the input file)
The first read by the second thread, a(n/2+1), could just as well be the first number in the input file (it could also be any of the early numbers in the input file).
My interpretation is that you expect a(1:n) to be in the same sequence as the numbers appear in inp.txt.
If this is what you want to do, then declare a shared picking sequence number; I prefer to attribute it with VOLATILE (but you may use FLUSH in the code instead). The sequence number is incremented in the critical section, and a private copy of it is made for use outside the critical section. That copy is then used as the index into a(). Doing this will result in a(1:n) holding the input numbers in the order they were written.
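In Fortran terms, the pattern described above looks roughly like this (a sketch only; iSeq and myIdx are illustrative names, with iSeq declared shared and VOLATILE and initialized to 0 before the parallel region):

!$omp critical
   iSeq = iSeq + 1                 ! advance the shared picking sequence number
   myIdx = iSeq                    ! take a private copy while still inside the critical section
   read (fi,*) a(myIdx)            ! record myIdx of inp.txt lands in a(myIdx)
!$omp end critical
   call expo4(a(myIdx),b(myIdx))   ! process outside the critical section using the private copy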
You could consider using a base unit number + omp_get_thread_num() as the I/O unit number in expo4,
or use the NEWUNIT specifier in the OPEN statement.
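For example, NEWUNIT (Fortran 2008) lets the runtime hand each OPEN a distinct, unused unit number, so concurrent threads cannot collide (a sketch adapted from expo4; only the unit handling changes):

   integer :: f
   open(newunit=f,file=trim(fname),action='write')   ! runtime assigns a free unit to f
   write(f,*) b
   close(f)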
Jim Dempsey
Thanks Jim.
Yes, I wanted to read the data from inp.txt sequentially, process it, and store it back into an array. After reading your reply, I realized that inp.txt is read in sequence, but not processed in the same sequence, since the parallel loop runs with a static schedule (I assume the same holds for a dynamic schedule). My main concern was that there was no correlation between a(:) and b(:). Since I was not familiar with VOLATILE/FLUSH, I chose to add additional private variables as below.
!$omp do
   do i=1,n
!$omp critical
      read (fi,*) i1
!$omp end critical
      call expo4(i1,i2)
      a(i)=i1
      b(i)=i2
   end do
!$omp end do
Though ideally I want the arrays to be processed in sequence (maybe I am just too used to sequential execution and am a newbie to parallel :-)), the code above gives me the correlation between input and output, which I can sort any time later. Do you see any issues with this modification?
Any simpler way of getting the output in the same sequence as the input is always welcome.
Try:
program omp
   use omp_lib
   implicit none
   integer :: i,iLoop,n,a(1000)          ! add iLoop
   integer, volatile :: iVolatile        ! add iVolatile (change name if you wish)
   integer :: b(1000)
   integer :: fi
   n=1000
   fi=11
   iVolatile = 0                         ! initialize for pre-increment
   open(UNIT=fi,file='inp.txt')
!$omp parallel default(private) shared(iVolatile,fi,a,b,n)
!$omp do
   do iLoop=1,n
!$omp critical
      iVolatile = iVolatile + 1          ! pre-increment
      i = iVolatile                      ! make local copy _inside_ critical section
      read (fi,*) a(i)
!$omp end critical
      call expo4(a(i),b(i))              ! use local copy as-was inside critical section
   end do
!$omp end do
!$omp end parallel
   close(fi)
   open(12,file='out.txt',action='write')
   do i=1,n
      write(12,*) b(i)
   end do
   close(12)
end program

subroutine expo4(a,b)
   use omp_lib
   implicit none
   integer, intent(in) :: a
   integer, intent(out) :: b
   character*32 :: fname
   integer :: f
   integer, parameter :: fBaseUnit=20    ! base I/O unit for the temp files
   b = a**4
   write(fname,'(A,I6.6)') 'temp.',a
   f = fBaseUnit
!$ f = f + omp_get_thread_num()          ! expanded only if compiled with OpenMP enabled
   open(f,file=trim(fname),action='write')
   write(f,*) b
   close(f)
   return
end subroutine
Jim Dempsey
Thanks very much Jim. It worked perfectly as I wanted.
For my understanding: does making a private copy inside the critical section force the parallel sections to execute in order (a dynamic schedule with chunk size equal to the number of threads), or does only the storage happen in order while execution still follows the static schedule?
The critical section executes in thread-arbitrary order. It is whichever thread manages to get the critical section first. The static schedule only assures somewhat equal partitioning of the iteration space (not the sequencing of the thread reads).
Because the increment and the read are located within the critical section, the thread that obtains the critical section knows the record number it is about to read, and subsequently has read. Copying the record number from the shared (and volatile) variable while inside the critical section assures that you use the record number as it was during the critical section. Using iVolatile outside the critical section would induce an error (wrong index) should a different thread pick the next record number.
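A sketch of that distinction, using the variable names from the listing above:

!$omp critical
   iVolatile = iVolatile + 1
   i = iVolatile                           ! private snapshot taken inside the critical section
   read (fi,*) a(i)
!$omp end critical
   call expo4(a(i),b(i))                   ! correct: i still names the record this thread read
!  call expo4(a(iVolatile),b(iVolatile))   ! wrong: iVolatile may already have been advanced by another thread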
The parallel do, as previously listed, is appropriate to use whenever you know the work can be evenly distributed. You could use dynamic scheduling (possibly with a small or ==1 chunk size; a sketch of the schedule clause appears after the loop below). Given the way this loop is constructed, loop{critical section: pick next number, read; end critical section; process the picked record}, you could also use an indefinite do loop:
!$omp parallel default(private) shared(iVolatile,fi,a,b,n)
   do
!$omp critical
      iVolatile = iVolatile + 1       ! pre-increment
      i = iVolatile                   ! make local copy _inside_ critical section
      if(i .le. n) read (fi,*) a(i)
!$omp end critical
      if(i .gt. n) exit
      call expo4(a(i),b(i))           ! use local copy as-was inside critical section
   end do
!$omp end parallel
If the expo4 workload is unbalanced (a different processing load per item), then the loop above will be better: threads do not have a fixed number of items to work on, so the load is distributed to whichever threads become available.
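For comparison, requesting dynamic scheduling on the original worksharing loop would look roughly like this (a sketch; the chunk size of 1 is just one possible choice, and the loop body stays as in the listing above):

!$omp do schedule(dynamic,1)
   do iLoop=1,n
      ! critical-section pick (i = iVolatile) and read, then call expo4(a(i),b(i)), as before
   end do
!$omp end do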
While you could use OpenMP tasks, they could be unsuitable if the record count is very large (they consume resources, and the overhead is larger).
Jim Dempsey
I haven't looked into, or seen discussed, the reasons why a chunk size of 2 is frequently the optimum for dynamic scheduling (provided it doesn't leave threads with no work). I don't think there is a reason to set the chunk size to the number of threads. Alternatives to schedule(dynamic) include schedule(guided) and schedule(auto). Any of those may work efficiently, at least for cases where static scheduling would not give any thread more than twice the average amount of work. guided and auto use the chunk size as a minimum, with guided at least starting with a larger chunk when possible, so the chunk-size setting may not be important.
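For reference, the schedule clauses mentioned above are written on the worksharing directive like so (alternatives, one at a time; the chunk size of 2 is illustrative):

!$omp do schedule(dynamic,2)   ! chunks of 2 iterations handed out on demand
!$omp do schedule(guided,2)    ! decreasing chunk sizes, never below 2
!$omp do schedule(auto)        ! the implementation chooses the schedule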
