Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Efficiently parallelizing code in Fortran

Ajay_N_
Beginner

Hi,

I am trying to parallelize a section of my code, which is written in Fortran. The code looks like this:

do i=1,array(K)
j = K
... if conditions on K...
....write and reads on j...
... do lot of things ...
K = K+1
end do

I tried to parallelize it with the code below, which obviously does not work as intended:

!$OMP PARALLEL DO PRIVATE(j)
do i=1,50
j = K
... if conditions on K...
....write and reads on j...
... do lot of things ...
K = K+1
end do
!$OMP END PARALLEL DO

The obvious mistake is that all the threads race on the same shared K. What would be the best way to ensure that every thread gets its own incremental value of K while the threads still run in parallel?

Thanks
Ajay

Andrey_Vladimirov
New Contributor III

If work on K=1 is not dependent on the results of K=0, and work on K=2 is not dependent on the results of K=1, etc., then you should just express a mapping between the value of i and the value of j, rather than using the shared counter K.

However, if you must start processing K=0 before you can start processing K=1, and must start K=1 before you can start K=2, etc., then you should look into the ORDERED clause of OMP DO: http://software.intel.com/en-us/node/466858
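As a minimal sketch of the first suggestion, assuming K starts at some known value (called k0 here, a hypothetical name) and is incremented exactly once per iteration as in the original loop, j can be computed directly from i so that no shared counter is needed. The array seen exists only to check the result.

```fortran
program map_counter
  implicit none
  integer, parameter :: n  = 50   ! loop trip count (assumed)
  integer, parameter :: k0 = 1    ! assumed initial value of K
  integer :: i, j
  integer :: seen(n)

  seen = 0
!$OMP PARALLEL DO PRIVATE(j)
  do i = 1, n
     ! In the serial loop K starts at k0 and grows by one per iteration,
     ! so during iteration i it holds k0 + (i - 1).
     j = k0 + (i - 1)
     ! Each iteration touches a different element, so there is no race.
     seen(j - k0 + 1) = seen(j - k0 + 1) + 1
  end do
!$OMP END PARALLEL DO

  if (any(seen /= 1)) error stop "a K value was skipped or repeated"
  print *, "every K from", k0, "to", k0 + n - 1, "was handled exactly once"
end program map_counter
```

This only works when the iterations are independent; if K is incremented conditionally, the mapping would have to be worked out differently.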

Ajay_N_
Beginner

Thanks Andrey,

I was able to map j to values of i. Now I face a slightly different problem, which I think is related to OpenMP.

!$OMP PARALLEL DO PRIVATE(j)
do i=1,50
j = somefunction(i)
print *, thread_id, i
print *, thread_id, j
end do
!$OMP END PARALLEL DO

With an OpenMP loop like the one above, I noticed that j sometimes appears to be computed from a value of i that belongs to a different thread. Say the function is i+7.
Example:
Thread 0 i: 5
Thread 0 j: 9
Thread 1 i: 2
Thread 1 j: 11

Is there any reason why this should happen?

Thanks
Ajay

 

 

Andrey_Vladimirov
New Contributor III

It is hard to say without reproducer code. Perhaps some variables that should have been private were declared or assumed shared. Here is code that worked for me:

[fortran]

program omploop

  include 'omp_lib.h'

  integer :: i, j, thread_id

  write(*,'(3A10)') "Thread", "i", "j"

!$OMP PARALLEL DO PRIVATE(i, thread_id, j)
  do i = 1, 10
     thread_id = omp_get_thread_num()
     j = i_to_j(i)
     write(*,'(3I10)') thread_id, i, j
  end do

contains

  pure integer function i_to_j(i)
    integer, intent(in) :: i
    i_to_j = i + 7
  end function i_to_j

end program omploop


[/fortran]

 

Compilation and execution:

[bash]

ifort  -openmp omploop.f90 -o omploop

./omploop

    Thread         i         j
         0         1         8
         2         7        14
         1         4        11
...

[/bash]

 

Ajay_N_
Beginner

I just tried a practical example for four threads:


!$OMP PARALLEL PRIVATE(k,id)
    !$OMP DO
    DO i = 1, 10
       count = i+5
       print *,"id: ",OMP_GET_THREAD_NUM()," i    : ",i
       print *,"id: ",OMP_GET_THREAD_NUM()," count: ",count
    end do
    !$OMP END DO
!$OMP END PARALLEL

Notice that the first i for thread 2 is 7, so its count should have been i+5 = 12, but the output shows 14.

Output:
 id:            2  i    :            7
 id:            1  i    :            4
 id:            0  i    :            1
 id:            3  i    :            9
 id:            2  count:           14
 id:            1  count:           13
 id:            0  count:           10
 id:            3  count:            7
 id:            2  i    :            8
 id:            1  i    :            5
 id:            0  i    :            2
 id:            3  i    :           10
 id:            2  count:           15
 id:            1  count:           15
 id:            0  count:           11
 id:            3  count:            8
 id:            1  i    :            6
 id:            0  i    :            3
 id:            1  count:            8
 id:            0  count:            8

Thanks
Ajay

Ajay_N_
Beginner

Sorry for the formatting; for some reason the code quotes don't seem to work for me!

Andrey_Vladimirov
New Contributor III

Right, you need to make "count" a private variable.
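A minimal sketch of that fix, assuming the same loop as in the post above: count is listed in PRIVATE, so each thread works on its own copy. The result array and the final check are additions for illustration only.

```fortran
program private_count
  use omp_lib
  implicit none
  integer :: i, count
  integer :: result(10)

!$OMP PARALLEL DO PRIVATE(count)
  do i = 1, 10
     count = i + 5        ! written to this thread's own copy of count
     ! Printing i and count in one statement keeps each pair on one line.
     print *, "id: ", omp_get_thread_num(), " i: ", i, " count: ", count
     result(i) = count
  end do
!$OMP END PARALLEL DO

  ! Every iteration must have seen its own i + 5.
  if (any(result /= [(i + 5, i = 1, 10)])) error stop "count was overwritten"
end program private_count
```

The loop index i is private automatically; any other scalar that is assigned inside the loop and not meant to be shared needs to appear in a PRIVATE clause.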

Ajay_N_
Beginner

Hi Andrey,

A follow-up question. Suppose there are global variables that I have made private in an OpenMP do loop, and a function called within the loop uses those same global variables. Note that I am not passing the variables to the function; the global variables are accessible to all functions.

Does the function called in the do loop get the private copy of the calling loop, or does it behave unexpectedly? I am not making any changes to the called function.

I think I am observing inconsistent behavior in this case. Is there a way to tackle this?

Thanks

Ajay

jimdempseyatthecove
Honored Contributor III

Pass the private copy of the (otherwise) global variable as an argument. Not doing so results in the called routine using the actual global variable and not the thread's copy.

If these global variables are thread-specific rather than application-global, and you are using globals as a means to reduce the number of arguments, then consider either passing a reference to a thread context (containing the collection of variables you want specific to each thread) or using threadprivate variables. Threadprivate will have the least impact on the code.

Jim Dempsey

Ajay_N_
Beginner

Hi Jim,

Thanks for the reply. Yes, the arguments are not being passed, simply because the code was built that way by someone else, and there are a lot of functions that I would have to change. Is there a way to make the parallelized section ensure that the called functions get a private copy, without my going through all the functions and changing them?

 

Ajay

jimdempseyatthecove
Honored Contributor III

The only way to do this is to declare the variables as threadprivate (global) variables. These variables have global namespace but thread local address space.

module temp
real(8) :: A(3),B(3),C(3)
!$OMP THREADPRIVATE(A,B,C)
...
end module temp
...
subroutine usingTemp(X)
use temp
real :: X(3)
A=X ! copies X to threadprivate A
...
end subroutine usingTemp

subroutine foo
!$OMP PARALLEL DO
do i=1,nObjs
    call usingTemp(Obj(i)%pos) ! pass in object's position
end do
!$OMP END PARALLEL DO
end subroutine foo

The caveat here is that threadprivate variables are private to each thread, i.e. there is not a single global instance but one instance per thread.
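As a self-contained sketch of the same idea, assuming a module variable x and two hypothetical routines set_x and get_x that stand in for legacy code which reads and writes the global without taking it as an argument:

```fortran
module temp
  implicit none
  integer :: x = 0                 ! "global" used by the legacy routines
!$OMP THREADPRIVATE(x)
contains
  subroutine set_x(i)              ! hypothetical routine: writes the global
    integer, intent(in) :: i
    x = 10 * i
  end subroutine set_x

  integer function get_x()         ! hypothetical routine: reads the global
    get_x = x
  end function get_x
end module temp

program threadprivate_demo
  use temp
  implicit none
  integer, parameter :: n = 100
  integer :: i, seen(n)

!$OMP PARALLEL DO
  do i = 1, n
     call set_x(i)                 ! writes the calling thread's own x
     seen(i) = get_x()             ! reads the same thread's x back
  end do
!$OMP END PARALLEL DO

  if (any(seen /= [(10 * i, i = 1, n)])) error stop "threads shared x"
  print *, "each thread kept its own x"
end program threadprivate_demo
```

The called routines are unchanged apart from the THREADPRIVATE directive next to the declaration, which is why this approach touches so little existing code. Compile with OpenMP enabled, for example gfortran -fopenmp.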

Jim Dempsey

Ajay_N_
Beginner

Hi Jim,

Thanks for your example, but my case is probably a little different.

Global header file header.f90:
integer :: X

Main function:
do i=1,n
   call subroutine(i)
end do

function subroutine:
X = some function on (i)

Now the issue is that I want X to have a private copy when called within an OpenMP thread:

!$OMP PARALLEL DO PRIVATE(X)

only ensures that X is private within the do loop, not in the called functions. The moment a function is called from the OpenMP thread, the function accesses the global variable.
I know that passing the variables solves this issue, but I am trying to parallelize an existing code base that is very large, and I am looking for a quick workaround if possible.

Thanks
Ajay

jimdempseyatthecove
Honored Contributor III

If you make X threadprivate, as outlined earlier, each thread, including the main thread, will have a global variable X that is private to that thread. When the X's are truly independent there will be no issue. The only issue you may have is that you may be required to reduce the multiple instances of X into a single preferred value for use by the thread that continues outside the parallel region. An example of this could be constructed out of finding the minimum X (using the threadprivate global X's).

Global header file header.f90:
integer :: X
!$OMP THREADPRIVATE(X)
----
integer :: Xcopy ! in source with main function

Main function:
Xcopy = 999999 ! some value on the other side of the solution HUGE(X)
!$OMP PARALLEL
!$OMP DO
do i=1,n
   call subroutine(i)
end do
!$OMP END DO
!$OMP CRITICAL ! or !$OMP ATOMIC for newer version of OpenMP
if(X .lt. Xcopy) Xcopy = X
!$OMP END CRITICAL
!$OMP END PARALLEL
X = Xcopy

function subroutine:
X = some function on (I) ! X is threadprivate

Note, newer versions of OpenMP may have a MIN and MAX reduction operator
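A rough sketch of what such a reduction could look like, assuming the value can be returned from a function instead of being stored in a threadprivate global (a variable cannot be both threadprivate and a reduction variable). The names xmin and some_function are placeholders, not part of the code discussed above.

```fortran
program min_reduction
  implicit none
  integer, parameter :: n = 1000
  integer :: i, xmin

  xmin = huge(xmin)                ! start above any possible result
!$OMP PARALLEL DO REDUCTION(MIN:xmin)
  do i = 1, n
     ! Each thread updates a private xmin; OpenMP combines them at the end.
     xmin = min(xmin, some_function(i))
  end do
!$OMP END PARALLEL DO

  if (xmin /= 3) error stop "unexpected minimum"
  print *, "minimum =", xmin
contains
  pure integer function some_function(i)   ! placeholder computation
    integer, intent(in) :: i
    some_function = (i - 400)**2 + 3       ! smallest value is 3, at i = 400
  end function some_function
end program min_reduction
```

With the reduction clause there is no need for the explicit CRITICAL section or the Xcopy variable.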

Jim Dempsey

Ajay_N_
Beginner

Thank you so much! Until now I did not realize that threadprivate and private are different; I thought they were the same. I will experiment with this a bit more.

Thank you again.
Ajay
