- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The purpose of this example is to find out how a program behaves when both the upper layer (Python) and the lower layer (Fortran) use their threading capabilities.
The Python3 caller script launches 2 concurrent instances of the same Fortran routine using the Thread class.
The Fortran callee is OpenMP parallel with OMP_NUM_THREADS set to 4.
Looking at the printout it looks like the 2 concurrent OpenMP instances don't interfere with each other, both have their threads numbered from 0 to 3. This is nice as it allows to put a level of parallelism on top of OpenMP. But the question is: Is such nested parallelism a supported feature or is this just an accidental observation ?
-------------------- Output printed --------------------
>python3 pydot.py Thread 2 Thread 3 Thread 0 Thread 1 Thread 0 Thread 1 1000.0 Thread 2 Thread 3 500.0
-------------------- Fortran code ----------------------
! Compile to shared library doing
! ifort -g -fPIC -shared -openmp -o ddot.so ddot.f90
real(8) function ddot(n,a,b) bind(c)
use, intrinsic :: iso_c_binding
use omp_lib
integer(4), value, intent(in) :: n
real(8), dimension(1:n), intent(in) :: a
real(8), dimension(1:n), intent(in) :: b
integer :: i
ddot = 0.0
!$OMP PARALLEL
print *,'Thread', omp_get_thread_num()
!$OMP DO REDUCTION(+:ddot)
do i = 1, n
ddot = ddot + a(i) * b(i)
end do
!$OMP END PARALLEL
end function
------------------ Python code ----------------------
from ctypes import *
from threading import Thread
# Load shared library
flib = cdll.LoadLibrary('./ddot.so')
# Prototype for external function
veclen = 1000
flib.ddot.argtypes = [c_int, ARRAY(c_double, veclen), ARRAY(c_double, veclen)]
flib.ddot.restype = c_double
# Array construction
rvect = c_double * veclen
a = rvect()
b = rvect()
c = rvect()
for i in range(0,veclen):
a = 0.5
b = 2.0
c = 1.0
# Python wrapper
def pydot(a,b):
pydot.argtypes = [ARRAY(c_double, veclen), ARRAY(c_double, veclen)]
res = flib.ddot(c_int(veclen), a, b)
print(res)
# Do it using 2 threads
# Create
t1 = Thread(target=pydot, args=(a,b,))
t2 = Thread(target=pydot, args=(a,c,))
# Launch
t1.start()
t2.start()
# Wait for completion
t1.join()
t2.join()
# End
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
A major misconception of OpenMP programers is:
omp_get_thread_num()
returns an application-wide thread number. This is incorrect.
Instead, it returns the team member number. These numbers are the same when nested parallelism is NOT used.
In your example, each Python thread instantiates its own base-level OpenMP thread pool (no different than an OpenMP thread instantiating a nested thread pool). This is by design. Note, there may be a means whereby you can specify the behavior of having a single OpenMP thread pool (may be more useful for OpenMP Task constructs).
Jim Dempsey
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page