Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

How to set a variable of a structure to be "threadprivate"

Zhanghong_T_
Novice

Dear all,

I need to make some variables of a structure "threadprivate" in a program that uses OpenMP, like the following:

module omp
integer::ithread
type mydata
integer::data1
!$omp THREADPRIVATE (data1)
end type
!$omp THREADPRIVATE (ithread)
end module

However, when I compile a program that uses the module "omp", the following error is displayed:

error #6590: This statement is not permitted as a statement within a derived-type-def

Could anyone tell me how to change the code to achieve this?

Thanks,

Zhanghong Tang

10 Replies
jimdempseyatthecove
Honored Contributor III

Do you require the thread-private data to be
one instance for all objects of the type
or
one instance for each object of the type?

Is there a fixed maximum number of threads (2, 4, 8, 16, 32, 64, ...)?
What are the size and number of the private data items?

The design will depend on the answers to the above questions.

Jim Dempsey

Zhanghong_T_
Novice

Dear Jim,

Thank you very much for your kind reply.

1) The maximum number of threads is not fixed (currently it is 4), but I can set a large upper bound, for example 999.

2) I want only some of the variables inside the structure to be "threadprivate". The others are very large allocatable arrays, and if they were all "threadprivate" they would be allocated for every thread, which would cost a large amount of memory. In addition, these arrays are calculated beforehand and will not be changed in the parallel regions.

3) I want these variables to be common to ALL subroutines that use the module (since the type is defined inside the module), but "threadprivate" within those subroutines.

Thanks,

Zhanghong Tang

jimdempseyatthecove
Honored Contributor III

There are a couple of what we call "gotchas" (as in "a bug got you") that you have to program around so that you do not see success on your 4-thread system and failure on your 999-thread system.

One of the first things to know is that the OpenMP library function OMP_GET_THREAD_NUM returns a 0-based thread number within the current parallel region. This means that if your program has, or will later have, nested parallel regions, multiple threads will receive the same number (0, 1, 2, ...) from OMP_GET_THREAD_NUM. For example, if you have parallel sections, and within a section you open another parallel region (parallel sections, parallel do, parallel whatever), and nested parallel regions are enabled, then each inner team has its own thread 0, 1, 2, ...

Therefore, if you want a global 0-based thread number, you have to write your own routine that a thread can call to obtain a unique global thread number. Use something along the lines of:

[cpp]module mod_ThreadContext

  type TypeThreadContext
    SEQUENCE
    ! ... define your thread private context
  end type TypeThreadContext

  integer :: ThreadId

  type(TypeThreadContext) :: ThreadContext
  ! bundle into named common
  COMMON /CONTEXT/ ThreadId, ThreadContext
  DATA ThreadId/-1/
  !$OMP THREADPRIVATE(/CONTEXT/)

  integer :: LastThreadId = 0 ! 1-based

  integer function GetThreadId()
    GetThreadId = ThreadId
    if(ThreadId .gt. 0) return
    ! first call
    ThreadId = InterlockedIncrement(LastThreadId)
    GetThreadId = ThreadId
  end function GetThreadId
[/cpp]

The above code needs cleaning up (add contains, fill out ThreadPrivate, etc...)
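For readers who want something they can compile, here is one possible cleaned-up sketch of the same idea. It is not the exact code above: it drops the COMMON block (a SAVEd module variable can be named in THREADPRIVATE directly), uses 0 as the "not yet assigned" marker, and swaps the Windows-specific InterlockedIncrement for an OpenMP `atomic capture` (which assumes an OpenMP 3.1 or later compiler). The program `demo_thread_id` and the local name `id` are only there for illustration.

```fortran
module mod_ThreadContext
  implicit none
  private
  public :: GetThreadId

  ! Per-thread ID; 0 means "not assigned yet" (assumption of this sketch).
  integer, save :: ThreadId = 0
  !$omp threadprivate(ThreadId)

  ! Shared count of IDs handed out so far, so the IDs are 1-based.
  integer, save :: LastThreadId = 0

contains

  integer function GetThreadId()
    integer :: id
    if (ThreadId == 0) then
      ! First call on this thread: atomically take the next ID.
      !$omp atomic capture
      LastThreadId = LastThreadId + 1
      id = LastThreadId
      !$omp end atomic
      ThreadId = id
    end if
    GetThreadId = ThreadId
  end function GetThreadId

end module mod_ThreadContext

program demo_thread_id
  use mod_ThreadContext
  implicit none

  ! Each thread prints the ID it was given; the order is not deterministic.
  !$omp parallel
  print *, 'global thread id =', GetThreadId()
  !$omp end parallel
end program demo_thread_id
```

Build with OpenMP enabled (for example `-qopenmp` with Intel Fortran or `-fopenmp` with gfortran). On an older compiler without `atomic capture`, a `!$omp critical` section around the increment should serve the same purpose.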

The next issue is: once operating system threads are created, are they ever deleted before program termination?

If all your threads are OpenMP threads, then the answer is no (an OpenMP thread, once created, runs for the duration of the application).

If you create threads by other means but never delete them until the end of the program, then the answer is also no (your extra threads, once created, run for the duration of the application).

If you explicitly delete non-OpenMP threads, then the answer is yes.

Now then, if threads are permitted to be deleted, you have no control over the total number of threads that will be created, and therefore the technique of handing out ever-increasing thread ID numbers is not suitable.

If you do know the maximum number of threads, or if you can specify a maximum (and insert a test for it), you can index the per-thread data by thread ID:

integer, parameter :: MAX_THREADS = 1024 ! You may want to choose a smaller number
...
type YourType
  ! global declarations
  ...
  ! ThreadId indexed declarations
  integer :: Foo(MAX_THREADS)
end type YourType
...
type(YourType) :: Object
...
MyThreadId = GetThreadId() ! get local copy of ThreadId
...
MyFoo = Object%Foo(MyThreadId)

The above is just one way of having thread-private variables within a structure.

If the maximum number of threads is unknown at compile time, or if you do not wish to waste memory, then at program startup determine the maximum number of threads that you will use, and make the thread-private portion of your structure an allocatable array with one element per thread:

[cpp]type YourType
  ! global declarations
  ...
  ! ThreadId indexed declarations
  integer, allocatable :: Foo(:) ! Must allocate to thread max
end type YourType
[/cpp]
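A minimal sketch of how the allocatable variant might be used is shown below. It assumes there is no nested parallelism, so `omp_get_thread_num() + 1` is unique and can stand in for GetThreadId(); the component `Big` and the program name are made up here to represent the large shared data.

```fortran
program per_thread_slots
  use omp_lib
  implicit none

  type YourType
    ! global (shared) declarations; Big is a hypothetical large array
    real, allocatable    :: Big(:)
    ! ThreadId indexed declarations: one slot per thread
    integer, allocatable :: Foo(:)
  end type YourType

  type(YourType) :: Object
  integer :: me

  ! The shared part is allocated and filled once, before any parallel region.
  allocate(Object%Big(1000))
  Object%Big = 1.0

  ! The per-thread part is sized at startup to the maximum team size.
  allocate(Object%Foo(omp_get_max_threads()))
  Object%Foo = 0

  !$omp parallel private(me)
  ! Assumption: no nested regions, so this 1-based index is unique.
  me = omp_get_thread_num() + 1
  Object%Foo(me) = me * 10
  !$omp end parallel

  print *, Object%Foo
end program per_thread_slots
```

Only `Foo` grows with the thread count; `Big` exists once and is read by all threads. Neighbouring `Foo` slots can share a cache line, so if threads update their slots in a tight loop it may be worth padding each slot.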

Jim Dempsey

jimdempseyatthecove
Honored Contributor III

The code and comments in my previous post are contradictory about whether ThreadId is 0-based or 1-based. Please correct this in your own version.

Also note that an array descriptor is fairly large. A small array of integers may be smaller than the array descriptor.

Jim Dempsey

Steven_L_Intel1
Employee

There is a terminology problem here. In your example, data1 is not a variable, it is a component of a derived type. You cannot give variable attributes to a component. You can use a THREADPRIVATE directive to name a variable of type mydata, but not individual components of that variable.

Zhanghong_T_
Novice

Dear Jim,

Thank you very much for your kind reply. In your code, ThreadContext is still a whole structure, not a variable inside a structure.

I want the common, unchanging part of the structure (which uses a large amount of memory) to stay shared, and only the other part to be private.

Thanks,

Zhanghong Tang

Zhanghong_T_
Novice

So it seems there is no better way to make only some of the variables in the structure private?

Thanks,

Zhanghong Tang

jimdempseyatthecove
Honored Contributor III

Tang,

"Better way" depends on the architecture of your application.

The technique in my previous post gives one set of thread context variables per object, where the number of objects can vary over the duration of the application.

When the number of objects is fixed, or objects are allocated in fixed-size batches, and you use an allocated array of these objects (or an array of pointers to them), then the "better way" might be to split the object in two: the shared part is allocated through an array descriptor held in the shared address space, and the private part is allocated by each thread through a descriptor held in thread-private storage (e.g. SharedArrayOfFoo(:) and PrivateArrayOfFoo(:)).

Now then, if the variables held in the private part are used by each thread as one copy for all objects of that type, conceptually like a thread-private static member variable, then you would create a PrivateStaticOfFoo structure in thread-private storage to hold the per-thread static part of the Foo objects.

Also, note that you can create a likeness of a member function:

varnameFoo(aFoo)

Used in place of aFoo%varname
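As a rough sketch of the "one private copy per thread for all objects" case, the split plus the accessor could look something like this. The type names FooShared and FooPrivate, the component `table`, and the demo program are invented for illustration; only SharedArrayOfFoo, PrivateStaticOfFoo and varnameFoo come from the description above.

```fortran
module mod_Foo
  implicit none

  ! Shared part: large, computed once, only read inside parallel regions.
  type FooShared
    real, allocatable :: table(:)   ! hypothetical large component
  end type FooShared

  ! Private part: small per-thread state.
  type FooPrivate
    integer :: varname = 0
  end type FooPrivate

  type(FooShared), allocatable, save :: SharedArrayOfFoo(:)

  type(FooPrivate), save :: PrivateStaticOfFoo
  !$omp threadprivate(PrivateStaticOfFoo)

contains

  ! Accessor used in place of aFoo%varname. In this "static" variant the
  ! argument is unused, because every object shares the thread's one copy.
  integer function varnameFoo(aFoo)
    type(FooShared), intent(in) :: aFoo
    varnameFoo = PrivateStaticOfFoo%varname
  end function varnameFoo

end module mod_Foo

program demo_split
  use mod_Foo
  use omp_lib
  implicit none

  allocate(SharedArrayOfFoo(2))

  !$omp parallel
  ! Each thread writes its own copy, then reads it back via the accessor.
  PrivateStaticOfFoo%varname = omp_get_thread_num()
  print *, 'varname seen by this thread =', varnameFoo(SharedArrayOfFoo(1))
  !$omp end parallel
end program demo_split
```

Callers only ever write `varnameFoo(aFoo)`, so the private data could later move (for instance to a per-object, per-thread array) without touching them.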

Jim Dempsey

Zhanghong_T_
Novice

Dear Jim,

Thanks for your kind reply. In the end I chose to separate the private variables from the old structure and put them in a new structure. Now I have a problem:

How can I 'broadcast' the private variables calculated by one thread to ALL other threads?

Thanks,

Zhanghong Tang

The following is my code:

module omp
  type test
    integer::i
  end type
  type(test)::mytest
  !$omp threadprivate(mytest)
end module

subroutine output()
  use omp
  use omp_lib
  implicit none
  mytest%i=100
  write(*,*)'thread = ',omp_get_thread_num(),'mytest%i=',mytest%i
  !$omp parallel
  write(*,*)'thread = ',omp_get_thread_num(),'mytest%i=',mytest%i
  !$omp end parallel

end

call output
end

jimdempseyatthecove
Honored Contributor III

Encapsulate the thread-private data you wish to broadcast into an object. Create mailbox slots in global scope, one for each thread. Then write a pointer to the broadcast object into each mailbox. On the receiving side, when a thread observes a non-NULL pointer in its mailbox, it uses it to refresh its data, then NULLIFYs the pointer as an acknowledgement.

If you are only passing a state or status (e.g. an integer or a real), then write the integer or real itself into the mailbox (assuming it is set up that way).

Note that this is not an event-based method, so your reader threads will have to do some polling (or return to a higher level that does the polling). If you do not want to acknowledge by nulling out the pointer, then consider adding an acknowledgement sequence number for each thread in shared storage. This way you have one broadcast mailbox slot and as many acknowledgement counters as there are threads.

If you wish to use event-based scheduling, then consider using TBB (Threading Building Blocks) or other event-based communication techniques. Check MSDN on Microsoft.com.
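If all that is needed is for every thread to receive the value at a well-defined point (rather than asynchronously), the standard OpenMP COPYIN and COPYPRIVATE clauses may be enough. This is a different technique from the mailbox described above, shown here as a sketch based on the posted test code; the module is renamed `omp_data` for this example only, and the value 200 is arbitrary.

```fortran
module omp_data
  implicit none
  type test
    integer :: i
  end type test
  type(test), save :: mytest
  !$omp threadprivate(mytest)
end module omp_data

program broadcast_demo
  use omp_data
  use omp_lib
  implicit none

  ! Case 1: the value is set before the parallel region.
  ! COPYIN copies the initial thread's mytest into every thread's copy
  ! on entry to the region.
  mytest%i = 100
  !$omp parallel copyin(mytest)
  write(*,*) 'copyin:      thread = ', omp_get_thread_num(), ' mytest%i = ', mytest%i
  !$omp end parallel

  ! Case 2: one thread computes the value inside the region.
  ! COPYPRIVATE on END SINGLE broadcasts that thread's mytest to the
  ! copies held by the other threads in the team.
  !$omp parallel
  !$omp single
  mytest%i = 200
  !$omp end single copyprivate(mytest)
  write(*,*) 'copyprivate: thread = ', omp_get_thread_num(), ' mytest%i = ', mytest%i
  !$omp end parallel
end program broadcast_demo
```

Without `copyin`, as in the posted code, only the initial thread's copy holds the assigned value inside the region; the other threads' copies are separate and were never assigned.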

Jim Dempsey
