Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Help with kmp_set_affinity

pollgreen
Beginner
1,358 Views

Hi,

I am attempting to use the OpenMP kmp_set_affinity function to place certain OpenMP parallel sections on specific processors, and I am getting strange results. I use tuna to monitor my system's processor usage. I have a two-chip quad-core system (8 processors total). Attached is the simple example code I am running, which I took from the examples in the OpenMP document. That code has 3 parallel sections that each run for a fixed amount of time and then exit, and it appears to run as I expect it to.

I then attempted to add some kmp_* calls to place the different parallel sections on different processors. When I do that, the code does not run properly; in fact, it hangs forever and I must kill it to stop it. I commented out the code that did not work, but it is basically this in all 3 sections:

kmp_create_affinity_mask(maskptr);
kmp_set_affinity_mask_proc(3, maskptr);
if (kmp_set_affinity(maskptr) != 0) {
    cout << "ERROR: Section #1." << endl;
}

I must obviously be doing something wrong. Any help would be appreciated.

pbkenned1
Employee

I think you should try creating a global mask at the beginning of the parallel region, before any of the sections are encountered. In other words, the mask should be shared and fully initialized before any worksharing construct is encountered. I doubt creating a private mask for each thread will work.
Patrick Kennedy

Intel Developer Products

pollgreen
Beginner

Thanks. I created 3 separate mask objects instead of using the sharing concept, and that seemed to work fine. I thought the whole point of private in the section was for each section to have its own variable of that type. I guess I must not understand what PRIVATE is for in this context. Can you explain that to me, or point me to where I can read up on it?

Thanks, Rick

Brian_B_Intel1
Employee

Either replace all occurrences of maskptr with &mask, or else place the statement

maskptr = &mask

at the start of every parallel section. As the example is written, the private clause only reserves space for maskptr in each thread at the parallel region; that private copy is not initialized, so it does not point to a valid mask. Even if the "firstprivate" clause were specified in lieu of "private", maskptr would point to the original (master thread's) copy of mask in all threads.
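The firstprivate point can be checked with a minimal sketch. It uses a plain int as a hypothetical stand-in for kmp_affinity_mask_t, and the function name is made up for illustration, so it builds with any OpenMP compiler (and also serially when OpenMP is disabled):

```cpp
// Hypothetical stand-in for kmp_affinity_mask_t; the real type is opaque.
typedef int fake_mask_t;

// Counts the threads whose firstprivate copy of the pointer does NOT target
// the single object owned by the thread that entered the parallel region.
// A result of 0 means every thread aliases that one object.
int threads_not_aliasing_original() {
    fake_mask_t mask = 0;
    fake_mask_t *maskptr = &mask;         // set up before the region
    fake_mask_t *const original = &mask;  // shared, read-only in the region
    int mismatches = 0;

    #pragma omp parallel firstprivate(maskptr) reduction(+ : mismatches)
    {
        // firstprivate copies the pointer value, not the object it points to.
        if (maskptr != original) ++mismatches;
    }
    return mismatches;
}
```

Every thread gets its own pointer variable, but all of those pointers hold the same address, so the threads would still be operating on one shared mask.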

I will also note that calling kmp_create_affinity_mask before the parallel section and then relying on a firstprivate clause to copy the value to all threads will not work. OpenMP objects (omp_lock_t, omp_nest_lock_t, kmp_affinity_mask_t) can be implemented as a deep nest of data objects connected via pointers (and are in the Intel OpenMP implementation), so the shallow copy that firstprivate provides will not work.
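To illustrate why a shallow copy is not enough, here is a small sketch with a hypothetical handle type. The struct and the fake_* helpers are invented for this example and are not the real layout or API of kmp_affinity_mask_t; they only mimic an object whose handle is a pointer to separately allocated storage:

```cpp
#include <cstdlib>

// Hypothetical handle: the visible object is just a pointer to storage
// allocated elsewhere. NOT the real kmp_affinity_mask_t layout.
struct fake_mask_t {
    unsigned long *bits;
};

void fake_create(fake_mask_t *m) {
    m->bits = static_cast<unsigned long *>(std::calloc(1, sizeof(unsigned long)));
}
void fake_set_proc(int proc, fake_mask_t *m) { *m->bits |= 1UL << proc; }
void fake_destroy(fake_mask_t *m) { std::free(m->bits); m->bits = nullptr; }

// Copies the handle member-wise (as a shallow copy would), modifies the copy,
// and returns the bits seen through the ORIGINAL handle afterwards.
unsigned long bits_after_modifying_shallow_copy() {
    fake_mask_t original;
    fake_create(&original);
    fake_mask_t copy = original;  // shallow copy: both handles share storage
    fake_set_proc(3, &copy);      // change made through the copy...
    unsigned long seen = *original.bits;  // ...is visible through the original
    fake_destroy(&original);
    return seen;
}
```

With a handle like this, per-thread copies would all update the same underlying storage, which is why each thread needs to create its own object.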

-bb

Lev_N_Intel
Employee

Hi pollgreen,

1. Your source is not complete; compilation fails with:

openmp-example.cpp(3): catastrophic error: could not open source file "mydummy.h"
#include "mydummy.h"

2. "Some initialization" looks buggy:

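/* Some initializations */
for (i = 0; i < N; i++)
a[i] = i * 1.5;
b[i] = i + 22.35;

I bet the b array is mostly uninitialized: without braces, only the a[i] assignment is inside the loop, so b[i] is assigned just once, after the loop, when i == N (and that index may actually be outside the array).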

3. The value of a private variable is undefined on entry to a parallel region, so

kmp_create_affinity_mask(maskptr);

will cause undefined behaviour (because maskptr is uninitialized).
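For point 2 above, a braced version of the initialization loop might look like the sketch below. The use of std::vector and the function name init_arrays are assumptions made for this example; the original code's array types and the value of N are not shown in the thread:

```cpp
#include <cstddef>
#include <vector>

// Fills a[i] = i * 1.5 and b[i] = i + 22.35 for every valid index.
// The braces make both assignments part of the loop body.
void init_arrays(std::vector<double> &a, std::vector<double> &b) {
    for (std::size_t i = 0; i < a.size() && i < b.size(); i++) {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }
}
```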

Hints:

1. When asking for help, provide a minimal but complete example. (Your example is not complete and is far from minimal.)

2. Instead of specifying variables as "private" use local variables, e.g.:

#pragma omp section
{
    printf( "Section #2 executed by thread %d.\n", omp_get_thread_num() );
    kmp_affinity_mask_t mask;
    kmp_create_affinity_mask( &mask );
    kmp_set_affinity_mask_proc( 2, &mask );
    if ( kmp_set_affinity( &mask ) != 0 ) {
        ...

3. printf() works better than << for debug output in parallel programs. C++ stream output is very "fragile" and likely to be intermixed with the output of other threads.
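Hints 2 and 3 can be combined in a small, complete sketch. The type fake_mask_t and the function run_sections are hypothetical stand-ins (a plain bit mask instead of the kmp_* calls), so the example compiles without the Intel-specific API; it only shows the structure:

```cpp
#include <cstdio>

// Hypothetical stand-in for kmp_affinity_mask_t: a plain bit mask.
typedef unsigned long fake_mask_t;

// Runs three sections. Each section declares its own local mask, sets one
// bit in it, reports with a single printf call, and stores it in out[].
void run_sections(fake_mask_t out[3]) {
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            fake_mask_t mask = 0;  // local to the block, so private by construction
            mask |= 1UL << 1;
            std::printf("Section #1 built mask 0x%lx.\n", mask);
            out[0] = mask;
        }
        #pragma omp section
        {
            fake_mask_t mask = 0;
            mask |= 1UL << 2;
            std::printf("Section #2 built mask 0x%lx.\n", mask);
            out[1] = mask;
        }
        #pragma omp section
        {
            fake_mask_t mask = 0;
            mask |= 1UL << 3;
            std::printf("Section #3 built mask 0x%lx.\n", mask);
            out[2] = mask;
        }
    }
}
```

Because each mask is declared and initialized inside its own section, no private clause is needed, and each message is produced by one printf call, which tends to keep a line in one piece.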
