Issue : Core assignment and parallel processing don't work when using the Intel OpenMP library (i.e., libiomp5)
Compiler : GCC
OS : RHEL 6.4, 6.6, 7.7, and 8.4 with Intel oneAPI 2022 installed
==============================================================================
Hello.
I am having trouble with core assignment and parallel processing in my code: they do not work properly when I use the Intel OpenMP library, i.e., libiomp5.
I need to use the Intel MKL libraries, specifically libmkl_intel_thread.a and libmkl_core.a, so linking against libiomp5.so is essential.
In addition, I need to use #pragma omp parallel together with core assignment via CPU_SET and pthread_attr_setaffinity_np.
When I ran my test code linked against libiomp5, the per-core load (as reported by the top command in the shell) looked very strange.
My test code creates two threads: the first is assigned to cores #0, #2, and #4, and the second to cores #1, #3, and #5. If everything worked normally, cores #0 to #5 would all be at 100% load, because both threads run an infinite loop without sleeping. However, only cores #0, #2, and #4 are at 100% load.
Strangely, this does not mean that the thread assigned to cores #1, #3, and #5 was not created. That thread exists, but it shows no visible CPU load.
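One way to check where the OpenMP workers of each thread actually end up is to have every worker print its own affinity mask and the CPU it is currently running on. The following is only a minimal sketch, assuming Linux with glibc (sched_getaffinity, sched_getcpu). The helper names format_cpu_list and report_worker_placement are made up for illustration and are not part of the test code.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: write the CPUs contained in `set` into `buf` as a
   comma-separated list such as "1,3,5". Returns the number of CPUs written. */
int format_cpu_list(const cpu_set_t *set, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;

    if (len > 0)
        buf[0] = '\0';
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, set))
            continue;
        int n = snprintf(buf + used, len - used, "%s%d", count ? "," : "", cpu);
        if (n < 0 || (size_t)n >= len - used) {
            if (used < len)
                buf[used] = '\0';   /* buffer full: drop the partial entry */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}

/* Hypothetical helper: open a parallel region and let each worker print its
   affinity mask and current CPU. Without -fopenmp the pragmas are ignored
   and only the calling thread reports. */
void report_worker_placement(const char *label)
{
    #pragma omp parallel
    {
        cpu_set_t set;
        char cpus[256] = "?";
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        CPU_ZERO(&set);
        /* pid 0 means "the calling thread" */
        if (sched_getaffinity(0, sizeof(set), &set) == 0)
            format_cpu_list(&set, cpus, sizeof(cpus));
        #pragma omp critical
        printf("%s: omp thread %d allowed on [%s], now on CPU %d\n",
               label, tid, cpus, sched_getcpu());
    }
}
```

Calling report_worker_placement("thread 2") at the top of DoWork2, for example, should show whether the workers created by the runtime kept the 1,3,5 mask of the pthread that started them.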
Another interesting point is that all cores work as expected when I replace iomp5 with gomp. The two results below show the core load when I use iomp5 and gomp, respectively.
iomp result : only cores #0, #2, and #4 are at 100% load, although both threads are running.
compile command : gcc Test_iomp5.c -o Test_iomp5.out -D_GNU_SOURCE -fopenmp -ldl -liomp5 -lpthread -L/opt/intel/oneapi/compiler/2022.0.0/linux/compiler/lib/intel64
gomp result : cores #0 to #5 are all at 100% load.
compile command : gcc Test_iomp5.c -o Test_iomp5.out -D_GNU_SOURCE -fopenmp -ldl -lgomp -lpthread -L/opt/intel/oneapi/compiler/2022.0.0/linux/compiler/lib/intel64
Given this, what should I do to fix the problem?
In particular, I found that linking both gomp and iomp5, as in the compile command below, works around the problem, but I am worried about compatibility between gomp and iomp5 when both are used.
compile command : gcc Test_iomp5.c -o Test_iomp5.out -D_GNU_SOURCE -fopenmp -ldl -lgomp -liomp5 -lpthread -L/opt/intel/oneapi/compiler/2022.0.0/linux/compiler/lib/intel64
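When both libraries appear on the link line, it may help to confirm which OpenMP runtime(s) are really mapped into the running process. The sketch below scans /proc/self/maps for the library names; it assumes Linux and dynamic linking, and the helper names (maps_line_mentions, runtime_is_loaded, report_openmp_runtimes) are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does one line of /proc/<pid>/maps mention `lib`? */
int maps_line_mentions(const char *line, const char *lib)
{
    return strstr(line, lib) != NULL;
}

/* Hypothetical helper: returns 1 if a file mapped into this process has
   `lib` in its path, 0 if not, -1 if /proc/self/maps cannot be read. */
int runtime_is_loaded(const char *lib)
{
    char line[4096];
    int found = 0;
    FILE *fp = fopen("/proc/self/maps", "r");

    if (fp == NULL)
        return -1;
    while (!found && fgets(line, sizeof(line), fp) != NULL)
        found = maps_line_mentions(line, lib);
    fclose(fp);
    return found;
}

/* Print which OpenMP runtimes are present in the current process. */
void report_openmp_runtimes(void)
{
    printf("libiomp5 loaded: %d\n", runtime_is_loaded("libiomp5"));
    printf("libgomp  loaded: %d\n", runtime_is_loaded("libgomp"));
}
```

Calling report_openmp_runtimes() at the start of main would show whether one or both runtimes are present. It does not tell which of them serves a given parallel region.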
I would like to know:
1) How can I use #pragma omp parallel in multiple threads when linking against iomp5?
2) If I have to use both gomp and iomp5, are there any compatibility problems between them?
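Related to question 1, one thing that could be tried is to have each OpenMP worker pin itself explicitly inside the parallel region instead of relying on the mask inherited from the pthread. This is only a sketch under the assumption of Linux and glibc; it has not been verified against libiomp5 here, and pin_calling_thread and busy_work_pinned are hypothetical names.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: restrict the calling thread to the CPUs listed in
   `cpus`. Returns 0 on success, -1 on failure. */
int pin_calling_thread(const int *cpus, size_t count)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    for (size_t i = 0; i < count; i++) {
        if (cpus[i] < 0 || cpus[i] >= CPU_SETSIZE)
            return -1;
        CPU_SET(cpus[i], &set);
    }
    /* pid 0 = the calling thread only, not the whole process */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Hypothetical variant of the busy loop: each OpenMP worker pins itself
   before doing any work. */
void busy_work_pinned(const int *cpus, size_t count, long iterations)
{
    #pragma omp parallel
    {
        volatile long sink = 0;   /* volatile keeps the loop from being removed */

        (void) pin_calling_thread(cpus, count);
        for (long i = 0; i < iterations; i++)
            sink += i;
    }
}
```

For the second thread of the test code this would be called with an array holding 1, 3, and 5. Each worker runs the full iteration count, so the function is meant for generating load, not for splitting work.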
Please help me resolve these issues. Thank you.
If you have any questions or would like me to run further tests, please do not hesitate to reply.
============= My Test Code : Test_iomp5.c ============
#include <stdio.h>
#include <pthread.h>
#include <sched.h>
#include <omp.h>
#define MAX_THREAD_NUM 2
#define MAX_OMP_NUM 6
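/* Worker for thread 1: busy loop intended to load cores 0, 2 and 4. */
void *DoWork1(void *args)
{
    long lLoop;

    (void) args;
    /* Resetting lLoop in the body is meant to keep the loop spinning forever.
       Note that modifying the loop variable inside an OpenMP loop is
       non-conforming. */
    #pragma omp parallel for private(lLoop) num_threads(MAX_OMP_NUM)
    for(lLoop = 0 ; lLoop < 10000000000 ; lLoop++)
    {
        lLoop = 0;
    }
    printf("Thread 1 Done @ core 0 & 2 & 4\n");
    return NULL;
}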
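/* Worker for thread 2: busy loop intended to load cores 1, 3 and 5. */
void *DoWork2(void *args)
{
    long lLoop;

    (void) args;
    /* Resetting lLoop in the body is meant to keep the loop spinning forever.
       Note that modifying the loop variable inside an OpenMP loop is
       non-conforming. */
    #pragma omp parallel for private(lLoop) num_threads(MAX_OMP_NUM)
    for(lLoop = 0 ; lLoop < 10000000000 ; lLoop++)
    {
        lLoop = 0;
    }
    printf("Thread 2 Done @ core 1 & 3 & 5\n");
    return NULL;
}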
int main(int argc, char *argv[])
{
    int iLoop;
    pthread_t threads[MAX_THREAD_NUM];
    pthread_attr_t attr;

    (void) argc;
    (void) argv;
    (void) pthread_attr_init(&attr);
    for(iLoop = 0 ; iLoop < MAX_THREAD_NUM ; iLoop++)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        if(iLoop == 1)
        {
            /* Second thread: odd cores 1, 3 and 5. */
            CPU_SET(1, &mask);
            CPU_SET(3, &mask);
            CPU_SET(5, &mask);
            (void) pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &mask);
            (void) pthread_create(&threads[iLoop], &attr, DoWork2, (void *)(long)iLoop);
            (void) printf("Thread %d CoreSetMask 0x%lX\n", iLoop+1, (unsigned long)mask.__bits[0]);
        }
        else
        {
            /* First thread: even cores 0, 2 and 4. */
            CPU_SET(0, &mask);
            CPU_SET(2, &mask);
            CPU_SET(4, &mask);
            (void) pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &mask);
            (void) pthread_create(&threads[iLoop], &attr, DoWork1, (void *)(long)iLoop);
            (void) printf("Thread %d CoreSetMask 0x%lX\n", iLoop+1, (unsigned long)mask.__bits[0]);
        }
    }
    /* Wait for both worker threads. */
    for(iLoop = 0 ; iLoop < MAX_THREAD_NUM ; iLoop++)
    {
        (void) pthread_join(threads[iLoop], NULL);
    }
    (void) pthread_attr_destroy(&attr);
    return 0;
}