Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Dynamically switching OpenMP on/off in MKL

fenglai
Beginner

Hello! I am trying to mix MKL and TBB in my project, so I want to switch MKL's multi-threading on and off at will. I use mkl_set_num_threads() to set the number of threads: when I want multi-threading off, I set it to one before calling the actual working function, and when I want multi-threading on, I set it to the maximum number of threads before the working function is called. The code (taking BLAS level 2 as an example) looks like this:

   // set omp to run
   if (withOMP) {
      omp_init(); // set it to maximum number of threads we can find with mkl_set_num_threads()
   }else{
      omp_turnoff(); // set the number of threads to 1
   }   

#ifdef WITH_SINGLE_PRECISION
   sgemv(&symA, &row_A, &col_A, &alpha, A, &ld_A, x, &inc_x, &beta, y, &inc_y);
#else
   dgemv(&symA, &row_A, &col_A, &alpha, A, &ld_A, x, &inc_x, &beta, y, &inc_y);
#endif

However, in my BLAS and LAPACK tests I find that once OpenMP has been turned off, it can never be switched on again. MKL always runs in serial mode, no matter how often omp_init() is called. Multi-threading is observed only if omp_turnoff() is never called.

I am using the multi-threaded MKL 11.1 library (intel64) together with g++ (version 4.7.2, also 64-bit). My link line is:

-L$(MKLROOT)/lib/intel64 -lmkl_intel_lp64 -lmkl_core  -lmkl_intel_thread -liomp5 -ldl -lpthread -lm

Thanks very much!

fenglai 

1 Solution
Ying_H_Intel
Employee

Hi Fenglai,

The problem is in openOMP(). Try changing it to:

void openOMP()
{
   //int np = omp_get_max_threads();
   omp_set_num_threads(2);
}

Once turnoffOMP() has called omp_set_num_threads(1), omp_get_max_threads() returns 1 from then on, because it reports the value most recently set rather than the number of cores. openOMP() therefore calls omp_set_num_threads(1) again, which is the behaviour you saw. If you pass an explicit thread count as above, the threading will work as you wish.
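
A minimal sketch of a variant that avoids hard-coding the thread count: capture the value of omp_get_max_threads() once, before anything lowers it, and restore that saved value when switching back on. The helper default_threads() is a hypothetical name introduced here, and the #else branch is only a stand-in so the sketch still compiles when OpenMP is not enabled. turnoffOMP() and openOMP() follow the names used in this thread.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Stand-ins (assumption) so the sketch compiles without -fopenmp.
static int g_nthreads = 1;
static inline void omp_set_num_threads(int n) { g_nthreads = n; }
static inline int omp_get_max_threads() { return g_nthreads; }
#endif

// Hypothetical helper: returns the thread count OpenMP reported the
// first time this function was called, and the same value afterwards.
int default_threads() {
    static const int n = omp_get_max_threads();
    return n;
}

// Switch to serial mode. The default is captured first so that it is
// not lost once the thread count has been lowered to 1.
void turnoffOMP() {
    default_threads();
    omp_set_num_threads(1);
}

// Switch back to the thread count that was in effect initially.
void openOMP() {
    omp_set_num_threads(default_threads());
}
```

With this arrangement, calling turnoffOMP() followed by openOMP() should bring omp_get_max_threads() back to whatever it reported at the first call, however often the two are alternated.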

I am not sure how you are doing this in MKL, but MKL should allow you to switch its threading on and off dynamically. See, for example, the sample code in the MKL User's Guide.
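
For the mixed TBB/MKL case in the question, where threading should be off only around particular calls, one possible approach is a scoped guard that remembers the current setting before changing it and puts it back when the scope ends. This is only a sketch: ScopedOmpThreads is a hypothetical class name, not part of OpenMP or MKL, and the #else branch is a stand-in so the code compiles without OpenMP enabled.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Stand-ins (assumption) so the sketch compiles without -fopenmp.
static int g_nthreads = 1;
static inline void omp_set_num_threads(int n) { g_nthreads = n; }
static inline int omp_get_max_threads() { return g_nthreads; }
#endif

// Hypothetical RAII guard: sets the OpenMP thread count for the lifetime
// of the object and restores the previous value on destruction.
class ScopedOmpThreads {
public:
    explicit ScopedOmpThreads(int n)
        : previous_(omp_get_max_threads()) {  // read BEFORE changing it
        omp_set_num_threads(n);
    }
    ~ScopedOmpThreads() { omp_set_num_threads(previous_); }

    // Non-copyable: two copies would both try to restore the old value.
    ScopedOmpThreads(const ScopedOmpThreads&) = delete;
    ScopedOmpThreads& operator=(const ScopedOmpThreads&) = delete;

private:
    int previous_;
};
```

A serial BLAS call would then be wrapped as `{ ScopedOmpThreads serial(1); /* call dgemv here */ }`. Because the previous value is read before it is overwritten, the guard does not run into the "stuck at 1" behaviour described above.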

Best Regards,
Ying

P.S. Copied from the MKL User's Guide:

Changing the Number of Threads at Run Time

You cannot change the number of threads during run time using environment variables. However, you can call OpenMP API functions from your program to change the number of threads during run time. The following sample code shows how to change the number of threads during run time using the omp_set_num_threads() routine. See also Techniques to Set the Number of Threads.

The following example shows both C and Fortran code examples. To run this example in the C language, use the omp.h header file from the Intel® compiler package. If you do not have the Intel compiler but wish to explore the functionality in the example, use Fortran API for omp_set_num_threads() rather than the C version. For example, omp_set_num_threads_( &i_one );

// ******* C language *******
#include "omp.h"
#include "mkl.h"
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#define SIZE 1000
int main(int args, char *argv[]){
    double *a, *b, *c;
    a = (double*)malloc(sizeof(double)*SIZE*SIZE);
    b = (double*)malloc(sizeof(double)*SIZE*SIZE);
    c = (double*)malloc(sizeof(double)*SIZE*SIZE);
    double alpha=1, beta=1;
    int m=SIZE, n=SIZE, k=SIZE, lda=SIZE, ldb=SIZE, ldc=SIZE, i=0, j=0;
    char transa='n', transb='n';
    /* Run 1: default number of threads */
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
            b[i*SIZE+j]= (double)(i*j);
            c[i*SIZE+j]= (double)0;
        }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for ( i=0;i<10;i++){
        printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    /* Run 2: switch to a single thread */
    omp_set_num_threads(1);
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
            b[i*SIZE+j]= (double)(i*j);
            c[i*SIZE+j]= (double)0;
        }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for ( i=0;i<10;i++){
        printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    /* Run 3: switch back to two threads */
    omp_set_num_threads(2);
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
            b[i*SIZE+j]= (double)(i*j);
            c[i*SIZE+j]= (double)0;
        }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for ( i=0;i<10;i++){
        printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    free (a);
    free (b);
    free (c);
    return 0;
}
    

fenglai
Beginner

By the way, I also tested the idea purely with openmp itself. The code is like below:

#include <iostream>    
#include <cstdio>    
#include "omp.h"
using namespace std;

void turnoffOMP()
{
   omp_set_num_threads(1);
}

void openOMP()
{
   int np = omp_get_max_threads();
   omp_set_num_threads(np);
}

void testingOMP()
{
#pragma omp parallel 
   {   
      cout << "number of threads " << omp_get_num_threads() << endl; 
   }   
}

int main ()  
{

   // now it's threading part
   openOMP();
   testingOMP();

   //turnoffOMP();
   openOMP();
   testingOMP();

   turnoffOMP();
   testingOMP();

   //turnoffOMP();
   openOMP();  // here is the last call that we want to use openmp
   testingOMP();

   turnoffOMP();
   testingOMP();
   return 0;
}

I can see that the last call to openOMP() does not actually take effect. The output looks like this:

./openmp_test 
number of threads number of threads number of threads number of threads number of threads 16161616number of threads                                                                         
number of threads number of threads 16
number of threads 16

16
number of threads 16

16
number of threads 16
16
number of threads 16
number of threads 16
number of threads 16
number of threads 16
number of threads 16

number of threads number of threads number of threads number of threads number of threads number of threads number of threads 16
number of threads 16
number of threads 16
number of threads 16
16
16
number of threads 16
16
16
number of threads 16
number of threads 16
16
number of threads 16
number of threads 16
16
number of threads 16
number of threads 1
number of threads 1
number of threads 1

So the problem could be due to the design of openmp itself. Do you have any ideas about that?

Thanks in advance,

fenglai

fenglai
Beginner

Hi Ying,

Thanks for your help! My problem is solved.

I replaced omp_get_max_threads() with omp_get_num_procs(). Now I can dynamically switch MKL's multi-threading on and off.
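
For reference, a sketch of what that change might look like, reusing the function names from the test program earlier in this thread. The #else branch is only a stand-in (an assumption) so the code compiles without OpenMP enabled. Note that omp_get_num_procs() reports the processors available to the program and is not affected by omp_set_num_threads(), which is why it works here. It also ignores any OMP_NUM_THREADS setting, so this may request more threads than the environment asked for.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Stand-ins (assumption) so the sketch compiles without -fopenmp.
static int g_nthreads = 1;
static inline void omp_set_num_threads(int n) { g_nthreads = n; }
static inline int omp_get_max_threads() { return g_nthreads; }
static inline int omp_get_num_procs() { return 1; }
#endif

// Serial mode: later parallel regions use a single thread.
void turnoffOMP() {
    omp_set_num_threads(1);
}

// Threaded mode: ask for one thread per available processor. Unlike
// omp_get_max_threads(), omp_get_num_procs() is not changed by an
// earlier omp_set_num_threads(1) call.
void openOMP() {
    omp_set_num_threads(omp_get_num_procs());
}
```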

Thanks again!

best,

fenglai
