
Problem with nested parallelism using OpenMP and MKL fft

Hi folks.

I have a problem with nested parallelism using OpenMP and the MKL FFT.

The part of my code that uses MKL calls within omp parallel region looks like this:

 

/////////////////////////////
omp_set_nested(1);
omp_set_num_threads(m);
mkl_domain_set_num_threads(n, MKL_DOMAIN_FFT);

#pragma omp parallel
{
    // some code
    #pragma omp single
    {
        // call of the fft
    }
}
/////////////////////////////

The problem is that when m > 1, the FFT call runs with only 1 thread, regardless of the value of n.

I will be grateful for any help.

Regards.

 


Hi Mihran

How about setting the MKL_DYNAMIC environment variable to FALSE, or calling mkl_set_dynamic(0)? MKL will then use the suggested number of OpenMP threads whenever the algorithms permit, regardless of OpenMP overhead and data locality.

Basically, the Intel MKL-specific threading controls take precedence over their OpenMP equivalents. But when MKL is called inside an OpenMP parallel region, we recommend using 1 MKL thread for better performance, and that is the reason why MKL runs in sequential mode there by default.

But you can control this as the MKL developer guide describes: https://software.intel.com/en-us/node/528550

If your application uses OpenMP* threading, you may need to provide additional settings:

  • Set the environment variable OMP_NESTED=TRUE, or alternatively call omp_set_nested(1), to enable OpenMP nested parallelism.
  • Set the environment variable MKL_DYNAMIC=FALSE, or alternatively call mkl_set_dynamic(0), to prevent Intel MKL from dynamically reducing the number of OpenMP threads in nested parallel regions.

Best Regards,

Ying

Some nested parallel tips in MKL user guide for your reference:

https://software.intel.com/en-us/node/528546#92D6DAD0-A858-4824-9A90-AC2AD2A9C2E1

You parallelize the program using OpenMP directives and/or pragmas and compile the program using a non-Intel compiler.

To avoid simultaneous activities of multiple threading RTLs, link the program against the Intel MKL threading library that matches the compiler you use (see Linking Examples on how to do this). If this is not possible, use Intel MKL in the sequential mode. To do this, you should link with the appropriate threading library: libmkl_sequential.a or libmkl_sequential.so (see Appendix C: Directory Structure in Detail).

 


Hi Ying,

Thank you very much for your response. It was very informative, and mkl_set_dynamic(0) helped me, so now my code works as I want.

Regards,

Mihran.
