Hi folks.
I have a problem with nested parallelism using OpenMP and the MKL FFT.
The part of my code that calls MKL from within an omp parallel region looks like this:
/////////////////////////////
omp_set_nested(1);
omp_set_num_threads(m);
mkl_domain_set_num_threads(n, MKL_DOMAIN_FFT);
#pragma omp parallel
{
    // some code
    #pragma omp single
    {
        // call of the fft
    }
}
/////////////////////////////
The problem is that when m > 1, the FFT call runs with only 1 thread regardless of the value of n.
I will be grateful for any help.
Regards.
Hi Mihran
How about setting the MKL_DYNAMIC environment variable to FALSE, or calling mkl_set_dynamic(0)? With that setting, MKL uses the suggested number of OpenMP threads whenever the algorithms permit, regardless of OpenMP overhead and data locality.
Basically, the Intel MKL-specific threading controls take precedence over their OpenMP equivalents. But when MKL is called inside a nested OpenMP parallel region, we recommend using 1 MKL thread for better performance, and that is why MKL runs in sequential mode there by default.
You can still control this behavior, as the MKL Developer Guide describes: https://software.intel.com/en-us/node/528550
If your application uses OpenMP* threading, you may need to provide additional settings:
- Set the environment variable OMP_NESTED=TRUE, or alternatively call omp_set_nested(1), to enable OpenMP nested parallelism.
- Set the environment variable MKL_DYNAMIC=FALSE, or alternatively call mkl_set_dynamic(0), to prevent Intel MKL from dynamically reducing the number of OpenMP threads in nested parallel regions.
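Putting the two settings above together with the code from the question, a minimal C sketch could look like the following. It is not taken from the MKL documentation: the function names configure_nested_threads, forward_fft, and fft_inside_single are made up for illustration, and HAVE_MKL is a hypothetical macro you would define yourself when building against MKL (without it the FFT is skipped, so the skeleton still compiles). Note also that omp_set_nested has been deprecated since OpenMP 5.0 in favor of omp_set_max_active_levels.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif
#ifdef HAVE_MKL   /* hypothetical macro: define it when building against MKL */
#include <mkl.h>
#endif

/* Request m outer OpenMP threads and n threads for MKL FFT calls. */
void configure_nested_threads(int m, int n)
{
#ifdef _OPENMP
    omp_set_nested(1);        /* enable nested parallelism */
    omp_set_num_threads(m);   /* size of the outer team */
#else
    (void)m;
#endif
#ifdef HAVE_MKL
    mkl_set_dynamic(0);       /* keep MKL from reducing its thread count in nested regions */
    mkl_domain_set_num_threads(n, MKL_DOMAIN_FFT);
#else
    (void)n;
#endif
}

/* In-place forward FFT of a unit impulse of length len. Returns 0 on success. */
int forward_fft(int len)
{
#ifdef HAVE_MKL
    DFTI_DESCRIPTOR_HANDLE handle = NULL;
    MKL_LONG status;
    MKL_Complex16 *data =
        (MKL_Complex16 *)mkl_calloc((size_t)len, sizeof(MKL_Complex16), 64);
    if (data == NULL)
        return 1;
    data[0].real = 1.0;       /* unit impulse; all other entries are zero */

    status = DftiCreateDescriptor(&handle, DFTI_DOUBLE, DFTI_COMPLEX, 1, (MKL_LONG)len);
    if (status == DFTI_NO_ERROR)
        status = DftiCommitDescriptor(handle);
    if (status == DFTI_NO_ERROR)
        status = DftiComputeForward(handle, data);

    if (handle != NULL)
        DftiFreeDescriptor(&handle);
    mkl_free(data);
    return status == DFTI_NO_ERROR ? 0 : 1;
#else
    (void)len;
    return 0;                 /* built without MKL: nothing to compute */
#endif
}

/* Outer parallel region; exactly one thread issues the FFT call. */
int fft_inside_single(int len)
{
    int result = 0;
    #pragma omp parallel shared(result)
    {
        /* ... work done by every outer thread ... */
        #pragma omp single
        {
            result = forward_fft(len);
        }
    }
    return result;
}
```

Calling configure_nested_threads(m, n) once before fft_inside_single(len) mirrors the order used in the question. Whether the FFT then really runs on n threads still depends on the algorithm permitting it and on the program being linked against a threaded MKL library rather than the sequential one.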
Best Regards,
Ying
Some tips on nested parallelism from the MKL User Guide, for your reference:
https://software.intel.com/en-us/node/528546#92D6DAD0-A858-4824-9A90-AC2AD2A9C2E1
You parallelize the program using OpenMP directives and/or pragmas and compile the program using a non-Intel compiler.
To avoid simultaneous activities of multiple threading RTLs, link the program against the Intel MKL threading library that matches the compiler you use (see Linking Examples on how to do this). If this is not possible, use Intel MKL in the sequential mode. To do this, you should link with the appropriate threading library: libmkl_sequential.a or libmkl_sequential.so (see Appendix C: Directory Structure in Detail).
Hi Ying,
Thank you very much for your response. It was very informative, and mkl_set_dynamic(0) helped me, so now my code works as I want.
Regards,
Mihran.