When I link the latest version of MKL with OpenMP threading into my C++ program and run the program multiple times, MKL sometimes runs in parallel and sometimes does not. Has anyone seen something like this before?
A couple of notes:
- the program uses both TBB (for parallelism in the rest of the code) and OpenMP (for MKL). There are no nested MKL calls
- if TBB is used for both, it works, but PARDISO is only partially parallelized under TBB and is thus significantly slower
- the difference in parallelism is observed between runs of the same binary, without recompiling
- sometimes MKL uses one thread and sometimes the expected number of threads
- the TBB library used is the one provided with MKL
- MKL version is 2018 U1
- the OS is Windows 10
- I have tried using mkl_set_num_threads, omp_set_num_threads, omp_set_dynamic(false) and mkl_set_dynamic(false). I have also tried setting the blocktime. None of this helped
Hopefully someone has an answer to this problem.
Could you please clarify the following:
How do you link this application? I mean, do you link with the mkl_tbb_thread or the mkl_intel_thread library?
What is the typical problem size you solve?
What do you mean by this: >> if TBB is used for both, it works but pardiso is only partially parallelized in TBB and is thus significantly slower
Sorry for not being clear in the original message.
If the program is linked with mkl_tbb_thread, MKL runs in parallel, but PARDISO does not seem to fully support TBB threading yet.
If the program is linked with mkl_intel_thread (in addition to libiomp5.lib), MKL routines seem to sometimes run in parallel and sometimes not.
- This change in behavior can be observed WITHOUT recompiling, which is the confusing part.
It should be noted that in both cases the program is also linked against tbb.lib, since the rest of the program needs it.
The problems are sparse and vary in size from 10K*10K to 1.5M*1.5M.