Intel® oneAPI Math Kernel Library

Different numerical answers when calling mkl_set_num_threads (1)?

Tony_Garratt
Beginner
777 Views

We are using MKL for some scientific computations and have just added a call to mkl_set_num_threads(1) to our product (because we found that running two executables simultaneously on a dual-processor machine was about twice as slow without the call). However, the numbers we are getting are now slightly different from before. So the question is: does setting the number of threads to 1 change the numerics of the LAPACK and/or BLAS routines?

Thanks

Tony

0 Kudos
7 Replies
Shane_S_Intel
Employee

Changing the number of computational threads may change the numerical characteristics; see this paragraph in the MKL user's guide:

If linear algebra routines (LAPACK, BLAS) are applied to inputs that are bit-for-bit identical but the arrays are differently aligned or the computations are performed either on different platforms or with different numbers of threads, the outputs may not be bit-for-bit identical, though they will deviate within the appropriate error bounds. The Intel MKL version may also affect numerical stability of the output, as the routines may be implemented differently in different versions.
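To illustrate why the thread count alone can change the result, here is a minimal sketch in Python (not MKL code, and not how MKL actually partitions work). It assumes a threaded reduction splits the input into contiguous chunks, sums each chunk, then adds the partial sums. Floating-point addition is not associative, so regrouping the same numbers can round differently. The helper name `chunked_sum` and the input values are made up for illustration, and the values are chosen to exaggerate the effect; in real BLAS/LAPACK calls the differences are normally confined to the last few bits.

```python
def chunked_sum(values, num_chunks):
    """Sum `values` by splitting them into contiguous chunks, as a
    threaded reduction might, then adding the per-chunk partial sums."""
    chunk_size = -(-len(values) // num_chunks)  # ceiling division
    partials = []
    for start in range(0, len(values), chunk_size):
        partial = 0.0
        for v in values[start:start + chunk_size]:
            partial += v  # plain left-to-right accumulation
        partials.append(partial)
    total = 0.0
    for p in partials:
        total += p
    return total


# Hypothetical data: 1.0 is far below the rounding granularity of 1e17,
# so whether it survives depends on what it is added to first.
values = [1e17, 1.0, -1e17, 1.0]

one_thread = chunked_sum(values, 1)   # ((1e17 + 1) - 1e17) + 1
two_threads = chunked_sum(values, 2)  # (1e17 + 1) + (-1e17 + 1)

print(one_thread, two_threads)  # prints: 1.0 0.0
```

Neither grouping returns the mathematically exact sum (2.0); both are valid floating-point evaluations of the same expression, which is the sense in which the guide says results "deviate within the appropriate error bounds".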

-Shane

0 Kudos
abhimodak
New Contributor I
I believe part of the question pertains to the slowdown by a factor of two when the thread count is not set to 1. Based on the documentation and previous posts, I can see the potential for slight differences in the numerics when the number of threads, alignment, etc. are not identical. But I am confused about why leaving the thread count unset translates into such a big penalty in computational time.

Abhi
0 Kudos
Tony_Garratt
Beginner

Thank you for that. The documentation really needs to have this paragraph up front in getting started....

0 Kudos
Todd_R_Intel
Employee

Tony,

You mention a slowdown if you do not set MKL to run on only one thread. How many cores are there per processor? How many threads are used in each of the two executables, and are you running two instances of the same program or two different executables?

I ask because Abhi mentions your slowdown, and I can imagine a number of scenarios where MKL creates more threads than you would want because it cannot know what else you have running on the system.

Todd

0 Kudos
abhimodak
New Contributor I
Hi Todd

Let's put it this way:

(1) I have a single-processor, dual-core computer (like most sold these days).
(2) Number of threads setting for MKL is left untouched when I build my executable.
(3) I am running two instances of the same executable but using two Java threads.

Suppose I get time = t2 for each job. (The jobs are identical i.e. same input file and generates same output.)

Now I run only 1 instance of the executable and get time = t1. (i.e. I run only one job.)

What would be the relation between t1 and t2? I would like to see t1 = t2, but I am afraid that Tony is getting t2 approximately 2 times t1. That would make me quite worried. Should I change the number of threads with MKL? I am concerned because the default risks making my own parallelization, and/or having more than one processor, useless.

Maybe I am just hitting the panic button too early...

Abhi
0 Kudos
TimP
Honored Contributor III
If you are parallelizing outside MKL, and using all available cores by starting multiple MKL copies, you probably don't want to permit each MKL copy to start multiple threads.
Assuming you don't want threaded MKL, the safer way is to set up non-threaded MKL (mkl_sequential library in MKL 10).
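If relinking against the sequential library is not convenient, a run-time alternative is to cap MKL at one thread per process through the `MKL_NUM_THREADS` environment variable, which has the same intent as calling mkl_set_num_threads(1). The sketch below, in Python, only builds the environment for a child process. The helper name `single_threaded_mkl_env` and the executable name in the usage note are hypothetical, and it assumes the MKL version in use honours `MKL_NUM_THREADS`.

```python
import os


def single_threaded_mkl_env(base_env=None):
    """Return a copy of the environment with MKL limited to one thread.

    `base_env` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_NUM_THREADS"] = "1"  # assumption: the linked MKL reads this variable
    return env


# Example with a made-up base environment.
env = single_threaded_mkl_env({"PATH": "/usr/bin"})
print(env["MKL_NUM_THREADS"])
```

Each worker could then be started with something like `subprocess.Popen(["./solver"], env=single_threaded_mkl_env())`, where `./solver` stands in for your own MKL-linked executable.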
0 Kudos
Todd_R_Intel
Employee
MKL detects the number of cores on the system and by default will use that many threads (there are some nuances to this and we've updated the process used to determine the right number of threads in Update 3, but this is basically true).

So if you're calling MKL from 2 threads and each MKL call uses 2 threads, you'll have 4 MKL threads competing for two cores. Even though you're doing the same amount of work as in the case where 1 app thread calls MKL to generate 2 threads, the cost of swapping threads in and out will significantly affect your performance.
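The arithmetic behind that can be written out as a small Python sketch. It assumes the simplified model described above (every application thread calls MKL at the same time, and each call spawns a fixed number of MKL threads); the function name `mkl_thread_load` is made up for illustration, and it only counts threads rather than predicting run time.

```python
def mkl_thread_load(app_threads, mkl_threads_per_call, cores):
    """Return (total MKL threads, threads per core) for the simple model
    where every application thread is inside an MKL call at once."""
    total = app_threads * mkl_threads_per_call
    return total, total / cores


# Dual-core machine, MKL defaulting to one thread per core.
nested = mkl_thread_load(app_threads=2, mkl_threads_per_call=2, cores=2)
single = mkl_thread_load(app_threads=1, mkl_threads_per_call=2, cores=2)
# Same two application threads after mkl_set_num_threads(1).
capped = mkl_thread_load(app_threads=2, mkl_threads_per_call=1, cores=2)

print(nested, single, capped)  # prints: (4, 2.0) (2, 1.0) (2, 1.0)
```

A ratio above 1.0 thread per core means the operating system has to time-slice the MKL threads, which is where the swapping cost comes from.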

0 Kudos