I have written a multi-threaded program using pthreads. Each thread calls its own instance of dss_solve_real. I link the code against the following libraries to make sure that MKL runs in sequential mode:
$(MKLROOT)/lib/intel64/libmkl_intel_ilp64.a $(MKLROOT)/lib/intel64/libmkl_sequential.a $(MKLROOT)/lib/intel64/libmkl_core.a -lm -lpthread
Also, I have disabled KMP_AFFINITY using:
The number of threads for MKL is also manually determined in the code using:
I use the following code to set the affinity of each thread. This piece of code is executed at the beginning of each thread's function:
pthread_t curThread = pthread_self();
sched_setaffinity(curThread, sizeof(cpuset), &cpuset);
In this code, threadCPUNum[threadData->numOfCurThread] is the number of the CPU to which the current thread will be bound.
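For reference, here is a minimal sketch of pinning the calling thread on Linux with glibc. The helper name pin_current_thread_to_cpu is hypothetical and not part of the original code. One thing worth checking in the snippet above: sched_setaffinity expects a pid_t (a kernel thread ID, or 0 for the calling thread), not a pthread_t, so passing pthread_self() to it will likely fail with an error rather than pin anything. pthread_setaffinity_np is the call that accepts a pthread_t. The sketch also returns the error code, since an ignored failure would look exactly like "the affinity is not respected".

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper (not from the original post): pin the calling
 * thread to a single CPU. Returns 0 on success, or an errno-style
 * error number on failure, so the caller can see when pinning fails. */
static int pin_current_thread_to_cpu(int cpu)
{
    cpu_set_t cpuset;

    CPU_ZERO(&cpuset);      /* start from an empty CPU mask */
    CPU_SET(cpu, &cpuset);  /* allow only the requested CPU */

    /* pthread_setaffinity_np takes a pthread_t. The equivalent call
     * with sched_setaffinity would pass 0 (the calling thread) as the
     * first argument instead of a pthread_t. */
    return pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
}
```

In the thread function this would be called as pin_current_thread_to_cpu(threadCPUNum[threadData->numOfCurThread]), with the return value checked before the solver is invoked.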
To check whether MKL respects my CPU affinity settings, I initially bound all the threads to CPU0 by setting every element of the threadCPUNum array to zero. However, monitoring CPU utilization reveals that MKL ignores sched_setaffinity and runs on other processors as well.
I would like to know what I am missing here and how I can force the MKL function (dss_solve_real) to stay on a specific CPU.
Thanks in advance for your help.
As Sergey suggested, what happens if you run a plain for-loop in each thread instead of the MKL function?
MKL is threaded internally with OpenMP. So if you link the threaded MKL library, it is possible that MKL's OpenMP threads are not constrained by your pthread affinity. But you mentioned that you are using the sequential MKL library, so this needs to be verified.
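A for-loop test of that kind could look roughly like the sketch below. It is only an illustration under assumptions: the names busy_worker, run_pinned_loop_test and worker_arg_t are hypothetical, and the loop is an arbitrary stand-in for the solver call. Each worker pins itself, spins in a loop, and periodically samples sched_getcpu() to see whether it ever runs on a CPU other than the one requested. If this test stays on the target CPU while the MKL version does not, the difference comes from MKL. If it also wanders, the pinning code itself is the problem.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

#define MAX_TEST_THREADS 64

/* Hypothetical per-thread record for the test (not from the original post). */
typedef struct {
    int target_cpu;  /* CPU the worker should stay on */
    int pin_error;   /* return value of pthread_setaffinity_np */
    int off_target;  /* samples where the worker ran on another CPU */
} worker_arg_t;

static void *busy_worker(void *p)
{
    worker_arg_t *arg = p;
    cpu_set_t cpuset;
    volatile double acc = 0.0;  /* volatile so the loop is not optimized away */

    CPU_ZERO(&cpuset);
    CPU_SET(arg->target_cpu, &cpuset);
    arg->pin_error = pthread_setaffinity_np(pthread_self(),
                                            sizeof(cpuset), &cpuset);

    /* Plain compute loop standing in for the MKL call. */
    for (long i = 0; i < 20000000L; i++) {
        acc += (double)i * 1e-9;
        /* Sample the current CPU roughly once per million iterations. */
        if ((i & 0xFFFFF) == 0 && sched_getcpu() != arg->target_cpu)
            arg->off_target++;
    }
    return NULL;
}

/* Runs nthreads workers pinned to target_cpu. Returns the number of
 * workers that failed to pin or were seen off the target CPU, or -1
 * on a setup error. */
static int run_pinned_loop_test(int target_cpu, int nthreads)
{
    pthread_t tid[MAX_TEST_THREADS];
    worker_arg_t args[MAX_TEST_THREADS];
    int created = 0, bad = 0, failed = 0;

    if (nthreads < 1 || nthreads > MAX_TEST_THREADS)
        return -1;

    for (int t = 0; t < nthreads; t++) {
        args[t].target_cpu = target_cpu;
        args[t].pin_error = 0;
        args[t].off_target = 0;
        if (pthread_create(&tid[t], NULL, busy_worker, &args[t]) != 0) {
            failed = 1;
            break;
        }
        created++;
    }

    /* Join every thread that was actually started. */
    for (int t = 0; t < created; t++) {
        pthread_join(tid[t], NULL);
        if (args[t].pin_error != 0 || args[t].off_target > 0)
            bad++;
    }
    return failed ? -1 : bad;
}
```

Calling run_pinned_loop_test(0, 4), for example, would mirror the experiment of binding every thread to CPU0. A result of 0 means all workers stayed where they were pinned.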
Thank you all for your valuable comments.
I finally resolved my problem. Instead of the sequential library, I built the code with the OpenMP (threaded) version of MKL. Then I set KMP_AFFINITY to disabled. Finally, I used sched_setaffinity to bind each thread to a specific core. Now it is working.