I have a Xeon E5-2620 processor and am benchmarking with SGEMM. Why does MKL spawn only 6 threads (hardware threads) instead of the expected 12 threads (hardware plus software threads)?
The same code on Xeon Phi spawns all 240 threads (hardware and software).
Sorry for the delay :) Roughly speaking, yes: to get better performance, MKL decides how many threads to spawn based on the hardware resources and on experience.
- On a Xeon machine, MKL uses the number of hardware cores by default. On your Xeon machine, that is 6. (The 12 "software" threads are what we call Hyper-Threading threads on Xeon processors.)
- On a Xeon Phi machine, MKL uses hardware plus software threads: 60 x 4 = 240 (61 cores in total, with 1 core reserved).
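As a quick check of the arithmetic behind those two defaults, here is a minimal Python sketch. The function name and its parameters are hypothetical and are not part of MKL; the sketch only reproduces the counts quoted above and does not query MKL or the hardware.

```python
def default_thread_count(cores, hw_threads_per_core, use_all_hw_threads, reserved_cores=0):
    """Illustrative only: reproduce the default thread counts described above.

    This is NOT an MKL API; every name here is made up for the example.
    """
    usable_cores = cores - reserved_cores
    # One thread per core, or one per hardware thread, depending on the platform.
    per_core = hw_threads_per_core if use_all_hw_threads else 1
    return usable_cores * per_core


# Xeon E5-2620: 6 cores, 2 Hyper-Threading threads per core, one thread per core used.
xeon_threads = default_thread_count(cores=6, hw_threads_per_core=2, use_all_hw_threads=False)

# Xeon Phi: 61 cores, 1 reserved, 4 hardware threads per core, all of them used.
phi_threads = default_thread_count(cores=61, hw_threads_per_core=4,
                                   use_all_hw_threads=True, reserved_cores=1)

print(xeon_threads, phi_threads)
```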
The reason on Xeon is that Hyper-Threading Technology (HT Technology) is only effective when each thread performs a different type of operation and there are under-utilized resources on the processor. The threads in Intel MKL all perform exactly the same operation, so they cannot benefit from HT threads. As a result, MKL forks 6 threads instead of 12.
The same reasoning applies on Xeon Phi, but HT technology there is implemented differently than on Xeon: each core requires at least 3 or 4 threads to feed all of its computing resources. So, to get better performance on Xeon Phi, MKL forks 240 threads by default.
For more on the Xeon case, please see https://software.intel.com/en-us/forums/topic/294954. You can change the number of threads with the MKL_DYNAMIC and MKL_NUM_THREADS environment variables, or by calling mkl_set_num_threads().
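If you want to try 12 threads on the Xeon anyway, one way is to set the two environment variables before launching the benchmark. The following is a minimal Python sketch, assuming the benchmark is started as a separate process; the helper name `mkl_thread_env` and the executable `./sgemm_bench` are hypothetical. MKL_DYNAMIC is set to FALSE here because, in dynamic mode, MKL is allowed to use fewer threads than requested.

```python
import os


def mkl_thread_env(num_threads, dynamic=False, base_env=None):
    """Return a copy of the environment with MKL threading variables set.

    Hypothetical helper for illustration; only the variable names
    MKL_NUM_THREADS and MKL_DYNAMIC come from MKL itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_NUM_THREADS"] = str(num_threads)
    env["MKL_DYNAMIC"] = "TRUE" if dynamic else "FALSE"
    return env


# Request 12 threads and turn off dynamic adjustment.
env = mkl_thread_env(12, dynamic=False, base_env={})
print(env)

# To launch the benchmark with these settings (executable name is made up):
#   import subprocess
#   subprocess.run(["./sgemm_bench"], env=mkl_thread_env(12), check=True)
```

Whether 12 threads actually run SGEMM faster than 6 is something to measure on your own machine; per the explanation above, it often will not.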