Anomalous behavior of Intel MKL.

I am using the Intel MKL routine zgemm() to multiply two complex matrices on a 2-core machine with a clock speed of 2.79 GHz.

When I run the program with neither OMP_NUM_THREADS nor KMP_AFFINITY set, I get approximately 2700 MFLOPS. When I set OMP_NUM_THREADS=2 and KMP_AFFINITY=(null), the rate drops to 1390 MFLOPS. When I then unset KMP_AFFINITY, it drops even further, to 1000 MFLOPS.
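For anyone trying to reproduce numbers like these, here is a minimal sketch of how such a rate could be computed and logged next to the thread settings. It is not the code from this post. The helper names (zgemm_mflops, threading_env, time_call) are hypothetical, and the count of 8 real floating-point operations per complex multiply-add is an assumed convention. The sketch uses a wall-clock timer, since a CPU-time timer adds up the time spent on all threads and can make a multi-threaded run look slower than it really is.

```python
import os
import time


def zgemm_mflops(m, n, k, seconds):
    """MFLOPS for an (m x k) by (k x n) complex matrix multiply.

    Assumes 8 real flops per complex multiply-add (4 multiplies + 4 adds).
    """
    return 8.0 * m * n * k / seconds / 1e6


def threading_env():
    """Return the two variables discussed in the post; None means not set."""
    names = ("OMP_NUM_THREADS", "KMP_AFFINITY")
    return {name: os.environ.get(name) for name in names}


def time_call(fn, repeats=3):
    """Best wall-clock time, in seconds, of calling fn() `repeats` times."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    n = 200  # hypothetical square matrix size
    # Placeholder workload; replace with the real zgemm() call being measured.
    elapsed = time_call(lambda: sum(i * i for i in range(100_000)))
    print(threading_env())
    print(f"{zgemm_mflops(n, n, n, elapsed):.0f} MFLOPS (placeholder timing)")
```

Printing the environment together with each rate makes it easier to tell which setting produced which number when comparing runs.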

Why does the single-threaded code run faster than when I specify two threads?

TIA
