Question

There is a weird behavior from MKL on our cluster. I am calling Eigen::SelfAdjointEigenSolver<Eigen::MatrixXcd> for a complex Hermitian matrix (ZHEEV).

When I calculate the eigenvectors for large matrices (dim >~ 100k) it only uses a single core.

Strangely, it runs perfectly fine (multiple cores) for smaller complex matrices, real matrices and large complex matrices (dim >~ 100k) without eigenvectors.

Has anyone faced the same issue, or does anyone have an idea of what is going on in the background?

I tried various MKL versions.

Gennady_F_Intel · Answer 1

That is strange. How did you check that only a single core is used?

taiyler · Answer 2

I simply monitored the process via htop
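
As a cross-check on htop, one option is to compare process CPU time with wall-clock time around the solver call: the ratio approximates the average number of busy cores. The sketch below is only an illustration and not part of the original report. The helper name busy_cores is hypothetical, and it assumes Linux/glibc, where std::clock() reports CPU time summed over all threads of the process.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical helper: runs work() and returns (process CPU time) / (wall time).
// Assumption: std::clock() accumulates CPU time over all threads (Linux/glibc).
// A result near 1 suggests one busy core; near N suggests about N busy cores.
template <typename F>
double busy_cores(F&& work) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    const double cpu_seconds =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    const double wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();

    // Guard against a zero-length interval.
    return wall_seconds > 0.0 ? cpu_seconds / wall_seconds : 0.0;
}
```

Wrapping the eigensolver call, for example busy_cores([&] { solver.compute(H); }) with hypothetical variables solver and H, should give a value near 1 if only one core is working and a value approaching the thread count if the call is well parallelized.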

Gennady_F_Intel · Answer 3

Could you set the MKL_VERBOSE environment variable (MKL_VERBOSE=1) and show the log?

Gennady_F_Intel · Answer 4

As I saw no feedback, I built the MKL zheev example and ran it with different problem sizes with MKL_VERBOSE enabled. The logs below were captured when zheev was run with problem sizes of 2K, 20K and 100K.

Regarding the 100K problem, I didn't wait for the run to complete, but in any case MKL reports that 80 OpenMP threads were used during this run.

$ ./a.out 2000

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) with support of Intel(R) Deep Learning Boost (Intel(R) DL Boost), EVEX-encoded AES and Carry-Less Multiplication Quadword instructions, Lnx 2.30GHz ilp64 intel_thread

MKL_VERBOSE ZHEEV(V,L,2000,0x147bfd00c080,2000,0x3566f00,0x7ffd2acabf80,-1,0x357bf40,0) 248.69ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:80

MKL_VERBOSE ZHEEV(V,L,2000,0x147bfd00c080,2000,0x3566f00,0x147c02277080,34000,0x357bf40,0) 751.07ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:80

...zheev passed ...

$ ./a.out 20000

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) with support of Intel(R) Deep Learning Boost (Intel(R) DL Boost), EVEX-encoded AES and Carry-Less Multiplication Quadword instructions, Lnx 2.30GHz ilp64 intel_thread

MKL_VERBOSE ZHEEV(V,L,20000,0x14832434b080,20000,0x1484a31df080,0x7ffd02392280,-1,0x1484a3169080,0) 195.66ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:80

MKL_VERBOSE ZHEEV(V,L,20000,0x14832434b080,20000,0x1484a31df080,0x148323713080,340000,0x1484a3169080,0) 290.12s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:80

...zheev passed ...

$ ./ilp64.x 100000

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) with support of Intel(R) Deep Learning Boost (Intel(R) DL Boost), EVEX-encoded AES and Carry-Less Multiplication Quadword instructions, Lnx 2.30GHz ilp64 intel_thread

MKL_VERBOSE ZHEEV(V,L,100000,0x143b952bd080,100000,0x1460d775f080,0x7ffc14d94f00,-1,0x1460d5ea2080,0) 110.63ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:80

srun: Force Terminated job 24982
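
The log lines above can also be read mechanically. In the LAPACK ZHEEV argument order (jobz, uplo, n, a, lda, w, work, lwork, rwork, info) the eighth argument is lwork, and lwork = -1 marks a workspace-size query rather than the actual computation. The sketch below pulls out the NThr value and flags such query calls. The helper names parse_nthr and is_zheev_workspace_query are hypothetical, and the code assumes the line layout shown in these logs.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Returns the thread count printed after "NThr:" in an MKL_VERBOSE line,
// or -1 if the line carries no such field.
int parse_nthr(const std::string& line) {
    const std::string key = "NThr:";
    const std::string::size_type pos = line.find(key);
    if (pos == std::string::npos) return -1;
    int value = 0;
    bool any_digit = false;
    for (std::string::size_type i = pos + key.size();
         i < line.size() && std::isdigit(static_cast<unsigned char>(line[i]));
         ++i) {
        value = value * 10 + (line[i] - '0');
        any_digit = true;
    }
    return any_digit ? value : -1;
}

// True if the line logs a ZHEEV call whose 8th argument (lwork) is -1,
// i.e. a workspace-size query and not the eigenvalue computation itself.
bool is_zheev_workspace_query(const std::string& line) {
    const std::string key = "ZHEEV(";
    const std::string::size_type open = line.find(key);
    if (open == std::string::npos) return false;
    const std::string::size_type start = open + key.size();
    const std::string::size_type close = line.find(')', start);
    if (close == std::string::npos) return false;

    // Split the argument list on commas and inspect index 7 (lwork).
    std::stringstream args(line.substr(start, close - start));
    std::string arg;
    for (int i = 0; std::getline(args, arg, ','); ++i) {
        if (i == 7) return arg == "-1";
    }
    return false;
}
```

Applied to the logs above, the only ZHEEV line recorded for the 100000 run is a workspace query (lwork = -1, 110.63ms). The NThr:80 value there was therefore logged before the eigenvector computation itself had finished.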

This issue has been resolved and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.

Eigen + MKL uses single core for complex matrix (ZHEEV)