Intel® oneAPI Math Kernel Library

MKL Lapack parallel subroutine


Hello all

I am using the LAPACK subroutine 'dgelsd' to compute the linear least-squares solution of a system (minimizing ||Ax-b||). I link against the parallel Intel MKL library. When I run my code, only 57% of the total CPU is used. Setting the number of MKL threads also has no effect. I set it with:

call mkl_set_num_threads( 32 )

I am working on a workstation with the following specs:

Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10 GHz, Cores = 16, Logical processors = 32, Windows 10 Pro, 64-bit operating system, x64-based processor.

Please suggest how I can make use of the available processing capacity. At present my code takes a long time to produce results, and most of that time is spent in the call to DGELSD, which computes the least-squares solution.


1 Reply

The parallel MKL library is not parallel from beginning to end; some parts are still sequential. Please use VTune to analyze your program and its source code to find the hotspots.
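As a rough illustration of why sequential parts limit the benefit of extra threads, Amdahl's law gives the speedup on n threads when a fraction p of the run time is parallelizable. The value p = 0.9 below is a made-up example, not a measurement of DGELSD:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}},
\qquad
S(16)\Big|_{p = 0.9} = \frac{1}{0.1 + \dfrac{0.9}{16}} = \frac{1}{0.15625} = 6.4
```

Under that assumption, 16 threads would give a speedup of about 6.4 rather than 16, and the cores would sit idle during the sequential parts.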

mkl_set_num_threads() should be set to the number of physical cores; please try mkl_set_num_threads(16) to see whether performance improves.
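A minimal sketch of that suggestion in Fortran, assuming the program is linked against the threaded MKL library. The program name and the value 16 are illustrative (16 matches the core count given in the question). The service routines are declared explicitly here rather than through MKL's include files:

```fortran
program set_mkl_threads
  implicit none
  ! MKL service routines, declared by hand for this sketch
  external :: mkl_set_num_threads
  integer, external :: mkl_get_max_threads
  ! Assumption: 16 physical cores, as stated in the question
  integer, parameter :: physical_cores = 16

  ! Report what MKL would use by default
  print *, 'MKL max threads (default):  ', mkl_get_max_threads()

  ! Request one thread per physical core
  call mkl_set_num_threads(physical_cores)

  ! Confirm that the setting was accepted
  print *, 'MKL max threads (after set):', mkl_get_max_threads()

  ! ... the existing call to DGELSD would follow here ...
end program set_mkl_threads
```

If the second printed value is 16 but CPU usage stays near 50% of the 32 logical processors, that is consistent with one thread per physical core. MKL may also use fewer threads than requested when it judges the problem too small to benefit.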

BTW, please also check whether hyper-threading is enabled.