Hello,
In my application, I use the pardiso_64 function to solve a sparse linear system.
I have a problem where the CPU usage is always 100% regardless of the value set for the number of threads.
Before calling pardiso_64, I call mkl_set_num_threads(n) to set the number of threads to n. I monitored CPU usage with the Windows Task Manager and found that it is 100% even when n is 1.
I also used the environment variable "MKL_NUM_THREADS" for comparison.
MKL_NUM_THREADS / n / cpu usage
not set / 1 / 100%
1 / 1 / 20%
1 / 8 / 100%
8 / 1 / 100%
8 / 8 / 100%
As far as I know, the number of threads should be the value passed to mkl_set_num_threads, regardless of the value of MKL_NUM_THREADS. However, the results look as if the larger of the two values is being used as the number of threads.
Am I missing something?
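To make the comparison concrete, here is a minimal sketch that checks the table above against both hypotheses. It rests on assumptions that are not in my measurements: "not set" is treated as "all cores" (8 is only a placeholder), and 100% CPU is taken to mean "more than one thread in use". All names are hypothetical.

```cpp
#include <algorithm>
#include <array>

// One row of the table above. env_threads == 0 encodes "not set".
struct Observation {
    int env_threads;   // MKL_NUM_THREADS, 0 if not set
    int call_threads;  // n passed to mkl_set_num_threads
    bool saturated;    // true if Task Manager showed 100% CPU
};

// ASSUMPTION: an unset variable behaves like "use all cores".
// The value 8 is a placeholder, not a measured core count.
constexpr int kAssumedDefaultThreads = 8;

int env_or_default(int env_threads) {
    return env_threads == 0 ? kAssumedDefaultThreads : env_threads;
}

// Hypothesis A: the value passed to the function call always wins.
bool predict_call_wins(const Observation& o) {
    return o.call_threads > 1;
}

// Hypothesis B: the larger of the two values is used.
bool predict_max(const Observation& o) {
    return std::max(env_or_default(o.env_threads), o.call_threads) > 1;
}

// The five rows reported above, in the same order.
const std::array<Observation, 5> kTable = {{
    {0, 1, true},
    {1, 1, false},
    {1, 8, true},
    {8, 1, true},
    {8, 8, true},
}};

// Counts how many rows a hypothesis predicts correctly.
template <typename Predict>
int count_matches(Predict predict) {
    int matches = 0;
    for (const Observation& o : kTable) {
        if (predict(o) == o.saturated) ++matches;
    }
    return matches;
}
```

Under these assumptions, count_matches(predict_call_wins) gives 3 of 5 rows and count_matches(predict_max) gives 5 of 5, which is why the behavior looks like "larger of the two" to me.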
Hi Kwangog,
When you set the number of threads both through an MKL routine and through an environment variable, the routine takes priority over the environment variable.
Regarding CPU usage: you can enable the message level parameter (msglvl = 1), which is switched off by default,
and check how many threads are actually used with your real workload.
This is the only way to see how MKL PARDISO uses the available CPU resources.
Example:
When calling MKL PARDISO with 1 thread, the output looks like this:
...
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
And when we run with the default number of OpenMP threads (Sapphire Rapids CPU, 112 physical cores):
....
Statistics:
===========
Parallel Direct Factorization is running on 112 OpenMP
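If you want to check this value programmatically, a minimal sketch is shown here. It assumes you have captured the msglvl = 1 output into a string and that the statistics line has the same wording as in the excerpts above. The function name is hypothetical and is not part of oneMKL.

```cpp
#include <sstream>
#include <string>

// Returns the thread count from the "Parallel Direct Factorization is
// running on N OpenMP" line of a captured log, or -1 if the line is
// missing or no number follows it.
int reported_openmp_threads(const std::string& log) {
    static const std::string marker =
        "Parallel Direct Factorization is running on ";
    const std::size_t pos = log.find(marker);
    if (pos == std::string::npos) return -1;

    // Read the integer that follows the marker text.
    std::istringstream rest(log.substr(pos + marker.size()));
    int threads = 0;
    if (!(rest >> threads)) return -1;
    return threads;
}
```

Comparing the returned value with the value passed to mkl_set_num_threads shows directly whether the requested setting was honored.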
I attached the full logs to this thread; they are from running the 503712 x 503712 input with 18660027 nonzeros (nnz).
--Gennady
I forgot to attach the logs from running the inline.mtx workload with 1 and 112 threads.
Hello Gennady,
Thank you for your reply.
I set msglvl to 1 and tried running my program.
This is part of my code:
...
...
I have attached the printed output.
The value of "nthread_smp" printed on cout is 1.
However, the number of OpenMP threads output by the pardiso function seems to be 6.
Is there something wrong?
If needed, I'll create reproducible code.
Hello Gennady,
I found that an uninitialized value was being passed for "nrhs".
This probably did not matter during the reordering phase, since the value would not have been used there.
However, I still have the same problem even when I set the correct value.
I've also tested it on Linux and didn't get any problems.
This seems to be an issue with Windows.
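For reference, one way to avoid passing an indeterminate "nrhs" is to group the scalar arguments in a struct with default member initializers. This is only a sketch: the struct is hypothetical and not part of oneMKL, and the default values are placeholders I chose for illustration rather than the ones from my application.

```cpp
#include <array>

// Hypothetical helper, not part of oneMKL: holds the scalar arguments
// passed to pardiso_64 so that each one has a defined value before the
// call. pardiso_64 takes 64-bit integers, hence long long.
struct PardisoScalars {
    long long maxfct = 1;   // number of factorizations kept in memory
    long long mnum   = 1;   // which factorization to use
    long long mtype  = 11;  // matrix type; 11 = real nonsymmetric (placeholder)
    long long phase  = 11;  // 11 = analysis (reordering)
    long long n      = 0;   // number of equations, set by the caller
    long long nrhs   = 1;   // number of right-hand sides, never indeterminate
    long long msglvl = 0;   // set to 1 to print statistics
    long long error  = 0;   // error code written by the solver
    std::array<long long, 64> iparm{};  // value-initialized to all zeros
};
```

A default-constructed PardisoScalars then carries nrhs = 1 until it is overwritten explicitly.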
Hello Gennady,
I'm using version 2023.1 (included in oneAPI 2023.1).
However, I was using a copy shared from our internal repository, not one I had installed from Intel.
To check whether it was an installation issue, I installed oneAPI 2023.1 directly and linked against it.
Fortunately, this resolved the issue. It was probably a problem with the internal package.
Thank you.
