Re: About MKL Pardiso Parallel

gaor · 09-06-2023

I want to specify the number of threads when solving a matrix directly using MKL Pardiso. What do I do? For example, how should I set up the environment, or which parameters should I modify? When I look at each item in the iparm array, I find an inconsistency between the MKL Pardiso documentation and the Pardiso 8.0 user manual. For example, iparm[2] is reserved and set to 0 in the MKL Pardiso documentation, while in the Pardiso user manual that parameter is set to the number of threads.

Which one should I use as a reference document?

VarshaS_Intel · 09-08-2023

Hi,

Thanks for posting in Intel Communities.

>>I want to specify the number of threads when solving a matrix directly using MKL Pardiso. What do I do?

Yes, you can control the parallel execution of the solver by setting the number of threads explicitly. There are two ways to do this: set the MKL_NUM_THREADS environment variable, or call a threading-control function such as mkl_set_num_threads at run time.

Please see the link below for more details on techniques for setting the number of threads.

https://www.intel.com/content/www/us/en/docs/onemkl/developer-guide-linux/2023-2/techniques-to-set-the-number-of-threads.html
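
As a rough sketch of the two methods (not taken from the linked guide), the C helpers below read MKL_NUM_THREADS from the process environment and, when built against oneMKL, pass the count to mkl_set_num_threads. The USE_MKL build flag and the helper names thread_count_from_env and request_mkl_threads are assumptions made for this illustration; mkl_set_num_threads and mkl_get_max_threads are oneMKL threading-control functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_MKL /* hypothetical flag: define it when compiling and linking against oneMKL */
#include <mkl.h>
#endif

/* Parse a thread-count environment variable.
   Returns 0 if the variable is unset or is not a positive integer. */
int thread_count_from_env(const char *name)
{
    assert(name != NULL);
    const char *value = getenv(name);
    if (value == NULL || *value == '\0') {
        return 0;
    }
    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > INT_MAX) {
        return 0;
    }
    return (int)n;
}

/* Ask for `requested` threads if it is positive; otherwise fall back to
   MKL_NUM_THREADS. Returns the count that was asked for, or 0 if none. */
int request_mkl_threads(int requested)
{
    int n = requested > 0 ? requested
                          : thread_count_from_env("MKL_NUM_THREADS");
#ifdef USE_MKL
    if (n > 0) {
        mkl_set_num_threads(n); /* applies to later oneMKL calls */
    }
    printf("oneMKL may use up to %d threads\n", mkl_get_max_threads());
#endif
    return n;
}
```

Calling request_mkl_threads(4) before the first pardiso call asks for four threads; passing 0 defers to MKL_NUM_THREADS. Reading the variable from inside the program can also help confirm that a value set in a shell is actually visible to a process launched from an IDE.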

>>Which one should I use as a reference document

Please follow the Intel Developer Reference when setting the iparm values. The link below lists all the settings:

https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2023-2/pardiso-iparm-parameter.html
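
To make the iparm[2] point concrete, here is a minimal sketch of an iparm setup for the oneMKL interface. It uses plain int in place of MKL_INT (assuming the LP64 interface) so that it compiles without the MKL headers. The function name init_iparm_sketch and the particular values are illustrative assumptions, so please check each entry against the Developer Reference before relying on it.

```c
#include <assert.h>
#include <string.h>

#define PARDISO_IPARM_LEN 64

/* Fill iparm for a real nonsymmetric solve with oneMKL PARDISO.
   Plain int stands in for MKL_INT (LP64 interface assumed). */
void init_iparm_sketch(int iparm[PARDISO_IPARM_LEN])
{
    assert(iparm != NULL);
    memset(iparm, 0, PARDISO_IPARM_LEN * sizeof iparm[0]);

    iparm[0]  = 1;  /* use the values below instead of the solver defaults */
    iparm[1]  = 3;  /* parallel (OpenMP) nested-dissection reordering */
    iparm[2]  = 0;  /* reserved in oneMKL: this is NOT the thread count */
    iparm[9]  = 13; /* pivot perturbation 1e-13, usual for nonsymmetric */
    iparm[10] = 1;  /* scaling on */
    iparm[12] = 1;  /* weighted matching on */
    iparm[34] = 0;  /* one-based indexing of ia/ja */
    iparm[59] = 0;  /* in-core mode */
}
```

With this interface the thread count comes from MKL_NUM_THREADS or mkl_set_num_threads rather than from any iparm entry.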

Thanks & Regards,

Varsha

gaor · 09-12-2023

Hi Varsha,

My hardware and software setup is as follows:

CPU: 11th Gen Intel(R) Core(TM) i7-1195G7 @ 2.90GHz 2.92 GHz

OS: Windows 11

IDE: VS2022

MKL: 2023.2

Here is some additional information. I set things up like this:

But the results still don't seem to be parallel, as follows:

=== PARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON

Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.026033 s
Time spent in reordering of the initial matrix (reorder) : 2.428069 s
Time spent in symbolic factorization (symbfct) : 0.194481 s
Time spent in data preparations for factorization (parlist) : 0.010933 s
Time spent in allocation of internal data structures (malloc) : 0.033393 s
Time spent in matching/scaling : 0.002012 s
Time spent in additional calculations : 0.058307 s
Total time spent : 2.753229 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166

Reordering completed ...
Number of nonzeros in factors = 33733166
Number of factorization MFLOPS = 58431
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

Percentage of computed non-zeros for LL^T factorization
1 %
2 %
3 %
4 %
5 %
6 %
7 %
8 %
9 %
10 %
11 %
12 %
13 %
14 %
15 %
16 %
17 %
18 %
19 %
20 %
21 %
22 %
23 %
24 %
25 %
26 %
27 %
28 %
29 %
30 %
31 %
32 %
33 %
34 %
35 %
36 %
37 %
38 %
39 %
40 %
41 %
42 %
43 %
44 %
45 %
46 %
47 %
48 %
49 %
50 %
51 %
52 %
53 %
54 %
55 %
56 %
57 %
58 %
59 %
60 %
61 %
62 %
63 %
64 %
65 %
66 %
67 %
68 %
69 %
70 %
71 %
72 %
73 %
74 %
75 %
76 %
77 %
78 %
79 %
80 %
81 %
83 %
85 %
86 %
87 %
88 %
90 %
91 %
92 %
93 %
94 %
95 %
96 %
97 %
98 %
99 %
100 %

=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON

Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct) : 6.424396 s
Time spent in allocation of internal data structures (malloc) : 0.000219 s
Time spent in additional calculations : 0.000002 s
Total time spent : 6.424617 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166
gflop for the numerical factorization: 58.431298

gflop/s for the numerical factorization: 9.095220

Factorization completed ...
=== PARDISO: solving a real nonsymmetric system ===

Summary: ( solution phase )
================

Times:
======
Time spent in direct solver at solve step (solve) : 0.028705 s
Time spent in additional calculations : 0.077641 s
Total time spent : 0.106346 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166
gflop for the numerical factorization: 58.431298

gflop/s for the numerical factorization: 9.095220

This problem has troubled me for a long time, and I would be very happy to see it resolved. I look forward to your reply.

Thanks & Regards,

gaor

gaor · 09-11-2023

Hi Varsha,

Thank you for your reply

I tried many settings and it still didn't work; the console window always prints "1 OpenMP".

I am on Windows 11, calling the solver from C, and my IDE is VS2022. Can you give me some more suggestions?

Thanks & Regards,

VarshaS_Intel · 09-12-2023

Hi,

Thanks for your reply and providing the details.

Could you please share the complete project file you are using? It will help us understand your settings and reproduce the issue on our end.

Thanks & Regards,

Varsha

gaor · 09-12-2023

Hi,

I am very glad to receive your reply. Attached is my VS2022 project. Please help me check whether there is a problem with my settings.

Thanks & Regards,

gaor

VarshaS_Intel · 09-20-2023

Hi,

Thanks for your reply.

>>But the results still don't seem to be parallel,

Could you please let us know why the results do not seem to be parallel?

Also, when we ran the code in sequential mode, it took longer than the parallel run.

Could you please let us know the expected results you want to get from the code?

Thanks & Regards,

Varsha

VarshaS_Intel · 09-26-2023

Hi,

We have not heard back from you. Could you please provide us with an update on your issue?

Thanks & Regards,

Varsha

VarshaS_Intel · 10-04-2023

Hi,

We have not heard back from you. Could you please provide us with an update on your issue?

Thanks & Regards,

Varsha
