I want to specify the number of threads when solving a matrix with the MKL PARDISO direct solver. How do I do that? For example, do I need to set up the environment, or modify certain parameters? Looking at each item in the iparm array, I found an inconsistency between the MKL PARDISO documentation and the PARDISO 8.0 user manual. For example, iparm[2] is reserved and set to 0 in the MKL PARDISO documentation, while in the PARDISO user manual the same parameter holds the number of threads.
Which one should I use as the reference document?
Hi,
Thanks for posting in Intel Communities.
>>I want to specify the number of threads when solving a matrix directly using MKL Pardiso. What do I do?
Yes, you can control the solver's parallel execution by setting the number of threads. There are two ways to do this: set the MKL_NUM_THREADS environment variable, or call the thread-control function (for example, mkl_set_num_threads) from your code.
Please see the link below for more details on the techniques for setting the number of threads.
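As a rough illustration of the two methods, here is a minimal C sketch. It assumes the program is built with `-DUSE_MKL` and linked against a threaded oneMKL configuration; without that macro the MKL calls are compiled out so the parsing logic can still be built on its own. The helper names `parse_thread_count` and `configure_mkl_threads` are hypothetical and not part of oneMKL; only `mkl_set_num_threads` and `mkl_get_max_threads` are real oneMKL service functions.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_MKL
#include <mkl.h>   /* declares mkl_set_num_threads, mkl_get_max_threads */
#endif

/* Hypothetical helper: parse a thread count from a string such as the
 * value of MKL_NUM_THREADS. Returns -1 if the string is missing, empty,
 * not a plain integer, or not positive. */
int parse_thread_count(const char *value)
{
    char *end = NULL;
    long n;

    if (value == NULL || *value == '\0')
        return -1;
    n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > 1024)
        return -1;
    return (int)n;
}

/* Hypothetical helper: prefer the MKL_NUM_THREADS environment variable,
 * otherwise use `fallback`, and pass the result to oneMKL.
 * Returns the thread count that was requested. */
int configure_mkl_threads(int fallback)
{
    int n = parse_thread_count(getenv("MKL_NUM_THREADS"));

    if (n < 0)
        n = fallback;
#ifdef USE_MKL
    mkl_set_num_threads(n);
    printf("oneMKL may use up to %d threads\n", mkl_get_max_threads());
#endif
    return n;
}
```

Calling `configure_mkl_threads(4)` before the first PARDISO call would request four threads unless the environment variable says otherwise. If the program is linked against the sequential oneMKL threading layer, the requested count has no effect, so the link settings are worth checking as well.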
>>Which one should I use as a reference document
Please follow the Intel Developer Reference when setting the iparm values. The link below lists all of the settings:
Thanks & Regards,
Varsha
Hi Varsha,
My hardware and software environment is as follows:
CPU: 11th Gen Intel(R) Core(TM) i7-1195G7 @ 2.90GHz 2.92 GHz
OS: Windows 11
IDE: VS2022
MKL: 2023.2
Here are some additional details. I configured the settings like this:
But the solver still does not seem to run in parallel, as the output below shows:
=== PARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON
Summary: ( reordering phase )
================
Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.026033 s
Time spent in reordering of the initial matrix (reorder) : 2.428069 s
Time spent in symbolic factorization (symbfct) : 0.194481 s
Time spent in data preparations for factorization (parlist) : 0.010933 s
Time spent in allocation of internal data structures (malloc) : 0.033393 s
Time spent in matching/scaling : 0.002012 s
Time spent in additional calculations : 0.058307 s
Total time spent : 2.753229 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166
Reordering completed ...
Number of nonzeros in factors = 33733166
Number of factorization MFLOPS = 58431
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
1 %
2 %
3 %
4 %
5 %
6 %
7 %
8 %
9 %
10 %
11 %
12 %
13 %
14 %
15 %
16 %
17 %
18 %
19 %
20 %
21 %
22 %
23 %
24 %
25 %
26 %
27 %
28 %
29 %
30 %
31 %
32 %
33 %
34 %
35 %
36 %
37 %
38 %
39 %
40 %
41 %
42 %
43 %
44 %
45 %
46 %
47 %
48 %
49 %
50 %
51 %
52 %
53 %
54 %
55 %
56 %
57 %
58 %
59 %
60 %
61 %
62 %
63 %
64 %
65 %
66 %
67 %
68 %
69 %
70 %
71 %
72 %
73 %
74 %
75 %
76 %
77 %
78 %
79 %
80 %
81 %
83 %
85 %
86 %
87 %
88 %
90 %
91 %
92 %
93 %
94 %
95 %
96 %
97 %
98 %
99 %
100 %
=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON
Summary: ( factorization phase )
================
Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct) : 6.424396 s
Time spent in allocation of internal data structures (malloc) : 0.000219 s
Time spent in additional calculations : 0.000002 s
Total time spent : 6.424617 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166
gflop for the numerical factorization: 58.431298
gflop/s for the numerical factorization: 9.095220
Factorization completed ...
=== PARDISO: solving a real nonsymmetric system ===
Summary: ( solution phase )
================
Times:
======
Time spent in direct solver at solve step (solve) : 0.028705 s
Time spent in additional calculations : 0.077641 s
Total time spent : 0.106346 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 682712
number of non-zeros in A: 2329176
number of non-zeros in A (%): 0.000500
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 657617
size of largest supernode: 603
number of non-zeros in L: 17462996
number of non-zeros in U: 16270170
number of non-zeros in L+U: 33733166
gflop for the numerical factorization: 58.431298
gflop/s for the numerical factorization: 9.095220
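The thread count that PARDISO reports can also be checked programmatically when the message output is captured to a file or buffer. The following is a minimal sketch, assuming the statistics line has exactly the wording shown in the output above; other MKL versions may phrase it differently, and the function name `pardiso_reported_threads` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Return the thread count from a PARDISO statistics line of the form
 * "Parallel Direct Factorization is running on N OpenMP",
 * or -1 if the line does not contain that message. */
int pardiso_reported_threads(const char *line)
{
    static const char prefix[] =
        "Parallel Direct Factorization is running on ";
    const char *p;
    int n = 0;

    if (line == NULL)
        return -1;
    p = strstr(line, prefix);
    if (p == NULL)
        return -1;
    /* Skip past the fixed prefix and read the integer that follows. */
    if (sscanf(p + sizeof prefix - 1, "%d", &n) != 1 || n <= 0)
        return -1;
    return n;
}
```

Applied to the statistics lines above, this returns 1 for each phase, which matches the symptom described: the solver reports a single OpenMP thread regardless of the requested setting.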
I have been stuck on this problem for a long time and would be very glad to see it resolved. I look forward to your reply.
Thanks & Regards,
gaor
Hi Varsha,
Thank you for your reply.
I tried many settings and none of them worked; the console output always reports 1 OpenMP thread.
I am on Windows 11, calling MKL from C, and my IDE is VS2022. Can you give me some more suggestions?
Thanks & Regards,
Hi,
Thanks for your reply and for providing the details.
Could you please share the complete project you are using? It will help us understand your settings and reproduce the issue on our end.
Thanks & Regards,
Varsha
Hi,
I am very glad to receive your reply. Attached is my VS2022 project. Please help me check whether there is a problem with my settings.
Thanks & Regards,
gaor
Hi,
Thanks for your reply.
>>But the results still don't seem to be parallel,
Could you please let us know why the results do not seem to be parallel to you?
Also, we tried running in sequential mode, and it took longer than parallel mode.
Could you please let us know what results you expect from the code?
Thanks & Regards,
Varsha
Hi,
We have not heard back from you. Could you please provide us with an update on your issue?
Thanks & Regards,
Varsha