Solved: Re:pardiso thread

littlewang · 04-21-2021

Threading Problem in Pardiso

Hello,

We have run into a problem using PARDISO with multiple threads. This is the first time I have used the MKL library. I want to set the number of parallel threads in PARDISO. How should I modify the following example program so that it runs with four threads in parallel?

I'm trying to use

call mkl_set_dynamic(1)
call mkl_set_num_threads(4)
call omp_set_num_threads(4)

to set the number of threads, but the output is

=== PARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON

Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000010 s
Time spent in reordering of the initial matrix (reorder) : 0.000102 s
Time spent in symbolic factorization (symbfct) : 0.000015 s
Time spent in data preparations for factorization (parlist) : 0.000005 s
Time spent in allocation of internal data structures (malloc) : 0.001037 s
Time spent in additional calculations : 0.000012 s
Total time spent : 0.001181 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 8
number of non-zeros in A: 18
number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
number of supernodes: 4
size of largest supernode: 4
number of non-zeros in L: 31
number of non-zeros in U: 1
number of non-zeros in L+U: 32
Reordering completed ...
Number of nonzeros in factors = 32
Number of factorization MFLOPS = 0
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

Percentage of computed non-zeros for LL^T factorization
25 % 38 % 48 % 100 %

=== PARDISO: solving a symmetric indefinite system ===
Single-level factorization algorithm is turned ON

Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct) : 0.000109 s
Time spent in allocation of internal data structures (malloc) : 0.000014 s
Time spent in additional calculations : 0.000001 s
Total time spent : 0.000124 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 8
number of non-zeros in A: 18
number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
number of supernodes: 4
size of largest supernode: 4
number of non-zeros in L: 31
number of non-zeros in U: 1
number of non-zeros in L+U: 32
gflop for the numerical factorization: 0.000000

gflop/s for the numerical factorization: 0.000679

Factorization completed ...

=== PARDISO: solving a symmetric indefinite system ===

Summary: ( solution phase )
================

Times:
======
Time spent in direct solver at solve step (solve) : 0.000025 s
Time spent in additional calculations : 0.000041 s
Total time spent : 0.000066 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
number of equations: 8
number of non-zeros in A: 18
number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
number of supernodes: 4
size of largest supernode: 4
number of non-zeros in L: 31
number of non-zeros in U: 1
number of non-zeros in L+U: 32
gflop for the numerical factorization: 0.000000

gflop/s for the numerical factorization: 0.000679

Solve completed ...
The solution of the system is
x( 1 ) = -4.186020128680938E-002
x( 2 ) = -3.413124159279142E-003
x( 3 ) = 0.117250376805018
x( 4 ) = -0.112639579923180
x( 5 ) = 2.417224446137142E-002
x( 6 ) = -0.107633340356223
x( 7 ) = 0.198719673273585
x( 8 ) = 0.190382963551205

I'd appreciate any input and ideas!

MRajesh_intel · 04-22-2021

Hi,

You can set the number of threads in two ways:

1) Set environment variables when launching the program: MKL_DYNAMIC=FALSE MKL_NUM_THREADS=4 OMP_NUM_THREADS=4 ./a.out

2) Call

mkl_set_dynamic(0)

mkl_set_num_threads(4)

omp_set_num_threads(4)

before the parallel region in your code. Note the argument 0 rather than the 1 used in your code: with dynamic adjustment enabled, MKL may use fewer threads than requested.

Could you print omp_get_thread_num() inside the parallel region to confirm how many threads are actually being launched?
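The first option can also be wrapped in a small launch script. What follows is only a minimal sketch, assuming a POSIX shell and that ./a.out (the name used above) is the compiled PARDISO example; the APP variable and the grep filter are illustrative choices, not part of MKL.

```shell
#!/bin/sh
# Illustrative placeholder: path to the compiled PARDISO example.
APP=./a.out

# Export the settings so any program started from this shell inherits them.
export MKL_DYNAMIC=FALSE    # ask MKL not to reduce the thread count on its own
export MKL_NUM_THREADS=4    # number of threads MKL may use
export OMP_NUM_THREADS=4    # default OpenMP team size

# Show what the program will inherit.
echo "MKL_DYNAMIC=$MKL_DYNAMIC MKL_NUM_THREADS=$MKL_NUM_THREADS OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Run the solver only if the binary exists, keeping just the lines of the
# PARDISO statistics that report the thread count ("... is running on N OpenMP").
if [ -x "$APP" ]; then
    "$APP" | grep "running on"
fi
```

Prefixing the variables to a single command, as in option 1, applies them to that run only; export keeps them in effect for every later command in the same shell.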

Regards

Rajesh.

littlewang · 04-22-2021

Hi Rajesh,

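Thank you very much for your suggestion. It is correct.

I have successfully solved the problem by following the method above!

Regards,

Xiao Wang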

MRajesh_intel · 04-27-2021

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.

Any further interaction in this thread will be considered community only.

Have a good day.

Regards

Rajesh