littlewang

Beginner

04-21-2021
06:26 PM

pardiso thread

Threading Problem in Pardiso

hello,

I have run into a problem using PARDISO with multiple threads. This is the first time I have used the MKL library. I want to set the number of parallel threads in PARDISO. How should I modify the following example program so that it runs with four threads in parallel?

I'm trying to use

call mkl_set_dynamic(1)

call mkl_set_num_threads(4)

call omp_set_num_threads(4)

to set the number of threads, but the output still reports a single OpenMP thread:

=== PARDISO: solving a symmetric indefinite system ===

1-based array indexing is turned ON

PARDISO double precision computation is turned ON

METIS algorithm at reorder step is turned ON

Scaling is turned ON

Summary: ( reordering phase )

================

Times:

======

Time spent in calculations of symmetric matrix portrait (fulladj): 0.000010 s

Time spent in reordering of the initial matrix (reorder) : 0.000102 s

Time spent in symbolic factorization (symbfct) : 0.000015 s

Time spent in data preparations for factorization (parlist) : 0.000005 s

Time spent in allocation of internal data structures (malloc) : 0.001037 s

Time spent in additional calculations : 0.000012 s

Total time spent : 0.001181 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >

number of equations: 8

number of non-zeros in A: 18

number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

< Preprocessing with state of the art partitioning metis>

number of supernodes: 4

size of largest supernode: 4

number of non-zeros in L: 31

number of non-zeros in U: 1

number of non-zeros in L+U: 32

Reordering completed ...

Number of nonzeros in factors = 32

Number of factorization MFLOPS = 0

=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

Percentage of computed non-zeros for LL^T factorization

25 % 38 % 48 % 100 %

=== PARDISO: solving a symmetric indefinite system ===

Single-level factorization algorithm is turned ON

Summary: ( factorization phase )

================

Times:

======

Time spent in copying matrix to internal data structure (A to LU): 0.000000 s

Time spent in factorization step (numfct) : 0.000109 s

Time spent in allocation of internal data structures (malloc) : 0.000014 s

Time spent in additional calculations : 0.000001 s

Total time spent : 0.000124 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >

number of equations: 8

number of non-zeros in A: 18

number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

< Preprocessing with state of the art partitioning metis>

number of supernodes: 4

size of largest supernode: 4

number of non-zeros in L: 31

number of non-zeros in U: 1

number of non-zeros in L+U: 32

gflop for the numerical factorization: 0.000000

gflop/s for the numerical factorization: 0.000679

Factorization completed ...

=== PARDISO: solving a symmetric indefinite system ===

Summary: ( solution phase )

================

Times:

======

Time spent in direct solver at solve step (solve) : 0.000025 s

Time spent in additional calculations : 0.000041 s

Total time spent : 0.000066 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >

number of equations: 8

number of non-zeros in A: 18

number of non-zeros in A (%): 28.125000

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

< Preprocessing with state of the art partitioning metis>

number of supernodes: 4

size of largest supernode: 4

number of non-zeros in L: 31

number of non-zeros in U: 1

number of non-zeros in L+U: 32

gflop for the numerical factorization: 0.000000

gflop/s for the numerical factorization: 0.000679

Solve completed ...

The solution of the system is

x( 1 ) = -4.186020128680938E-002

x( 2 ) = -3.413124159279142E-003

x( 3 ) = 0.117250376805018

x( 4 ) = -0.112639579923180

x( 5 ) = 2.417224446137142E-002

x( 6 ) = -0.107633340356223

x( 7 ) = 0.198719673273585

x( 8 ) = 0.190382963551205

I'd appreciate any input and ideas!

1 Solution

MRajesh_intel

Moderator

04-22-2021
04:11 AM

Hi,

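You can set the number of threads in two ways:

1) Set the environment variables on the command line at run time: MKL_DYNAMIC=FALSE MKL_NUM_THREADS=4 OMP_NUM_THREADS=4 ./a.out

2) Call the following before the parallel region in your code:

mkl_set_dynamic(0)

mkl_set_num_threads(4)

omp_set_num_threads(4)

Note that mkl_set_dynamic takes 0 here, not 1 as in your code. With dynamic adjustment enabled, MKL may use fewer threads than the number you request.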
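As a minimal sketch of the first option, assuming a POSIX shell and that the compiled example is named ./a.out as in the command above, the variables can be exported once so that every later run from the same shell picks them up:

```shell
# Disable MKL's dynamic thread adjustment and request four threads.
export MKL_DYNAMIC=FALSE
export MKL_NUM_THREADS=4
export OMP_NUM_THREADS=4

# ./a.out is the example binary name from the command above (an assumption
# about your build); it is run only if it exists and is executable.
if [ -x ./a.out ]; then
    ./a.out
fi
```

Setting the variables inline on a single command line, as in option 1, limits them to that one run instead of the whole shell session.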

Could you print omp_get_thread_num() inside the parallel region to confirm how many threads are actually being launched?

Regards

Rajesh.

3 Replies

littlewang

Beginner

04-22-2021
06:28 PM

Hi Rajesh,

Thank you very much for your suggestion. It is correct.

I followed the method above and it solved the problem!

Best regards,

Xiao Wang

MRajesh_intel

Moderator

04-27-2021
03:39 AM

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.

Any further interaction in this thread will be considered community only.

Have a good day.

Regards

Rajesh
