Intel® oneAPI Math Kernel Library

Question about the number of parallel threads used by PARDISO

Tianxiong_Lu
Beginner

Hi all,

In my program, I have not set any limit on the number of OpenMP threads, but I found that PARDISO uses only 4 OpenMP threads on my machine. Why?

 
My CPU configuration, as reported by Intel VTune:
Name: Intel(R) Core(TM) Processor 2xxx Series
Frequency: 2.2 GHz
Logical CPU Count: 8 
 
The OS is Windows 10.
 
The following is the console output from PARDISO and OpenMP with KMP_SETTINGS=1 and the PARDISO message level parameter (msglvl) set to 1:
User settings:

   KMP_SETTINGS=1

Effective settings:

   KMP_ABORT_DELAY=0
   KMP_ABORT_IF_NO_IRML=false
   KMP_ADAPTIVE_LOCK_PROPS='1,1024'
   KMP_ALIGN_ALLOC=64
   KMP_ALL_THREADPRIVATE=128
   KMP_ALL_THREADS=32768
   KMP_ATOMIC_MODE=1
   KMP_BLOCKTIME=200
   KMP_CPUINFO_FILE: value is not defined
   KMP_DETERMINISTIC_REDUCTION=false
   KMP_DUPLICATE_LIB_OK=false
   KMP_FORCE_REDUCTION: value is not defined
   KMP_FOREIGN_THREADS_THREADPRIVATE=true
   KMP_FORKJOIN_BARRIER='2,2'
   KMP_FORKJOIN_BARRIER_PATTERN='hyper,hyper'
   KMP_FORKJOIN_FRAMES=true
   KMP_FORKJOIN_FRAMES_MODE=3
   KMP_GTID_MODE=2
   KMP_HANDLE_SIGNALS=false
   KMP_HOT_TEAMS_MAX_LEVEL=1
   KMP_HOT_TEAMS_MODE=0
   KMP_INIT_AT_FORK=true
   KMP_INIT_WAIT=2048
   KMP_ITT_PREPARE_DELAY=0
   KMP_LIBRARY=throughput
   KMP_LOCK_KIND=queuing
   KMP_MALLOC_POOL_INCR=1M
   KMP_MONITOR_STACKSIZE: value is not defined
   KMP_NEXT_WAIT=1024
   KMP_NUM_LOCKS_IN_BLOCK=1
   KMP_PLAIN_BARRIER='2,2'
   KMP_PLAIN_BARRIER_PATTERN='hyper,hyper'
   KMP_REDUCTION_BARRIER='1,1'
   KMP_REDUCTION_BARRIER_PATTERN='hyper,hyper'
   KMP_SCHEDULE='static,balanced;guided,iterative'
   KMP_SETTINGS=true
   KMP_STACKOFFSET=64
   KMP_STACKPAD=0
   KMP_STACKSIZE=4M
   KMP_STORAGE_MAP=false
   KMP_TASKING=2
   KMP_TASK_STEALING_CONSTRAINT=1
   KMP_USE_IRML=false
   KMP_VERSION=false
   KMP_WARNINGS=true
   OMP_CANCELLATION=false
   OMP_DISPLAY_ENV=false
   OMP_DYNAMIC=false
   OMP_MAX_ACTIVE_LEVELS=2147483647
   OMP_NESTED=false
   OMP_NUM_THREADS: value is not defined
   OMP_PLACES: value is not defined
   OMP_PROC_BIND='false'
   OMP_SCHEDULE='static'
   OMP_STACKSIZE=4M
   OMP_THREAD_LIMIT=32768
   OMP_WAIT_POLICY=PASSIVE
   KMP_AFFINITY='noverbose,warnings,respect,granularity=core,duplicates,none'


Mesh statistics:
     nodes     :  4419
     edges     :  13078
     triangles :  8660
Rotate mesh alpha = 0
Rotate mesh alpha = 8

=== PARDISO: solving a real nonsymmetric system ===
0-based array is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000803 s
Time spent in reordering of the initial matrix (reorder)         : 0.009233 s
Time spent in symbolic factorization (symbfct)                   : 0.002040 s
Time spent in data preparations for factorization (parlist)      : 0.000264 s
Time spent in allocation of internal data structures (malloc)    : 0.006886 s
Time spent in additional calculations                            : 0.002166 s
Total time spent                                                 : 0.021393 s

Statistics:
===========
Parallel Direct Factorization is running on 4 OpenMP

< Linear system Ax = b >
             number of equations:           4754
             number of non-zeros in A:      32740
             number of non-zeros in A (%): 0.144864

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
             number of supernodes:                    3260
             size of largest supernode:               110
             number of non-zeros in L:                101444
             number of non-zeros in U:                75684
             number of non-zeros in L+U:              177128

Reordering completed ...
Number of nonzeros in factors = 177128
Number of factorization MFLOPS = 6
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

=== PARDISO: solving a real nonsymmetric system ===
Two-level factorization algorithm is turned ON

 

 
 
 
 
Gennady_F_Intel
Moderator

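MKL uses only 4 threads because of Hyper-Threading: by default it starts one thread per physical core, and the 8 logical CPUs that VTune reports presumably correspond to 4 physical cores with Hyper-Threading enabled. For more details, see the "Using Hyper-Threading technology" chapter in the MKL User's Guide.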
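
If you still want to try one thread per logical CPU, one option is to set MKL's threading environment variables before the program starts. The following is only a minimal sketch in Python, not something taken from the original program: the helper name mkl_thread_env is hypothetical, and the child process is a stand-in for the real solver executable. It relies on the documented MKL_NUM_THREADS and MKL_DYNAMIC variables; whether PARDISO actually runs faster with 8 threads on 4 physical cores is not guaranteed and should be measured.

```python
import os
import subprocess
import sys


def mkl_thread_env(num_threads, base_env=None):
    """Return a copy of the environment that asks MKL for `num_threads` threads.

    Hypothetical helper for illustration; it only builds a dict and does not
    modify the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # MKL_NUM_THREADS takes precedence over OMP_NUM_THREADS for MKL routines.
    env["MKL_NUM_THREADS"] = str(num_threads)
    # With MKL_DYNAMIC left at its default (TRUE), MKL may still scale the
    # thread count back down to the number of physical cores.
    env["MKL_DYNAMIC"] = "FALSE"
    return env


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs (8 on the machine in the question).
    logical_cpus = os.cpu_count() or 1
    env = mkl_thread_env(logical_cpus)
    # Stand-in child process (assumption): replace this command with the
    # real solver executable and its arguments.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['MKL_NUM_THREADS'], os.environ['MKL_DYNAMIC'])",
    ]
    subprocess.run(child, env=env, check=True)
```

With these variables in effect, the "Parallel Direct Factorization is running on ..." line in the PARDISO output is the place to check which thread count was actually used. Because two hyper-threads share the execution units of one core, the default of one thread per physical core is often the faster choice for this kind of workload.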

 

Tianxiong_Lu
Beginner

Thank you for the information.
