The matrix has about 200,000 equations with about 8 million nonzeros. It's a symmetric indefinite matrix. Right before the first call to Pardiso I print
OMP_NUM_THREADS= 2 MKL_NUM_THREADS= 2
Here's all the printout:
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
================ PARDISO: solving a symmetric indef. system ================
The local (internal) PARDISO version is : 103000115
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Single-level factorization algorithm is turned ON
Scaling is turned ON

Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time spent in calculations of symmetric matrix portrait(fulladj): 0.233544 s
Time spent in reordering of the initial matrix(reorder) : 3.419899 s
Time spent in symbolic factorization(symbfct) : 0.900987 s
Time spent in allocation of internal data structures(malloc) : 0.139583 s
Time spent in additional calculations : 1.692345 s
Total time spent : 6.386359 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Hybrid Solver PARDISO with CGS/CG Iteration >

< Linear system Ax = b>
#equations: 219057
#non-zeros in A: 7798701
non-zeros in A (%): 0.016252
#right-hand sides: 0

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 29451
size of largest supernode: 3438
number of nonzeros in L 77824417
number of nonzeros in U 1
number of nonzeros in L+U 77824418
In case you mean that you have hyperthreading enabled: remember that MKL tries to maximize performance by using just 1 thread per pair of hyperthread logical processors, unless you override this by setting MKL_DYNAMIC. The term "physical processor" more likely refers to a complete core, which supports a pair of logical processors when hyperthreading is enabled.
I'll be very honest: your answer blew me away. I'm new to OMP, so I had never heard of either OMP_DYNAMIC or MKL_DYNAMIC before.
To the best of my knowledge, my machine has 2 processors, and I assume each has a single core. Each processor is an Intel Xeon, and Dell describes them as "C8508 Processor, 80546K, 3.0G, 2M, XNI 800, N0", where C8508 is the Dell part number (probably not too useful for you).
Given that, I tried mkl_set_dynamic(0) and mkl_set_dynamic(1). It made no difference. In both cases, during the matrix factorization, I only see one processor at work (task manager showing 50% utilization).
1) Should I see any difference between the two mkl_set_dynamic calls?
2) Is the fact that I see only 50% utilization of the CPU with the task manager a true indication that only one CPU is being used? I always believed this is the case, but maybe I don't have all the facts.
3) Is there a way to *force* MKL to use 2 processors, even if it believes it's better off with only 1? All I want to see is that everything is being done correctly. Once I know that's the case then I'll let MKL make its own smarter decisions.
Apparently, it's an "Irwindale" single-core HyperThread CPU. These were probably available in both dual- and single-CPU platforms. Typically, floating point performance of the dual-CPU platform was reduced by 15% when HyperThread was left enabled, even on Linux (worse on Windows, not so bad on single CPU). You can check your BIOS setup screen to see whether HyperThreading is enabled. If it is enabled and you see just 2 processors in Task Manager, there's only 1 CPU, and running 1 thread would show 50% in Task Manager, even though you get more performance than you would with 2 threads.
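To make the Task Manager arithmetic concrete, here is a minimal sketch. The function name is made up for illustration, and it assumes the overall figure is simply the average load across all logical processors, with each busy thread saturating one of them.

```python
def overall_utilization(busy_threads: int, logical_processors: int) -> float:
    """Approximate overall CPU percentage for compute-bound threads.

    Assumption: each busy thread fully occupies one logical processor and
    the overall figure is the average over all logical processors.
    """
    if logical_processors <= 0:
        raise ValueError("logical_processors must be positive")
    if busy_threads < 0:
        raise ValueError("busy_threads must not be negative")
    # Threads beyond the number of logical processors cannot add more load.
    saturated = min(busy_threads, logical_processors)
    return 100.0 * saturated / logical_processors


# One busy thread on a machine that shows 2 processors reads as 50%,
# whether those are two single-core CPUs or one hyperthreaded core.
print(overall_utilization(1, 2))
```

So the 50% reading by itself cannot distinguish "one of two real CPUs is idle" from "one hyperthreaded core is running a single thread".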
OpenMP dynamic is a different facility from MKL dynamic. I think I've confused you about MKL_DYNAMIC. See this earlier post specifically about how to get MKL to use all the HyperThreads by setting MKL_DYNAMIC=FALSE and specifying MKL_NUM_THREADS.
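As a rough sketch of the two settings mentioned above, the snippet below builds the MKL_DYNAMIC and MKL_NUM_THREADS environment values and puts them into the current process environment. The helper names are hypothetical. It assumes the variables are set before any MKL-backed library is loaded or any MKL-based program is launched from this process; whether PARDISO then reports 2 processors on this particular machine has not been checked here.

```python
import os


def mkl_thread_env(num_threads: int, dynamic: bool = False) -> dict:
    """Return the environment settings as strings (hypothetical helper)."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return {
        # FALSE asks MKL not to adjust the thread count on its own.
        "MKL_DYNAMIC": "TRUE" if dynamic else "FALSE",
        "MKL_NUM_THREADS": str(num_threads),
    }


def apply_mkl_thread_env(num_threads: int, dynamic: bool = False) -> dict:
    """Write the settings into os.environ and return them."""
    settings = mkl_thread_env(num_threads, dynamic)
    os.environ.update(settings)
    return settings


# Request 2 threads with dynamic adjustment turned off.
print(apply_mkl_thread_env(2))
```

The same two values can be set from a command prompt before starting the program, which avoids any question about when the library reads them.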
MKL is able to run in parallel across multiple processors. MKL sets the number of threads equal to the total number of physical cores available in your system. In your case, the number of physical cores reported on Windows is 1:
Number of processors = 2 // Number of logical processors is 2
Multi-core capable = NO // No multi-core, it means 1 physical core
Hyperthreading capable = YES // Hyperthreading, it means 2 logical processors per 1 physical core
To make sure, you can run 'systeminfo' command under 'cmd' and report the info about the processor. I just tried to run PARDISO on my dual-core laptop (physically dual-core) and it reported 2 threads.
As for your example (4 processors, each single-core): did you mean a 4-socket system? I'm sure that MKL will use 4 threads in this case, as 4 cores will be available.
System Manufacturer: Dell Inc.
System Model: Precision WorkStation 670
System Type: x64-based PC
Processor(s): 2 Processor(s) Installed.
    : EM64T Family 15 Model 4 Stepping 3 GenuineIntel ~2993 Mhz
    : EM64T Family 15 Model 4 Stepping 3 GenuineIntel ~2993 Mhz
I guess I'm still confused. Is there any combination of parameters/environment variables that I can set on this machine that will show #processors = 2? Or is this misleading and it is really using both processors?
OK, it seems systeminfo is not the best way to get precise information about the system. In fact, I checked that systeminfo reports logical processors, so it gives the same information for 2 single-core processors, 1 dual-core processor and, for instance, a single-core processor with hyperthreading ON. However, I checked MKL on a system with 2 single-core processors (rather old Nocona processors) and MKL PARDISO reported 2 threads.
Let's make another attempt to obtain precise info about your system: can you install the free CPU-Z tool available here?
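The ambiguity can be illustrated with a small sketch that enumerates the hardware layouts producing a given logical processor count. The `Topology` type and `topologies_with` function are invented for this example, and at most 2 hardware threads per core are assumed.

```python
from typing import List, NamedTuple


class Topology(NamedTuple):
    sockets: int
    cores_per_socket: int
    threads_per_core: int

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def logical_processors(self) -> int:
        return self.physical_cores * self.threads_per_core


def topologies_with(logical: int, max_threads_per_core: int = 2) -> List[Topology]:
    """List every layout whose logical processor count equals `logical`."""
    found = []
    for sockets in range(1, logical + 1):
        for cores in range(1, logical + 1):
            for threads in range(1, max_threads_per_core + 1):
                layout = Topology(sockets, cores, threads)
                if layout.logical_processors == logical:
                    found.append(layout)
    return found


# Every layout below shows up as "2 processors" in a logical-processor count.
for layout in topologies_with(2):
    print(layout, "physical cores:", layout.physical_cores)
```

For a logical count of 2 this lists one hyperthreaded single-core CPU (1 physical core), one dual-core CPU and two single-core CPUs (2 physical cores each). If MKL sizes its thread pool by physical cores, as described earlier in the thread, only the first layout would yield a single thread.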