Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

How to effectively reduce the memory consumption of each compute node with cluster PARDISO

Chaowen_G_
Beginner

Hi:

I need to use cluster PARDISO to solve a big double-precision complex symmetric matrix. I set iparm(40)=0, which means the matrix is provided in the usual centralized input format: the master MPI process (rank = 0) stores all of the data for matrix A.
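
For reference, a simplified sketch of how such a call sequence looks with this input format (the helper name, the CSR arrays a/ia/ja, and the iparm choices other than iparm(40) are placeholders, not my exact code):

#include <mpi.h>
#include <mkl_cluster_sparse_solver.h>

// Sketch: one analysis/factorize/solve pass with centralized input.
// With iparm(40)=0, a/ia/ja and b only need to hold valid data on rank 0.
void solve_centralized(MKL_INT n, MKL_Complex16 *a, MKL_INT *ia, MKL_INT *ja,
                       MKL_Complex16 *b, MKL_Complex16 *x, MKL_INT nrhs)
{
    void *pt[64] = {};          // internal solver handle, must start zeroed
    MKL_INT iparm[64] = {};
    MKL_INT maxfct = 1, mnum = 1, msglvl = 1, error = 0;
    MKL_INT mtype = 6;          // complex and symmetric
    MKL_INT perm = 0;           // dummy permutation (unused by default)

    iparm[0] = 1;               // use non-default iparm values below
    iparm[1] = 2;               // METIS nested-dissection reordering
    iparm[39] = 0;              // iparm(40)=0: centralized input on rank 0

    int comm = (int)MPI_Comm_c2f(MPI_COMM_WORLD);  // Fortran communicator handle

    const MKL_INT phases[] = {11, 22, 33};  // analysis, factorization, solve
    for (MKL_INT phase : phases) {
        cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                              a, ia, ja, &perm, &nrhs, iparm, &msglvl,
                              b, x, &comm, &error);
        if (error != 0) return;
    }

    MKL_INT phase = -1;                     // release internal memory
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                          a, ia, ja, &perm, &nrhs, iparm, &msglvl,
                          b, x, &comm, &error);
}

All MPI processes still make the same call; with this input format the matrix and right-hand-side contents are taken from rank 0.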

First test:

I use two compute nodes (each with 48 GB of RAM). In phase 33 (the solve and iterative refinement step), the master node (rank = 0) uses 95% of its 48 GB of RAM, and the slave node (rank != 0) uses 73% of its 48 GB.

Second test:

I use four compute nodes for the same matrix. In phase 33, the master node uses 93% of its 48 GB, and each of the remaining three nodes uses 70% of its 48 GB.

So I have already doubled the total RAM, but the consumption on each individual node barely drops. Because my real matrix is much bigger than this test matrix, the solver always reports running out of physical memory. How can I reduce the memory consumption on each compute node?
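
For scale: with 40782 equations and 18653 right-hand sides in double complex, the dense B and X blocks are each about 40782 * 18653 * 16 bytes ≈ 12 GB, which may explain why phase 33 is the peak. One workaround would be to reuse the factorization and run phase 33 repeatedly over smaller blocks of right-hand sides. A minimal sketch (the block size and helper are hypothetical, and it assumes the solver's per-solve working buffers scale with nrhs):

#include <cstddef>
#include <mkl_cluster_sparse_solver.h>

// Sketch: repeated phase-33 solves over blocks of right-hand sides, reusing
// the factorization held in pt (from phases 11 and 22 above). iparm must be
// the same array used for the earlier phases.
void solve_rhs_in_blocks(void **pt, MKL_INT n, MKL_Complex16 *a, MKL_INT *ia,
                         MKL_INT *ja, MKL_INT *iparm, int comm,
                         MKL_Complex16 *B, MKL_Complex16 *X, MKL_INT nrhs_total)
{
    MKL_INT maxfct = 1, mnum = 1, mtype = 6, msglvl = 0, error = 0;
    MKL_INT perm = 0, phase = 33;
    const MKL_INT block = 512;  // tuning knob: right-hand sides per solve

    for (MKL_INT first = 0; first < nrhs_total; first += block) {
        MKL_INT nrhs = (nrhs_total - first < block) ? nrhs_total - first : block;
        // B and X are column-major (n x nrhs_total), so a block of columns
        // is one contiguous slice of length n*nrhs.
        std::size_t off = (std::size_t)first * (std::size_t)n;
        cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                              a, ia, ja, &perm, &nrhs, iparm, &msglvl,
                              B + off, X + off, &comm, &error);
        if (error != 0) return;
    }
}

As far as I understand from the documentation, the other lever is the distributed assembled input format (iparm(40)=1, with each rank supplying the row range given by iparm(41) and iparm(42)), so that no single rank has to hold all of A.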

Details of the test matrix:

< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           40782
             number of non-zeros in A:      7783421
             number of non-zeros in A (%): 0.467987

             number of right-hand sides:    18653

< Factors L and U >
             number of columns for each panel:       64
             number of independent subgraphs:        0
             number of supernodes:                   1152
             size of largest supernode:              19339
             number of non-zeros in L:               456143822
             number of non-zeros in U:               1
             number of non-zeros in L+U:             456143823
             gflop for the numerical factorization:  25510.260441

I use MKL 11.2.3, MVAPICH2 2.0b, GCC 5.1, an Intel(R) Xeon(R) CPU X5650 @ 2.67 GHz, InfiniBand (Mellanox Technologies MT25204), and linux86_64.

Chaowen_G_
Beginner

Moreover, I initialize MPI with:

#include <mpi.h>

int provided;
// MPI_THREAD_FUNNELED: the process may be multithreaded, but only the
// main thread makes MPI calls.
MPI_Init_thread(nullptr, nullptr, MPI_THREAD_FUNNELED, &provided);
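
One could also verify the granted threading level right after the call; a minimal sketch:

if (provided < MPI_THREAD_FUNNELED) {
    // The MPI library granted less than requested; bail out early rather
    // than run threaded MKL on an unsupported threading level.
    MPI_Abort(MPI_COMM_WORLD, 1);
}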

 
