Pardiso Multiple right hand sides.

Mistry__Mital · ‎06-28-2010

Dear All,

I am trying to solve AX = B for X, using Pardiso and X and B are matrices instead of vectors. So I am solving using Multiple right hand side for the smaller cases. But now I am solving the same problem with increased size of A and B.

The case I am running is having size of A (1008777 * 1008777) and of B ( 1008777 * 9000). That means I have 9000 right hand sides. I am facing all memory issues. Do any one has any suggestion how to deal with this problem. As I have successfully worked with a problem which has size of A (60000 *60000) and of B (60000 * 1800) by breaking right hand side into segments.

I can Also work with my current problem by taking one right hand side at a time. But it is very time consuming task and it took around 20 hrs for solving for one right hand side.

Note: my matrix A is symmetric.

Any idea for allocating or assigning memories to matrices?

Thank you,

Regards,
Mital

**** Also when I so one righe hand side and allocate the memory using

ALLOCATE(X1(1008777,500))...

My program do not proceeds and it gives following output.

**********************
PARDISO license check was successful ...
Reordering completed ...
Number of nonzeros in factors = 0
Number of factorization MFLOPS = 0
Factorization completed ...
********************************

Konstantin_A_Intel · ‎06-28-2010

Hello Mital,

Which version of PARDISO do you use? MKL or from University of Basel?

In general, I would suggest you thatallocating data for all your right-hand sides (RHS) is not possible on most of machines:1008777 * 9000 ~ 9E+9 elements > 50 Gb of memory. And I hope that you use 64-bit OS as 32-bit systems are completely inappropriate here with their 3-4Gb memory limitation.

So, you're definetely on the right way when try to split X and B matrices to smaller parts. I would suggest you do as follows:

Perform phase=11 (Reordering and symbolic fact.)

Perform phase=22 (Numeric factorization)

nb=50; // It would be a number of RHS vectors to solve at once. You may vary this: 30, 100 etc., but I suspect that 500 is may be too high.

allocate X and B of size (1008777, nb)

for(i=1; i<9000; i=i+nb) {

read RHS from i to (i+nb-1) to B

Perform solving for this nb vectors

write solution X (for vectors from i to (i+nb-1)) to disk

}

I hope this should work fine. An it's strange that you mentioned that solving each RHS takes ~20hrs as far as usually factorization may take so much time, but we need to perform factorization only once for all RHS.

Mistry__Mital · ‎08-22-2010

Hi Konstantin,

Thanks for this explaination, I was working on some other problem and now I am back to this issue. I will try the way you have suggested.

I am using 64 bit machine as I am running cases on the Universityn cluster that has high computation power. I mean from a memory point of view. We have machines with 4GB per processor and 2 processors per node. We have more then 150 nodes available.

Can you suggest me how can I make best use of this available resources. Can I use more then one processors or multiple nodes for solving this problem. I can give you further information of the resources if you think its fruitful..

Thank you again....

Regards,

Mital

Mistry__Mital · ‎08-22-2010

Dear Konstantin,

If I use loop inside the program for taking set for calculation... It gives me following message.

%*************************************************************

ifort: command line warning #10156: ignoring option '-i'; no argument required
DONE0
PARDISO license check was successful ...
DONE2
Reordering completed ...
Number of nonzeros in factors = 0
Number of factorization MFLOPS = 0
Factorization completed ...
forrtl: severe (71): integer divide by zero
Image PC Routine Line Source
a.out 000000000041638D Unknown Unknown Unknown
a.out 00000000004155FC Unknown Unknown Unknown
libc.so.6 00002AAAAB2C1994 Unknown Unknown Unknown
a.out 0000000000415509 Unknown Unknown Unknown

%***************************************************************

I am using MKL 10.2 version.

I get this error many times, I don't know the reason for this....

I have attached the Code file and the job script that I am using for running the code...

If I solve all 300 RHS at a time then I am able to solve it but when I divide RHS in the set of 3
I get this error. This the example set I am running before I jump to a big problem.

Mital

Konstantin_A_Intel · ‎08-22-2010

Hello Mital,

From the code.txt I can conclude that the interface of PARDISO you use differs from MKL PARDISO and looks similar to PARDISO from University of Basel (for example, DPARM array is not presented in MKL PARDISO).

Please, first of all, make sure you use MKL PARDISO interface correctly: look atexamples/solver/source/pardiso_sym_f.f for reference.

Also, please print value of IPARM(64) after reordering stage. In MKL PARDISO it should contain MKL version.

Thank you,

Konstantin

Mistry__Mital · ‎08-22-2010

Thanks Konstantin,

I am using MKL PARDISO, As it is in the mkl folder where it is stored...

Here is the Path where I am using this solver.

********************************************
export OMP_NUM_THREADS=1
export MKLPATH=/cvos/shared/apps/intel/mkl/10.2.1.017/lib/em64t
export MKLINCLUDE=/cvos/shared/apps/intel/mkl/10.2.1.017/include
cd $PBS_O_WORKDIR
ifort -mcmodel=large -i-shared-heap-arrays[10] -m64 SPARSE_SOLVER.f90 -o b.out -L$MKLPATH -I$MKLINCLUDE $MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/l

**************************************************

This is the path I am using for calling mkl solver.

I will still check my code as per the example you have suggested.

But The code was running perfect when I was not using Loop for the multiple RHS.

Thanks again..
Regards,
Mital

Gennady_F_Intel · ‎08-22-2010

Mital,

your linking line is not complete. Plese look here how to link your application by the properly way.

--Gennady

Konstantin_A_Intel · ‎08-23-2010

Mital,

I reproduced your error. It turned out that NRHS was reset to 0 before calling solve phase.I have modified your example as follows and it worked fine:

1) DPARM parameter was removed completely (no DPARM in MKL PARDISO documentation).

2) Call to PARDISOINIT was removed and changed with correct initialization(no PARDISOINIT in MKL PARDISO documentation):

do i = 1, 64

iparm(i) = 0

end do

iparm(1) = 1 ! no solver default

iparm(2) = 2 ! fill-in reordering from METIS

iparm(8) = 1 ! numbers of iterative refinement steps

iparm(10) = 13 ! perturbe the pivot elements with 1E-13

iparm(18) = -1 ! Output: number of nonzeros in the factor LU

iparm(19) = -1 ! Output: Mflops for LU factorization

iparm(60) = 0 ! OOC version will be run

error = 0 ! initialize error flag

msglvl = 0 ! print statistical information

mtype = -2 !

maxfct = 1

mnum = 1

do i = 1, 64

pt(i) = 0

end do

3) About link line: you used ilp64 and sequential libraries. Was it done intentionally? You would gain much of performance if use lp64+thread libraries. And from performance point of view, as well, i would suggest you to use NRHS=16 or 32 instead of 3.

4) Your PARDISO call with phase=33 doesn't seem correct as well: you should pass B(1, l) instead of B (that's B(1, 1) and contains the same data during all iterations).

Best regards,

Konstantin

P.S. As Gennady suggested, try to use correct link line, e.g.:

-L$MKL_PATH/lib/em64t -lmkl_intel_ilp64 -Wl,--start-group -lmkl_intel_thread -lmkl_core -Wl,--end-group-liomp5 -lpthread -lm

Mistry__Mital · ‎08-23-2010

Thanks to you and Gennady,

I was not using link line intentionally, As I worked first time and it was working ok so I didn't looked into it for optimization. But the case is too big so I will go through as per your suggestions.

Mital

Mistry__Mital · ‎09-02-2010

Hi ALL,,

My previous problem of inverting matrix is solved but now I am having trouble in the multliplication of the matrix C = A*B where A is sparse matrix and B is dense matrix.

Size of A is 8520*600000 with around 230000 non zeros. and the size of B is 600000*10

********************************************************
I am using following commands.

call mkl_dcoomm(transa,mm,nnn,kk,alpha,matdescra,val,rowind,colind,nnz,b,ldb,beta,c,ldc)

with following parameters,

mm = 8520 ! number of rows of kms
nnn = 10 ! number of NRHS
kk = 600000
alpha = 1.0
matdescra(1) = 'G'
!matdescra(2) = ''
!matdescra(3) = ''
matdescra(4) = 'F'
val = vkms
rowind = jkms
colind = ikms
nnz = 230000
b = X
ldb = 600000
beta = 0
ldc = 8520

And following linking I am using.

export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/64
export MKLINCLUDE=/opt/intel/Compiler/11.1/072/mkl/include
cd $PBS_O_WORKDIR
-L$MKLPATH -I$MKLINCLUDE $MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/libmkl_intel_ilp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a $MKLPATH/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -openmp -lpthread

where X comes from the inversion of the matrix...... So inversion is working perfectly.

On running the case I am having following error.

forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
. 400000000090ED90 Unknown Unknown Unknown
. 400000000002BDC0 Unknown Unknown Unknown
. 400000000000E460 Unknown Unknown Unknown
. 4000000000008CB0 Unknown Unknown Unknown
. 4000000000007DD0 Unknown Unknown Unknown
. 4000000000004ED0 Unknown Unknown Unknown
libc.so.6.1 2000000000587870 Unknown Unknown Unknown
. 4000000000004D00 Unknown Unknown Unknown

**********************************************

Thank you,,,

Mital

Sergey_P_Intel2 · ‎09-03-2010

Hi Mital,

There are several sparse matrix formats which are supported by MKL Sparse BLAS. Particularly,mkl_dcoomm function multiplies sparse matrix Ain the COORDINATE format. Which sparse format is used for representation of matrix A in your program?
Probably your matrix is stored in compressed sparse row (CSR) format like for PARDISO? In this case you should use mkl_dcsrmm function in your application.

Regards,
Sergey

Gennady_F_Intel · ‎09-03-2010

Hi Mital,

looking at the mkl path you are using:

export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/64

that's mean you want to work on IA64 Architecture ( aka Itanium, Itanium2).

In the case if you are working onIntel 64 Architecture

then change the mkl path as follow

export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/em64

and check this problem again.

--Gennady

Mistry__Mital · ‎09-03-2010

Thanks Sergey & Gennady,

@ Sergey-My matrix is in COORDINATE format,,, Am I missing any thing in linking? Because I also checked with sequential linking... But it gives the same error...

@ Gennady- I looked into the folder you said but it has only 64 option there is no em64 option in it....

Mital

Mistry__Mital · ‎09-03-2010

Thanks Sergey,

I am using COORDINATE FORMAT...

Gennady_F_Intel · ‎09-03-2010

it was just misprint: see into/opt/intel/Compiler/11.1/072/mkl/lib/em64t.

--Gennady

Mistry__Mital · ‎09-03-2010

MKL_DCOOMM_CODE.txt

Hi Gennady,

em64t is also not there,,, It has only 64 not even 32... Before I was using other cluster and it was having all of them but due to memory requirement I am using new cluster that supports PARDISO solver and is very fast too. But it has only 64 in it.

Even I tride with em64t on other cluster,, but it's giving the same error....

I have attached the code and linking....

It gives this error at very beginning... So i dn't think it's a memory issue.... Even I have no memory constraints on the machine...

Please help me with this....

Mital

Mistry__Mital · ‎09-04-2010

Dear Gennady,

I did solve the problem, and it was I was using following command for linking,

******************
$MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/libmkl_intel_ilp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a $MKLPATH/libmkl_lapack95_ilp64.a -Wl,--end-group -openmp -lpthread

*******************

And Now I am using following command,

******************
$MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

******************

So MKL_DCOOMM is working now,

But the problem is, PARDISO is not working with this command,, and I want to use both PARDISO and MKL_DCOOMM in the same program... So is there any solution....

Thanks,,,
Mital

Gennady_F_Intel · ‎09-05-2010

Mital,

we couldn't open this MKL_DCOOMM_CODE.txt. Please try to attach this code follow the rules we described intothis article.

--Gennady

Konstantin_A_Intel · ‎09-05-2010

Hi Mital,

I would suggest you to care more about size of integer data used in your program.

If you use usual integer data (LP64 or 32-bit integers, or integer*4 in Fortran), you must use lp64 interface of MKL.

If you use long integer data (ILP64 or 64-bit integers, or integer*8 in Fortran), you must use ilp64 interface of MKL.

So, choice of linking interface library (say, libmkl_intel_lp64.a orlibmkl_intel_ilp64.a) is answered directly from integer parameters passed to MKL subroutines in your program.

Do you pass ILP64 (64-bit) integers to PARDISO and LP64 (32-bit) integers to converters? I make this conclusion as you linked PARDISO with ILP64 (and it worked) and linked converters with LP64 (and it worked). If it's true, I would suggest you to try pardiso_64 functionality (available in the latest MKL 10.2.6 release). With this new interface you may not use ILP64 libraries for PARDISO, but link with LP64. However, pardiso_64 is truly 64-bit interface and accepts all integer data as INTEGER*8. But all other MKL functionality (except PARDISO) will be LP64.

P.S. Recommended link lines (libmkl_solver library is deprecated: it's not needed in latest version and left only for backward compatibility).

LP64 threaded:

$MKLPATH/libmkl_intel_lp64.a-Wl,--start-group $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

ILP64 threaded:

$MKLPATH/libmkl_intel_ilp64.a-Wl,--start-group $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

LP64 seqential:

$MKLPATH/libmkl_intel_lp64.a-Wl,--start-group $MKLPATH/libmkl_seqential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

ILP64 threaded:

$MKLPATH/libmkl_intel_ilp64.a-Wl,--start-group $MKLPATH/libmkl_seqential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

Regards,

Konstantin

Konstantin_A_Intel · ‎09-05-2010

Link to KB (knowledge base) article about pardiso_64:

http://software.intel.com/en-us/articles/a-new-ilp64-64-bit-integer-version-of-the-pardiso-solver-is-now-available/

Regards,

Konsatntin