Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Pardiso Multiple right hand sides.

Mistry__Mital
Beginner
763 Views
Dear All,

I am trying to solve AX = B for X, using Pardiso and X and B are matrices instead of vectors. So I am solving using Multiple right hand side for the smaller cases. But now I am solving the same problem with increased size of A and B.

The case I am running is having size of A (1008777 * 1008777) and of B ( 1008777 * 9000). That means I have 9000 right hand sides. I am facing all memory issues. Do any one has any suggestion how to deal with this problem. As I have successfully worked with a problem which has size of A (60000 *60000) and of B (60000 * 1800) by breaking right hand side into segments.

I can Also work with my current problem by taking one right hand side at a time. But it is very time consuming task and it took around 20 hrs for solving for one right hand side.

Note: my matrix A is symmetric.

Any idea for allocating or assigning memories to matrices?

Thank you,

Regards,
Mital


**** Also when I so one righe hand side and allocate the memory using

ALLOCATE(X1(1008777,500))...

My program do not proceeds and it gives following output.


**********************
PARDISO license check was successful ...
Reordering completed ...
Number of nonzeros in factors = 0
Number of factorization MFLOPS = 0
Factorization completed ...
********************************

0 Kudos
19 Replies
Konstantin_A_Intel
763 Views
Hello Mital,

Which version of PARDISO do you use? MKL or from University of Basel?

In general, I would suggest you thatallocating data for all your right-hand sides (RHS) is not possible on most of machines:1008777 * 9000 ~ 9E+9 elements > 50 Gb of memory. And I hope that you use 64-bit OS as 32-bit systems are completely inappropriate here with their 3-4Gb memory limitation.
So, you're definetely on the right way when try to split X and B matrices to smaller parts. I would suggest you do as follows:
Perform phase=11 (Reordering and symbolic fact.)
Perform phase=22 (Numeric factorization)
nb=50; // It would be a number of RHS vectors to solve at once. You may vary this: 30, 100 etc., but I suspect that 500 is may be too high.
allocate X and B of size (1008777, nb)
for(i=1; i<9000; i=i+nb) {
read RHS from i to (i+nb-1) to B
Perform solving for this nb vectors
write solution X (for vectors from i to (i+nb-1)) to disk
}
I hope this should work fine. An it's strange that you mentioned that solving each RHS takes ~20hrs as far as usually factorization may take so much time, but we need to perform factorization only once for all RHS.
0 Kudos
Mistry__Mital
Beginner
763 Views
Hi Konstantin,

Thanks for this explaination, I was working on some other problem and now I am back to this issue. I will try the way you have suggested.

I am using 64 bit machine as I am running cases on the Universityn cluster that has high computation power. I mean from a memory point of view. We have machines with 4GB per processor and 2 processors per node. We have more then 150 nodes available.

Can you suggest me how can I make best use of this available resources. Can I use more then one processors or multiple nodes for solving this problem. I can give you further information of the resources if you think its fruitful..

Thank you again....

Regards,

Mital
0 Kudos
Mistry__Mital
Beginner
763 Views
Dear Konstantin,

If I use loop inside the program for taking set for calculation... It gives me following message.

%*************************************************************

ifort: command line warning #10156: ignoring option '-i'; no argument required
DONE0
PARDISO license check was successful ...
DONE2
Reordering completed ...
Number of nonzeros in factors = 0
Number of factorization MFLOPS = 0
Factorization completed ...
forrtl: severe (71): integer divide by zero
Image PC Routine Line Source
a.out 000000000041638D Unknown Unknown Unknown
a.out 00000000004155FC Unknown Unknown Unknown
libc.so.6 00002AAAAB2C1994 Unknown Unknown Unknown
a.out 0000000000415509 Unknown Unknown Unknown

%***************************************************************

I am using MKL 10.2 version.

I get this error many times, I don't know the reason for this....

I have attached the Code file and the job script that I am using for running the code...

If I solve all 300 RHS at a time then I am able to solve it but when I divide RHS in the set of 3
I get this error. This the example set I am running before I jump to a big problem.

Mital
0 Kudos
Konstantin_A_Intel
763 Views
Hello Mital,
From the code.txt I can conclude that the interface of PARDISO you use differs from MKL PARDISO and looks similar to PARDISO from University of Basel (for example, DPARM array is not presented in MKL PARDISO).
Please, first of all, make sure you use MKL PARDISO interface correctly: look atexamples/solver/source/pardiso_sym_f.f for reference.
Also, please print value of IPARM(64) after reordering stage. In MKL PARDISO it should contain MKL version.
Thank you,
Konstantin
0 Kudos
Mistry__Mital
Beginner
763 Views
Thanks Konstantin,

I am using MKL PARDISO, As it is in the mkl folder where it is stored...

Here is the Path where I am using this solver.

********************************************
export OMP_NUM_THREADS=1
export MKLPATH=/cvos/shared/apps/intel/mkl/10.2.1.017/lib/em64t
export MKLINCLUDE=/cvos/shared/apps/intel/mkl/10.2.1.017/include
cd $PBS_O_WORKDIR
ifort -mcmodel=large -i-shared-heap-arrays[10] -m64 SPARSE_SOLVER.f90 -o b.out -L$MKLPATH -I$MKLINCLUDE $MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/l

**************************************************

This is the path I am using for calling mkl solver.

I will still check my code as per the example you have suggested.

But The code was running perfect when I was not using Loop for the multiple RHS.

Thanks again..
Regards,
Mital
0 Kudos
Gennady_F_Intel
Moderator
763 Views
Mital,
your linking line is not complete. Plese look here how to link your application by the properly way.
--Gennady
0 Kudos
Konstantin_A_Intel
763 Views
Mital,
I reproduced your error. It turned out that NRHS was reset to 0 before calling solve phase.I have modified your example as follows and it worked fine:
1) DPARM parameter was removed completely (no DPARM in MKL PARDISO documentation).
2) Call to PARDISOINIT was removed and changed with correct initialization(no PARDISOINIT in MKL PARDISO documentation):
do i = 1, 64
iparm(i) = 0
end do
iparm(1) = 1 ! no solver default
iparm(2) = 2 ! fill-in reordering from METIS
iparm(8) = 1 ! numbers of iterative refinement steps
iparm(10) = 13 ! perturbe the pivot elements with 1E-13
iparm(18) = -1 ! Output: number of nonzeros in the factor LU
iparm(19) = -1 ! Output: Mflops for LU factorization
iparm(60) = 0 ! OOC version will be run
error = 0 ! initialize error flag
msglvl = 0 ! print statistical information
mtype = -2 !
maxfct = 1
mnum = 1
do i = 1, 64
pt(i) = 0
end do
3) About link line: you used ilp64 and sequential libraries. Was it done intentionally? You would gain much of performance if use lp64+thread libraries. And from performance point of view, as well, i would suggest you to use NRHS=16 or 32 instead of 3.
4) Your PARDISO call with phase=33 doesn't seem correct as well: you should pass B(1, l) instead of B (that's B(1, 1) and contains the same data during all iterations).
Best regards,
Konstantin
P.S. As Gennady suggested, try to use correct link line, e.g.:
-L$MKL_PATH/lib/em64t -lmkl_intel_ilp64 -Wl,--start-group -lmkl_intel_thread -lmkl_core -Wl,--end-group-liomp5 -lpthread -lm
0 Kudos
Mistry__Mital
Beginner
763 Views
Thanks to you and Gennady,

I was not using link line intentionally, As I worked first time and it was working ok so I didn't looked into it for optimization. But the case is too big so I will go through as per your suggestions.

Mital
0 Kudos
Mistry__Mital
Beginner
763 Views
Hi ALL,,

My previous problem of inverting matrix is solved but now I am having trouble in the multliplication of the matrix C = A*B where A is sparse matrix and B is dense matrix.

Size of A is 8520*600000 with around 230000 non zeros. and the size of B is 600000*10

********************************************************
I am using following commands.

call mkl_dcoomm(transa,mm,nnn,kk,alpha,matdescra,val,rowind,colind,nnz,b,ldb,beta,c,ldc)

with following parameters,

mm = 8520 ! number of rows of kms
nnn = 10 ! number of NRHS
kk = 600000
alpha = 1.0
matdescra(1) = 'G'
!matdescra(2) = ''
!matdescra(3) = ''
matdescra(4) = 'F'
val = vkms
rowind = jkms
colind = ikms
nnz = 230000
b = X
ldb = 600000
beta = 0
ldc = 8520


And following linking I am using.



export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/64
export MKLINCLUDE=/opt/intel/Compiler/11.1/072/mkl/include
cd $PBS_O_WORKDIR
-L$MKLPATH -I$MKLINCLUDE $MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/libmkl_intel_ilp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a $MKLPATH/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -openmp -lpthread



where X comes from the inversion of the matrix...... So inversion is working perfectly.

On running the case I am having following error.

forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
. 400000000090ED90 Unknown Unknown Unknown
. 400000000002BDC0 Unknown Unknown Unknown
. 400000000000E460 Unknown Unknown Unknown
. 4000000000008CB0 Unknown Unknown Unknown
. 4000000000007DD0 Unknown Unknown Unknown
. 4000000000004ED0 Unknown Unknown Unknown
libc.so.6.1 2000000000587870 Unknown Unknown Unknown
. 4000000000004D00 Unknown Unknown Unknown


**********************************************


Thank you,,,


Mital
0 Kudos
Sergey_P_Intel2
Employee
763 Views

Hi Mital,

There are several sparse matrix formats which are supported by MKL Sparse BLAS. Particularly,mkl_dcoomm function multiplies sparse matrix Ain the COORDINATE format. Which sparse format is used for representation of matrix A in your program?
Probably your matrix is stored in compressed sparse row (CSR) format like for PARDISO? In this case you should use mkl_dcsrmm function in your application.

Regards,
Sergey

0 Kudos
Gennady_F_Intel
Moderator
763 Views
Hi Mital,
looking at the mkl path you are using:
export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/64
that's mean you want to work on IA64 Architecture ( aka Itanium, Itanium2).
In the case if you are working onIntel 64 Architecture
then change the mkl path as follow
export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/em64
and check this problem again.
--Gennady
0 Kudos
Mistry__Mital
Beginner
763 Views
Thanks Sergey & Gennady,

@ Sergey-My matrix is in COORDINATE format,,, Am I missing any thing in linking? Because I also checked with sequential linking... But it gives the same error...

@ Gennady- I looked into the folder you said but it has only 64 option there is no em64 option in it....


Mital
0 Kudos
Mistry__Mital
Beginner
763 Views
Thanks Sergey,

I am using COORDINATE FORMAT...
0 Kudos
Gennady_F_Intel
Moderator
763 Views
it was just misprint: see into/opt/intel/Compiler/11.1/072/mkl/lib/em64t.
--Gennady
0 Kudos
Mistry__Mital
Beginner
763 Views
MKL_DCOOMM_CODE.txt

Hi Gennady,

em64t is also not there,,, It has only 64 not even 32... Before I was using other cluster and it was having all of them but due to memory requirement I am using new cluster that supports PARDISO solver and is very fast too. But it has only 64 in it.

Even I tride with em64t on other cluster,, but it's giving the same error....

I have attached the code and linking....

It gives this error at very beginning... So i dn't think it's a memory issue.... Even I have no memory constraints on the machine...

Please help me with this....

Mital
0 Kudos
Mistry__Mital
Beginner
763 Views

Dear Gennady,

I did solve the problem, and it was I was using following command for linking,

******************
$MKLPATH/libmkl_solver_ilp64.a -Wl,--start-group $MKLPATH/libmkl_intel_ilp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a $MKLPATH/libmkl_lapack95_ilp64.a -Wl,--end-group -openmp -lpthread

*******************

And Now I am using following command,

******************
$MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

******************

So MKL_DCOOMM is working now,

But the problem is, PARDISO is not working with this command,, and I want to use both PARDISO and MKL_DCOOMM in the same program... So is there any solution....


Thanks,,,
Mital
0 Kudos
Gennady_F_Intel
Moderator
763 Views
Mital,
we couldn't open this MKL_DCOOMM_CODE.txt. Please try to attach this code follow the rules we described intothis article.
--Gennady
0 Kudos
Konstantin_A_Intel
763 Views
Hi Mital,
I would suggest you to care more about size of integer data used in your program.
If you use usual integer data (LP64 or 32-bit integers, or integer*4 in Fortran), you must use lp64 interface of MKL.
If you use long integer data (ILP64 or 64-bit integers, or integer*8 in Fortran), you must use ilp64 interface of MKL.
So, choice of linking interface library (say, libmkl_intel_lp64.a orlibmkl_intel_ilp64.a) is answered directly from integer parameters passed to MKL subroutines in your program.

Do you pass ILP64 (64-bit) integers to PARDISO and LP64 (32-bit) integers to converters? I make this conclusion as you linked PARDISO with ILP64 (and it worked) and linked converters with LP64 (and it worked). If it's true, I would suggest you to try pardiso_64 functionality (available in the latest MKL 10.2.6 release). With this new interface you may not use ILP64 libraries for PARDISO, but link with LP64. However, pardiso_64 is truly 64-bit interface and accepts all integer data as INTEGER*8. But all other MKL functionality (except PARDISO) will be LP64.
P.S. Recommended link lines (libmkl_solver library is deprecated: it's not needed in latest version and left only for backward compatibility).
LP64 threaded:
$MKLPATH/libmkl_intel_lp64.a-Wl,--start-group $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread
ILP64 threaded:
$MKLPATH/libmkl_intel_ilp64.a-Wl,--start-group $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread
LP64 seqential:
$MKLPATH/libmkl_intel_lp64.a-Wl,--start-group $MKLPATH/libmkl_seqential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread
ILP64 threaded:
$MKLPATH/libmkl_intel_ilp64.a-Wl,--start-group $MKLPATH/libmkl_seqential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread
Regards,
Konstantin
0 Kudos
Konstantin_A_Intel
763 Views
0 Kudos
Reply