A large Sparse Matrix and Sparse Matrix multipy?

duan__burness · ‎12-19-2017

C=A*B

A is a large sparse matrix 10*5000000000, and B is a Sparse matrix with 5000000000*10.

A is created with csr, and B is create with csc.

each matrix may only have a few elements (NNZ=10)

How I can do this multipy? Did mkl_sparse_spmm support the multipy between csr and csc?

mecej4 · ‎12-19-2017

I do not know of any routine to do exactly what you ask for.

The problem with multiplying sparse matrices is that, in general, their product is unlikely to be sparse, or at least not as sparse as the individual matrices and, what is more of a problem, the pattern of non-zero entries in the product matrix is neither known nor easy to compute. It is best to avoid forming the product of sparse matrices, if that is at all possible.

The CSR representation of A is the same as the CSC representation of A^T. This property can be used in two ways. If the matrix is being formed from a COO representation or otherwise, interchanging the row/column coordinates is all that needs to be done to obtain the CSR representation instead of the CSC representation. Secondly, many BLAS level 2 and 3 routines give you an option to transpose one of the input matrices before performing an operation such as ?AXPY.

Ying_H_Intel · ‎12-19-2017

Hi Burness,

I recalled, the computational routines required CSR or BSR model. But it is convenient to use COO format. for example,

first mkl_sparse_?_create_coo
Creates a handle for a matrix in COO format.

then mkl_sparse_convert_csr
then mkl_sparse_spmm, will do your sparse matrix multiply.
Computes the product of two sparse matrices and
stores the result as a sparse matrix.

MKL provide sample code under MKL install folder, for example /opt/intel/mkl/examples/examples_core_c.tgz . unzip them. in Spblas*
You can refer to them.

Best Regards,

Ying

duan__burness · ‎12-19-2017

Ying H. (Intel) wrote:

Hi Burness,

I recalled, the computational routines required CSR or BSR model. But it is convenient to use COO format. for example,

first mkl_sparse_?_create_coo
Creates a handle for a matrix in COO format.

then mkl_sparse_convert_csr
then mkl_sparse_spmm, will do your sparse matrix multiply.
Computes the product of two sparse matrices and
stores the result as a sparse matrix.

MKL provide sample code under MKL install folder, for example /opt/intel/mkl/examples/examples_core_c.tgz . unzip them. in Spblas*
You can refer to them.

Best Regards,

Ying

Thanks for your reply!

I'm sorry that I have not explain it clearly.

Suppose I have 2*5billion Matrix A and 5billion*2 Matrix B multipy:

first i will init these two matrix, for 2*5billion it is ok to new the A with csr, but for 5billion*2 if we new with csr, we must new a array to store 5billion+1 colIndex(OOM), but if we new with csc, it returns 'SPARSE_STATUS_NOT_SUPPORTED' when we use mkl_sparse_spmm.

Is there a method to solve this problem?

Ying_H_Intel · ‎12-19-2017

Hello,

Hmm, if csc format, then how about first call mkl_sparse_s_create_csc based on your B array. , then call

mkl_sparse_convert_csr (you don't need create any new array, just sparse matrix * )
then call mkl_sparse_spmm

best Regards,

Ying

duan__burness · ‎12-19-2017

Ying H. (Intel) wrote:

Hello,

Hmm, if csc format, then how about first call mkl_sparse_s_create_csc based on your B array. , then call

mkl_sparse_convert_csr (you don't need create any new array, just sparse matrix * )
then call mkl_sparse_spmm

best Regards,

Ying

I will try first , but isn't it cause OOM?

duan__burness · ‎12-20-2017

duan, burness wrote:

引文：

Ying H. (Intel) 写道：

Hello,

Hmm, if csc format, then how about first call mkl_sparse_s_create_csc based on your B array. , then call

mkl_sparse_convert_csr (you don't need create any new array, just sparse matrix * )
then call mkl_sparse_spmm

best Regards,

Ying

I will try first , but isn't it cause OOM?

hi Ying:

I have try the method you share, but it seems that when I run the code before mkl_sparse_convert_csr. It happen OOM

maybe when I convert to csr, it will new a 5billion+1 array? Could you help me with this problem?

Ying_H_Intel · ‎01-04-2018

Hello,

As i see, here the key problem is that large matrix, so cause Out of memory, right? .

But you had mentioned, 2*5billion Matrix A and 5billion*2 Matrix B multiply, each matrix may only have a few elements (NNZ=10), then you don't need to call mkl function, thus avoid the big array malloc , just do multiply manually, is it ok?

Second consideration about the large matrix, 2*5billion Matrix A and 5billion*2 Matrix B multiply, how many memory on your machine and how do you store and load/ operate the matrix in your application?

If suppose 4 byte, one index array will take at least 1*5 *10^9 * 4 byte =20G memory. if you have to operation such big matrix, what i can recommend for such matrix maybe consider big data solution, like cluster sparse blas or pblas in MKL and anther sister library data analysis accelerate library, there are some algorithms to support distributed CSR sparse matrix, like SVD, PCA etc.

***Intel® Data* Analytics Acceleration Library (Intel® *DAAL*** …

https://software.intel.com/en-us/intel-daal

Best Regards,

Ying

Manuel_S_1 · ‎01-05-2018

Hi,

I had a similar issue where the CSR became more hindering than memory saving, in my case I didn't have billions of dimensions but still annoying because I had to perform the operation repeatedly and all the memory allocation and initialisation for no reason just costs a lot of time.

Since the number of nonzero entries is quite small in comparison to the dimension you'll most likely want to implement it by hand. In that case you'll be most efficient doing it in the coordinate representation in which case each of your matrices would be fully described by 20 indices, 10 values and the dimensions, i.e. 32 numbers rather than 5*10^9+10+10. Since your nnz is actually small you might even achieve some performance by sorting the entries of A and B appropriately apriori.

The SPARSE_STATUS_NOT_SUPPORTED is returned because you tried to use anything but CSR format for the matrix-matrix multiplication - at least that is what I made of it when I last attempted to use the COO representation of two matrices to compute a sparse product.

From what I can tell, the mkl_sparse_spmm function is a pretty interface to the mkl_?csrmultcsr sblas routine, i.e. it can compute the product of CSR-CSR (and potentially in extension CSC-CSC), if you try anything more exotic you get the not supported status. I believe, the analogue is true for the relationship between mkl_sparse_?_add and mkl_?csradd. This is based on my experience and I don't really know what happens behind the mkl_sparse_spmm call, perhaps someone could confirm or correct my statement?

Best,
Manuel

Alexander_K_Intel2 · ‎01-16-2018

duan, burness wrote:

Quote:

duan, burness wrote:

引文：

Ying H. (Intel) 写道：

Hello,

Hmm, if csc format, then how about first call mkl_sparse_s_create_csc based on your B array. , then call

mkl_sparse_convert_csr (you don't need create any new array, just sparse matrix * )
then call mkl_sparse_spmm

best Regards,

Ying

I will try first , but isn't it cause OOM?

hi Ying:

I have try the method you share, but it seems that when I run the code before mkl_sparse_convert_csr. It happen OOM

maybe when I convert to csr, it will new a 5billion+1 array? Could you help me with this problem?

Hi,

First of all for your matrices it is better to use spmmd function - it's look like your result is dense.

However, right now this function supported only csr format and transposition of first matrix only. So the only way i see is to divide both matrices on chanks, convert second matrix from csc to csr, call spmmd, add result in accum and remove temp data, something like it (pseudocode):

divide matrix A and B on several chunks (with 1 millions of rows, for example)

for (i in chunk) {

mkl_sparse_s_create_csc(B_i);

mkl_sparse_convert_csr(B_i, Bcsr_i);

mkl_sparse_d_spmmd(SPARSE_OPERATION_NON_TRANSPOSE,A_i, Bcsr_i, C_i);

C +=C_i;

mkl_sparse_delete(Bcsr_i);

}

Thanks,

Alex

Alexander_K_Intel2 · ‎01-16-2018

Manuel S. wrote:

Hi,

I had a similar issue where the CSR became more hindering than memory saving, in my case I didn't have billions of dimensions but still annoying because I had to perform the operation repeatedly and all the memory allocation and initialisation for no reason just costs a lot of time.

Best,

Manuel

Hi Manuel,

Can i ask which function you call repeatedly to understand how we can reduce number of memory allocation in your case?

Thanks,

Alex