Serious memory leak problem of mkl_sparse_d_add subroutine

yang__xiaodong

Beginner

10-12-2019
07:34 PM

Hi,

I'm currently programming with the new sparse interface and am experiencing a serious memory leak when the routine mkl_sparse_d_add is called several thousand times: it consumes all 64 GB of my memory and the program cannot continue. I'm not sure whether other sparse routines have similar problems, but at least mkl_sparse_d_create_coo, mkl_sparse_convert_csr, and mkl_sparse_d_mv do not.

Please take a look at it. Thank you very much!

Accepted Solutions

Kirill_V_Intel

Employee

10-13-2019
09:07 PM

Hello,

The thing is, the routine mkl_sparse_?_add always allocates the output matrix. So, when you call it in a loop with the same matrix handle A as both an input and the output argument, each call allocates new memory, and the newly allocated matrix replaces the original one in the handle without the original being freed. Since, with probability one, adding two different sparse matrices gives an extended stencil (a sparsity pattern larger than that of either input), it is reasonable to allocate the output arrays. The same reasoning was used for other routines which create the output matrix (say, mkl_sparse_?_spmm and others).

We'll consider making the documentation more precise about it, especially for Fortran users.

So, what you can do in your case is create a temporary matrix as a buffer, destroy the input matrix, and rename the temporary matrix to the original name if you want to mimic an in-place addition.
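The buffer-and-rename pattern described above could be sketched roughly as follows. This is an illustrative sketch, not tested code from the thread: the program name, the 3x3 diagonal test data, and the variable names are made up, and it assumes that assigning one SPARSE_MATRIX_T variable to another copies only the handle, not the matrix data.

```fortran
include "mkl_spblas.f90"

program add_in_place_sketch
  use mkl_spblas
  implicit none

  integer, parameter :: n = 3
  ! Hypothetical 3x3 diagonal matrix in one-based, 4-array CSR form.
  integer :: rows_start(n) = [1, 2, 3]
  integer :: rows_end(n)   = [2, 3, 4]
  integer :: col_indx(n)   = [1, 2, 3]
  real(8) :: values(n)     = [1.d0, 2.d0, 3.d0]
  type(SPARSE_MATRIX_T) :: a_mkl, b_mkl, tmp_mkl
  integer :: info, i

  info = mkl_sparse_d_create_csr(a_mkl, SPARSE_INDEX_BASE_ONE, n, n, &
                                 rows_start, rows_end, col_indx, values)
  if (info /= SPARSE_STATUS_SUCCESS) stop 'create_csr failed'

  ! b := 1.0 * a + a  (b_mkl is allocated by MKL)
  info = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a_mkl, 1.d0, a_mkl, b_mkl)
  if (info /= SPARSE_STATUS_SUCCESS) stop 'initial add failed'

  do i = 1, 1000
    ! tmp := 1.0 * a + b  (a fresh output matrix is allocated on every call)
    info = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a_mkl, 1.d0, b_mkl, tmp_mkl)
    if (info /= SPARSE_STATUS_SUCCESS) stop 'add failed'

    ! Release the old b before its handle is overwritten ...
    info = mkl_sparse_destroy(b_mkl)
    ! ... then "rename" the temporary (assumed to copy the handle only).
    b_mkl = tmp_mkl
  end do

  info = mkl_sparse_destroy(b_mkl)
  info = mkl_sparse_destroy(a_mkl)
end program add_in_place_sketch
```

The key point is that every matrix produced by mkl_sparse_d_add is matched by exactly one mkl_sparse_destroy call, so memory usage should stay flat across iterations.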

As for the COO format, the returned info there is actually SPARSE_STATUS_NOT_SUPPORTED, so nothing is computed in the loop where you call mkl_sparse_?_add with the coordinate format.
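Checking the returned status makes this kind of silent no-op visible. A minimal sketch, with a hypothetical helper named report and made-up 3x3 test data; per the explanation above, the add call on COO inputs would be expected to report that the operation is not supported:

```fortran
include "mkl_spblas.f90"

program check_status_sketch
  use mkl_spblas
  implicit none

  integer :: info
  ! Hypothetical 3x3 diagonal matrix in one-based COO form.
  integer :: d_row(3) = [1, 2, 3]
  integer :: d_col(3) = [1, 2, 3]
  real(8) :: d_val(3) = [1.d0, 2.d0, 3.d0]
  type(SPARSE_MATRIX_T) :: a_coo, c_mkl

  info = mkl_sparse_d_create_coo(a_coo, SPARSE_INDEX_BASE_ONE, 3, 3, 3, d_row, d_col, d_val)
  call report('create_coo', info)

  ! Attempt the addition directly on the COO handle and inspect the status.
  info = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a_coo, 1.d0, a_coo, c_mkl)
  call report('add with COO inputs', info)

  info = mkl_sparse_destroy(a_coo)

contains

  ! Hypothetical helper: print a readable message for a few status codes.
  subroutine report(label, status)
    character(*), intent(in) :: label
    integer, intent(in) :: status
    if (status == SPARSE_STATUS_SUCCESS) then
      print *, label, ': success'
    else if (status == SPARSE_STATUS_NOT_SUPPORTED) then
      print *, label, ': operation not supported'
    else
      print *, label, ': failed with status ', status
    end if
  end subroutine report

end program check_status_sketch
```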

Hope this helps!

Best,

Kirill

8 Replies

yang__xiaodong

Beginner

10-12-2019
07:34 PM

By the way, I'm a Windows Fortran MKL user.

yang__xiaodong

Beginner

10-12-2019
08:05 PM

More precisely, it works well with the COO internal format, but mkl_sparse_d_add fails with the CSR internal format. You can check the following sample code and monitor the memory usage while it runs:

include "mkl_spblas.f90"

program matrix
  use mkl_spblas
  implicit none

  integer :: info, i
  integer :: d_row(10) = [1,2,3,4,5,6,7,8,9,10]
  integer :: d_COL(10) = [1,2,3,4,5,6,7,8,9,10]
  real(8) :: d_VAL(10) = [0,0,0,0,0,0,0,0,0,0]
  type(SPARSE_MATRIX_T) :: a_mkl, b_mkl

  info = mkl_sparse_d_create_coo (a_mkl, SPARSE_INDEX_BASE_one, 10, 10, 1, d_ROW, d_COL, d_VAL)

  ! it works well with COO internal format
  do i = 1, 1000000
    info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, a_mkl, 1.d0, a_mkl, a_mkl)
  end do

  ! but mkl_sparse_d_add fails with CSR internal format
  info = mkl_sparse_convert_csr (a_mkl, SPARSE_OPERATION_NON_TRANSPOSE, b_mkl)

  do i = 1, 1000000
    info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, b_mkl, 1.d0, b_mkl, b_mkl)
  end do
end program

Gennady_F_Intel

Moderator

10-13-2019
06:38 PM

How did you detect the memory leaks?

Which version of MKL do you use?

64-bit or 32-bit?

Did you link with the threaded mode of MKL?

Is that OpenMP or TBB?

yang__xiaodong

Beginner

10-13-2019
08:47 PM

I'm using the latest 2019.5 64-bit version together with the latest Visual Studio update. It's the OpenMP version with the /Qmkl:parallel option turned on. You can easily see the memory usage blow up when the code above runs.

The code compares the COO and CSR internal formats: CSR has the problem, COO doesn't.

yang__xiaodong

Beginner

10-14-2019
03:12 PM

Thank you for your detailed reply, Voronin.

Another suggestion: could MKL extend the routine mkl_sparse_?_add so that it can compute expressions like

C := alpha*op(A) + beta*op(B)

Because this is currently hard to do, I have to create a zero matrix and add the terms separately with the following code:

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, A_mkl, -1.d0, A_mkl, Z_mkl)

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, A_mkl, av, Z_mkl, AT_mkl)

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, B_mkl, bv, Z_mkl, BT_mkl)

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, AT_mkl, 1.d0, BT_mkl, C_mkl)

This is where I encountered the problem mentioned above.

Kirill_V_Intel

Employee

10-14-2019
07:21 PM

Hello again,

Thanks for your feedback, we'll take it into consideration.

As a side note, you could get away with simpler code (two calls to add instead of three), if I am not mistaken:

! Z_mkl is a zero matrix; Temp_mkl is an intermediate result which should be destroyed after Res_mkl is computed

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, A_mkl, alpha, Z_mkl, Temp_mkl) ! Temp = alpha * A + Z = alpha * A

info = mkl_sparse_d_add (SPARSE_OPERATION_NON_TRANSPOSE, B_mkl, beta, Temp_mkl, Res_mkl) ! Res = beta * B + Temp = beta * B + alpha * A

The zero matrix Z can be created more efficiently by hand (through mkl_sparse_d_create_csr), by manually filling in the CSR format arrays.
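Putting the two calls and the hand-built zero matrix together might look like the sketch below. The test data, the program name, and the check helper are hypothetical. The zero matrix here stores explicit zeros on the diagonal, which is a conservative choice made for this sketch; a CSR matrix with no stored entries may also work, but that is not confirmed in this thread.

```fortran
include "mkl_spblas.f90"

program axpby_sketch
  use mkl_spblas
  implicit none

  integer, parameter :: n = 3
  real(8), parameter :: alpha = 2.d0, beta = -1.d0
  ! One-based, 4-array CSR index arrays for a diagonal pattern (made-up data).
  ! MKL is assumed to use these arrays without copying, so they must outlive the handles.
  integer :: rows_start(n) = [1, 2, 3]
  integer :: rows_end(n)   = [2, 3, 4]
  integer :: col_indx(n)   = [1, 2, 3]
  real(8) :: a_val(n) = [1.d0, 2.d0, 3.d0]
  real(8) :: b_val(n) = [4.d0, 5.d0, 6.d0]
  real(8) :: z_val(n) = [0.d0, 0.d0, 0.d0]   ! explicit zeros: Z is a zero matrix
  type(SPARSE_MATRIX_T) :: a_mkl, b_mkl, z_mkl, temp_mkl, res_mkl
  integer :: info

  info = mkl_sparse_d_create_csr(a_mkl, SPARSE_INDEX_BASE_ONE, n, n, &
                                 rows_start, rows_end, col_indx, a_val)
  call check(info, 'create A')
  info = mkl_sparse_d_create_csr(b_mkl, SPARSE_INDEX_BASE_ONE, n, n, &
                                 rows_start, rows_end, col_indx, b_val)
  call check(info, 'create B')
  info = mkl_sparse_d_create_csr(z_mkl, SPARSE_INDEX_BASE_ONE, n, n, &
                                 rows_start, rows_end, col_indx, z_val)
  call check(info, 'create Z')

  ! Temp := alpha * A + Z = alpha * A
  info = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a_mkl, alpha, z_mkl, temp_mkl)
  call check(info, 'Temp = alpha*A + Z')

  ! Res := beta * B + Temp = alpha * A + beta * B
  info = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, b_mkl, beta, temp_mkl, res_mkl)
  call check(info, 'Res = beta*B + Temp')

  ! The intermediate result is no longer needed; free it to avoid a leak.
  info = mkl_sparse_destroy(temp_mkl)

  ! ... use res_mkl here ...

  info = mkl_sparse_destroy(res_mkl)
  info = mkl_sparse_destroy(z_mkl)
  info = mkl_sparse_destroy(b_mkl)
  info = mkl_sparse_destroy(a_mkl)

contains

  ! Hypothetical helper: stop with a message on any non-success status.
  subroutine check(status, what)
    integer, intent(in) :: status
    character(*), intent(in) :: what
    if (status /= SPARSE_STATUS_SUCCESS) then
      print *, 'MKL call failed: ', what, ' status = ', status
      stop 1
    end if
  end subroutine check

end program axpby_sketch
```

Because Z is built directly from user arrays, no extra mkl_sparse_d_add call (and no extra MKL-allocated matrix) is needed just to obtain a zero matrix.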

Best,

Kirill

Gennady_F_Intel

Moderator

12-15-2019
09:03 PM

Intel MKL 2020 was released last week. We updated the SpBLAS documentation regarding the topics discussed above.