Intel® oneAPI Math Kernel Library

memory allocation error for mkl_sparse_d_add

ksttr
Beginner

Hello.

I am using the inspector-executor Sparse BLAS routines in Intel® MKL.

I separated the CSR creation into a subroutine and then used mkl_sparse_d_add to add the two matrices together, as shown in the sample code below.

include "mkl_spblas.f90"
program main
    use mkl_spblas
    implicit none

    type(sparse_matrix_t) :: a1, a2, a12

    integer :: rows, cols

    integer, dimension(:), allocatable :: row_indx_a1, col_indx_a1
    integer, dimension(:), allocatable :: row_indx_a2, col_indx_a2

    integer :: stat

    rows = 27
    cols = 27

    col_indx_a1 = [1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27]
    col_indx_a2 = [1, 7, 2, 8, 3, 9, 10, 16, 11, 17, 12, 18, 19, 25, 20, 26, 21, 27]

    call create_sample_csr(1, rows, cols, col_indx_a1, a1)
    call create_sample_csr(2, rows, cols, col_indx_a2, a2)

    stat = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a1, 1d0, a2, a12)
    print *, stat

contains
    subroutine create_sample_csr(flag, rows, cols, col_indx, a_csr)
        integer, intent(in) :: flag
        integer, intent(in) :: rows, cols
        integer, dimension(:), intent(in) :: col_indx
        type(sparse_matrix_t), intent(out) :: a_csr

        integer, dimension(:) :: rows_start(rows), rows_end(rows)

        integer :: nnz
        double precision, allocatable :: values(:)

        integer :: stat

        nnz = size(col_indx)
        allocate (values(nnz))

        values = 1d0

        if (flag == 1) then
            ! call mkl_sparse_d_convert_coo_to_csr(rows, cols, nnz, row_indx, col_indx, rows_start, rows_end, values)
            rows_start = [1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, 10, 11, 12, 12, 13, 14, 14, 15, 16, 16, 17, 18, 18]
            rows_end = [2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, 10, 11, 12, 12, 13, 14, 14, 15, 16, 16, 17, 18, 18, 19]
        else
            ! call mkl_sparse_d_convert_coo_to_csr(rows, cols, nnz, row_indx, col_indx, rows_start, rows_end, values)
            rows_start = [1, 2, 3, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11, 12, 13, 14, 15, 16, 16, 16, 16, 17, 18]
            rows_end = [2, 3, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11, 12, 13, 14, 15, 16, 16, 16, 16, 17, 18, 19]

        end if

        stat = mkl_sparse_d_create_csr(a_csr, SPARSE_INDEX_BASE_ONE, rows, cols, rows_start, rows_end, col_indx, values)
        print *, stat

    end subroutine create_sample_csr

end program main

 

However, when I compiled and ran it, I got the following error:

Fatal glibc error: malloc.c:2599 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)
[1]    2623364 IOT instruction (core dumped)  ./a.out

 

The compile options are as follows:

$ ifort -c main.f90  -i8  -I"${MKLROOT}/include" -g -traceback

$ ifort main.o  -Wl,--start-group ${MKLROOT}/lib/libmkl_intel_ilp64.a ${MKLROOT}/lib/libmkl_sequential.a ${MKLROOT}/lib/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl

 

On the other hand, this error does not occur if the CSR creation is not separated into a subroutine and everything is written in the main program.

What is wrong with this?
Please let me know the cause of the error.

Thank you for any help.

1 Solution
Gajanan_Choudhary

Hi @ksttr,

First off, I'm not super conversant with Fortran, but I'll try helping you out with what appears to be a problem with your reproducer.

If you look at https://www.fortran90.org/src/best-practices.html#allocatable-arrays and your reproducer, you'll notice that the `values` variable you are allocating in your subroutine will be deallocated at the end of the subroutine. Similarly, the `rows_start` and `rows_end` arrays are local automatic arrays, so they also cease to exist when they go out of scope at the end of your subroutine.
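The lifetime rule above can be seen without oneMKL at all. The following is a minimal sketch in plain Fortran; the names `fill_local`, `fill_caller_owned`, `scratch`, and `kept` are made up for illustration and do not come from the reproducer. It contrasts a local allocatable and an automatic array, which both disappear on return, with an `allocatable, intent(out)` dummy argument, which is allocated in the subroutine but stays alive in the caller.

```fortran
program lifetime_demo
    implicit none

    double precision, allocatable :: kept(:)

    call fill_local(4)
    call fill_caller_owned(4, kept)

    ! `kept` was allocated inside the subroutine and is still allocated here.
    print *, "allocated(kept) =", allocated(kept), " size =", size(kept)

contains

    subroutine fill_local(n)
        integer, intent(in) :: n
        double precision, allocatable :: values(:)  ! local, unsaved allocatable
        integer :: scratch(n)                       ! automatic array

        allocate (values(n))
        values = 1d0
        scratch = 0
        ! On return, `values` is deallocated automatically and `scratch`
        ! ceases to exist. Any address of these arrays that a library kept
        ! would be dangling from this point on.
    end subroutine fill_local

    subroutine fill_caller_owned(n, values)
        integer, intent(in) :: n
        ! Allocatable dummy: allocated here, owned by the caller afterwards.
        double precision, allocatable, intent(out) :: values(:)

        allocate (values(n))
        values = 1d0
    end subroutine fill_caller_owned

end program lifetime_demo
```

The second pattern is the one that is safe to combine with a library that keeps the addresses of the arrays it is given.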

The way the sparse BLAS domain in oneMKL works is that the sparse matrix you create with the `mkl_sparse_d_create_csr` API does not own or copy the arrays that you pass into it. It is the user's responsibility to keep those arrays alive for the entire time the matrix handle is used with oneMKL's APIs. On the flip side, when you call `mkl_sparse_d_add`, oneMKL allocates and owns the row/column/values arrays for the output `a12` matrix, and oneMKL frees them once you call `mkl_sparse_destroy`. oneMKL does not free the row/column/values arrays you passed into it when you destroy the `a1` and `a2` matrices, because you own those arrays.

The arrays you pass to `mkl_sparse_d_create_csr` are valid at the time of that call, but they have already been deallocated by the time `mkl_sparse_d_add` is called. oneMKL then reads memory that is no longer valid, which leads to the crash you are seeing.

If you change your reproducer to move the `rows_start`, `rows_end`, and `values` arrays out of the subroutine so that their lifetime extends beyond the last call to a oneMKL API, while still keeping the `mkl_sparse_d_create_csr` call inside your subroutine, then you will not get this crash.
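One possible way to keep the arrays and the handle together is sketched below. It assumes a hypothetical derived type `csr_holder` and a hypothetical helper `csr_holder_init`, neither of which is part of oneMKL; only `mkl_sparse_d_create_csr`, `mkl_sparse_d_add`, and `mkl_sparse_destroy` are oneMKL routines, called the same way as in the reproducer. The sketch has not been run against a specific oneMKL version, and it builds with the same compile and link lines as the original post.

```fortran
include "mkl_spblas.f90"

module csr_holder_mod
    use mkl_spblas
    implicit none

    ! Hypothetical container (not a oneMKL type): the user-owned CSR arrays
    ! live next to the handle that refers to them.
    type :: csr_holder
        integer, allocatable :: rows_start(:), rows_end(:), col_indx(:)
        double precision, allocatable :: values(:)
        type(sparse_matrix_t) :: handle
    end type csr_holder

contains

    ! Copies the index arrays into the holder, sets all values to 1 (as in
    ! the reproducer), and creates the handle from the holder's own arrays.
    function csr_holder_init(h, rows, cols, rows_start, rows_end, col_indx) result(stat)
        type(csr_holder), intent(inout), target :: h
        integer, intent(in) :: rows, cols
        integer, intent(in) :: rows_start(:), rows_end(:), col_indx(:)
        integer :: stat

        h%rows_start = rows_start      ! allocates on assignment
        h%rows_end = rows_end
        h%col_indx = col_indx
        allocate (h%values(size(col_indx)))
        h%values = 1d0

        stat = mkl_sparse_d_create_csr(h%handle, SPARSE_INDEX_BASE_ONE, rows, cols, &
                                       h%rows_start, h%rows_end, h%col_indx, h%values)
    end function csr_holder_init

end module csr_holder_mod

program holder_demo
    use csr_holder_mod
    implicit none

    type(csr_holder), target :: a, b
    type(sparse_matrix_t) :: c
    integer :: stat

    ! Two 3x3 identity matrices in one-based 4-array CSR form.
    stat = csr_holder_init(a, 3, 3, [1, 2, 3], [2, 3, 4], [1, 2, 3])
    print *, "create a:           ", stat
    stat = csr_holder_init(b, 3, 3, [1, 2, 3], [2, 3, 4], [1, 2, 3])
    print *, "create b:           ", stat

    stat = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a%handle, 1d0, b%handle, c)
    print *, "mkl_sparse_d_add:   ", stat

    ! Destroy the handles first; the holders' arrays are released afterwards
    ! when `a` and `b` go out of scope.
    stat = mkl_sparse_destroy(a%handle)
    stat = mkl_sparse_destroy(b%handle)
    stat = mkl_sparse_destroy(c)
    print *, "mkl_sparse_destroy: ", stat
end program holder_demo
```

A design caveat with this sketch: copying a `csr_holder` with `=` after the handle has been created would give the copy new arrays while its handle still refers to the original ones, so a holder should be initialized in place and not copied.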

 


On the other hand, this error does not occur if the subroutine is not separated into subroutines and all the subroutines are written in the same program.


About your above comment: This is because the lifetimes of the variables are fine in that case and oneMKL is able to use those variables without running into the deallocation problems that your reproducer currently has.

Lastly, I see in your reproducer that you have not called the `mkl_sparse_destroy` API. Please be sure to add that to destroy the `a1`, `a2`, and `a12` matrices once you are done using them so that there are no memory leaks in your code.
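A minimal sketch of pairing every create call with a destroy call, and of checking the returned status instead of only printing it, could look like the following. The `check` helper is hypothetical and written for this example; `SPARSE_STATUS_SUCCESS` is the status constant provided by the `mkl_spblas` module. The arrays live in the main program, so they outlive the handle.

```fortran
include "mkl_spblas.f90"

program destroy_demo
    use mkl_spblas
    implicit none

    type(sparse_matrix_t) :: a
    integer :: rows_start(2), rows_end(2), col_indx(2)
    double precision :: values(2)
    integer :: stat

    ! 2x2 identity matrix in one-based 4-array CSR form.
    rows_start = [1, 2]
    rows_end = [2, 3]
    col_indx = [1, 2]
    values = 1d0

    stat = mkl_sparse_d_create_csr(a, SPARSE_INDEX_BASE_ONE, 2, 2, &
                                   rows_start, rows_end, col_indx, values)
    call check(stat, "mkl_sparse_d_create_csr")

    ! ... use the handle `a` here ...

    ! Releases oneMKL's internal data for `a`; the four user arrays above
    ! are not freed by this call and remain owned by the program.
    stat = mkl_sparse_destroy(a)
    call check(stat, "mkl_sparse_destroy")

contains

    ! Hypothetical helper: stop with a message on any non-success status.
    subroutine check(stat, what)
        integer, intent(in) :: stat
        character(*), intent(in) :: what

        if (stat /= SPARSE_STATUS_SUCCESS) then
            print *, what, " failed with status ", stat
            error stop 1
        end if
    end subroutine check

end program destroy_demo
```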

Hope that helps!

 

Gajanan Choudhary

Intel oneMKL team

 

P.S.: Below are my modifications to your reproducer that appear to work. Note that I've moved up the scope of the problematic variables so the lifetimes are correct.

 

include "mkl_spblas.f90"
program main
    use mkl_spblas
    implicit none

    type(sparse_matrix_t) :: a1, a2, a12

    integer :: rows, cols

    integer, dimension(:), allocatable :: rowStart_a1, rowEnd_a1, col_indx_a1
    integer, dimension(:), allocatable :: rowStart_a2, rowEnd_a2, col_indx_a2
    double precision, dimension(:), allocatable :: values_a1, values_a2

    integer :: stat

    rows = 27
    cols = 27

    col_indx_a1 = [1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27]
    col_indx_a2 = [1, 7, 2, 8, 3, 9, 10, 16, 11, 17, 12, 18, 19, 25, 20, 26, 21, 27]

    call create_sample_csr(1, rows, cols, rowStart_a1, rowEnd_a1, col_indx_a1, values_a1, a1)
    call create_sample_csr(2, rows, cols, rowStart_a2, rowEnd_a2, col_indx_a2, values_a2, a2)

    stat = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, a1, 1d0, a2, a12)
    print *, "mkl_sparse_d_add:        ", stat
    stat = mkl_sparse_destroy(a1)
    print *, "mkl_sparse_destroy:      ", stat
    stat = mkl_sparse_destroy(a2)
    print *, "mkl_sparse_destroy:      ", stat
    stat = mkl_sparse_destroy(a12)
    print *, "mkl_sparse_destroy:      ", stat

contains
    subroutine create_sample_csr(flag, rows, cols, row_start, row_end, col_indx, values, a_csr)
        integer, intent(in) :: flag
        integer, intent(in) :: rows, cols
        ! Allocatable dummies: allocated here, but owned by the caller,
        ! so the arrays outlive this subroutine.
        integer, dimension(:), allocatable, intent(out) :: row_start, row_end
        integer, dimension(:), intent(in) :: col_indx
        double precision, allocatable, intent(out) :: values(:)
        type(sparse_matrix_t), intent(inout) :: a_csr

        integer :: nnz

        integer :: stat

        nnz = size(col_indx)
        allocate (values(nnz))

        values = 1d0
        ! Assign to the dummy arguments, not to the main program's variables.
        if (flag == 1) then
            row_start = [1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, 10, 11, 12, 12, 13, 14, 14, 15, 16, 16, 17, 18, 18    ]
            row_end   = [   2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, 10, 11, 12, 12, 13, 14, 14, 15, 16, 16, 17, 18, 18, 19]
        else
            row_start = [1, 2, 3, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11, 12, 13, 14, 15, 16, 16, 16, 16, 17, 18    ]
            row_end   = [   2, 3, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11, 12, 13, 14, 15, 16, 16, 16, 16, 17, 18, 19]
        endif

        stat = mkl_sparse_d_create_csr(a_csr, SPARSE_INDEX_BASE_ONE, rows, cols, row_start, row_end, col_indx, values)
        print *, "mkl_sparse_d_create_csr: ", stat

    end subroutine create_sample_csr

end program main

 

 

ksttr
Beginner

Hi, @Gajanan_Choudhary 

 

Thank you for your reply.

I rewrote the code as you indicated and it worked!

 


The way the sparse BLAS domain in oneMKL works is that the sparse matrix you are creating with the `mkl_sparse_d_create_csr` API from oneMKL does not own or copy over the arrays that you are passing into it, and it is users' responsibility to maintain the lifetime of the arrays through the entire time they need to use oneMKL's APIs. 


 It seems that I was mistaken on this very point.

Thank you very much!
