mkl_sparse_sypr value update in array B

may_ka · ‎10-05-2020

Hi intel mkl team,

from trial and error I inferred that the inspector executor routines "MKL_SPARSE_?_CREATE_?" sets pointers to the supplied arrays instead of taking copies. To my knowledge this is not explicitly mentioned in the manual.

Therefore changing values in the original array directly (NOT via the update function) will affect related operation results (e.g. matrix multiplication). This holds for "mkl_sparse_sp2" and "mkl_sparse_?_mm".

However, this does not seem to hold for "mkl_sparse_?_sypr" where updating values in array B has no effect on the result of "C=A'BA". It appears as if a copy of B is generated when calling "mkl_sparse_?_sypr". A workaround is the destruction of rebuilding of B, with the subsequent requirement to destruct and rebuild C. In some applications, where "C=A'BA" is performed repeatedly and only values in B change but not the non-zero pattern, destroying B and C is an unnecessary overhead.

The code below implements "mkl_sparse_?_sypr" and shows that updating values in B directly has no effect on "C=A'BA". As lined out above this behavior appears inconsistent with other inspector executor multiplication routines and I am wondering whether that is a bug.

Further, it appears as if "mkl_sparse_?_sypr" is not supporting symmetric matrices of storage type "sparse_matrix_type_symmetric". If the code below is run on array "sbfu" the multiplication call throws "SPARSE_STATUS_NOT_SUPPORTED".

thanks for looking into this.

#include "mkl_spblas.f90"
module mod_xxx
  use mkl_spblas
  use, intrinsic :: iso_fortran_env, only: int64, real64,int32
  use iso_c_binding, only: c_ptr, c_int, c_long_long, c_double,&
    & c_f_pointer
  type :: sparse
    type(sparse_matrix_t), allocatable :: handle
    type(matrix_descr), allocatable :: d
    integer(int64), allocatable :: rp(:),cp(:)
    real(real64), allocatable :: v(:)
    integer(int64), pointer, contiguous :: rpp1(:)=>null(),rpp2(:)=>null(),cpp(:)=>null()
    real(real64), pointer, contiguous :: vp(:)=>null()
    integer:: nrows, ncols
  contains
    procedure :: iii => subiii
    procedure :: setpointer => subsetpointer
  end type sparse
contains
  subroutine subiii(this,stat)
    implicit none
    class(sparse), intent(inout) :: this
    integer, intent(inout) :: stat
    stat=MKL_SPARSE_D_CREATE_CSR(&
      &this%handle,&
      &sparse_index_base_one,&
      &this%nrows,&
      &this%ncols,&
      &this%rp(1:(size(this%rp)-1)),&
      &this%rp(2:size(this%rp)),&
      &this%cp,&
      &this%v)
  end subroutine subiii
  subroutine subsetpointer(this,stat)
    implicit none
    class(sparse), intent(inout) :: this
    integer, intent(inout) :: stat
    integer(c_int) :: index
    type(c_ptr) :: rows_start,rows_end,col_indx,values
    index=sparse_index_base_one
    stat=MKL_SPARSE_D_EXPORT_CSR(&
        &this%handle,&
        &index,&
        &this%nrows,&
        &this%ncols,&
        &rows_start,&
        &rows_end,&
        &col_indx,&
        &values)
    if(stat/=0) return
    call c_f_pointer(rows_start,this%rpp1,[this%nrows])
    call c_f_pointer(rows_end,this%rpp2,[this%nrows])
    call c_f_pointer(col_indx,this%cpp,[this%rpp2(this%nrows)-1])
    call c_f_pointer(values,this%vp,[this%rpp2(this%nrows)-1])
  end subroutine subsetpointer
  function err(stat) result(x)
    implicit none
    integer, intent(in) :: stat
    character(:), allocatable :: x
    select case(stat)
    case(SPARSE_STATUS_NOT_INITIALIZED)
      x="SPARSE_STATUS_NOT_INITIALIZED"
    case(SPARSE_STATUS_ALLOC_FAILED)
      x="SPARSE_STATUS_ALLOC_FAILED"
    case(SPARSE_STATUS_INVALID_VALUE)
      x="SPARSE_STATUS_INVALID_VALUE"
    case(SPARSE_STATUS_EXECUTION_FAILED)
      x="SPARSE_STATUS_EXECUTION_FAILED"
    case(SPARSE_STATUS_INTERNAL_ERROR)
      x="SPARSE_STATUS_INTERNAL_ERROR"
    case(SPARSE_STATUS_NOT_SUPPORTED)
      x="SPARSE_STATUS_NOT_SUPPORTED"
    case default
      x="unknown"
    end select
  end function err
end module mod_xxx
program test
  use mkl_spblas
  use mod_xxx, only: sparse, err
  use, intrinsic :: iso_fortran_env, only: int64, real64,int32
  implicit none
  type(sparse), allocatable :: sA,sbfu,sC,sbut
  real(real64), allocatable :: rA(:,:),rB(:,:),rC(:,:)
  integer :: stat=0, i
  !!@@@@@@@@@@@@@@@@@
  !!the matrix B
  !!0.5,0.0,0.1
  !!0.0,0.4,0.3
  !!0.1,0.3,0.2
  !!@@@@@@@@
  !!matrix B dense storage
  allocate(rA(3,3),rB(3,3),rC(3,3))
  rB(1,:)=(/0.5,0.0,0.1/)
  rB(2,:)=(/0.0,0.4,0.3/)
  rB(3,:)=(/0.1,0.3,0.2/)
  !!@@@@@@@@@@
  !!the matrix A
  !!0.0,1.0,0.0
  !!0.0,0.0,1.0
  !!0.0,1.0,0.0
  !!@@@@@@@@@@@
  !!matrix A dense storage
  rA(1,:)=(/0.0,1.0,0.0/)
  rA(2,:)=(/0.0,0.0,1.0/)
  rA(3,:)=(/0.0,1.0,0.0/)
  !!multiplication result from dense matrices
  rC=matmul(matmul(transpose(rA),rB),rA)
  do i=1,size(rC,1)
    write(*,"(*(g0:"",""))") rC(i,:)
  end do
  !!@@@@@@@@@@@@@@@@@@@@@@
  allocate(sA,sbfu,sbut,sC);allocate(sA%handle,sbfu%handle,sbut%handle,sC%handle)
  allocate(sA%d,sbfu%d,sbut%d,sC%d)
  !!@@@@@@@@@@
  !!matrix A sparse full storage
  sA%d%type=sparse_matrix_type_general
  sA%d%mode=SPARSE_FILL_MODE_FULL
  sA%d%diag=sparse_diag_non_unit
  sA%rp=(/1,2,3,4/)
  sA%cp=(/2,3,2/)
  sA%v=(/1.0,1.0,1.0/)
  sa%nrows=3;sa%ncols=3
  call sa%iii(stat)
  if(stat/=0) call se("err_1"//err(stat))
  !!@@@@@@@@@@@@@@
  !!matrix B sparse full storage
  sbfu%d%type=sparse_matrix_type_symmetric
  sbfu%d%mode=SPARSE_FILL_MODE_FULL
  sbfu%d%diag=sparse_diag_non_unit
  sbfu%rp=(/1,3,5,8/)
  sbfu%cp=(/1,3,2,3,1,2,3/)
  sbfu%v=(/0.5,0.1,0.4,0.3,0.1,0.3,0.2/)
  sbfu%nrows=3;sbfu%ncols=3
  call sbfu%iii(stat)
  if(stat/=0) call se("err_2"//err(stat))
  !!@@@@@@@@@@@@@@
  !!matrix B sparse upper triangular storage
  sbut%d%type=sparse_matrix_type_symmetric
  sbut%d%mode=SPARSE_FILL_MODE_UPPER
  sbut%d%diag=sparse_diag_non_unit
  sbut%rp=(/1,3,5,6/)
  sbut%cp=(/1,3,2,3,3/)
  sbut%v=(/0.5,0.1,0.4,0.3,0.2/)
  sbut%nrows=3;sbut%ncols=3
  call sbut%iii(stat)
  if(stat/=0) call se("err_3"//err(stat))
  !!@@@@@@@@@@@@@@@@@@@@@@@@@
  !!mkl_sparse_sypr 1. round
  stat=mkl_sparse_sypr(&
      &operation=sparse_operation_transpose,&
      &A=sa%handle,&
      &descrB=sbut%d,&
      &B=sbut%handle,&
      &req=SPARSE_STAGE_FULL_MULT,&
      &C=sc%handle)
  if(stat/=0) call se("err_4 "//err(stat))
  !!set pointers to internal structure of c
  call sc%setpointer(stat)
  if(stat/=0) call se("err_5 "//err(stat))
  write(*,"(*(g0:"",""))") sc%rpp1
  write(*,"(*(g0:"",""))") sc%rpp2
  write(*,"(*(g0:"",""))") sc%cpp
  write(*,"(*(g0:"",""))") sc%vp
  !!change values in B but not structure of B
  sbut%v(:)=sbut%v(:)*5._real64
  !!set pointers to internal structure of b
  call sbut%setpointer(stat)
  !!values in b have changed
  write(*,"(*(g0:"",""))") sbut%vp
  !!mkl_sparse_sypr 2. round
  stat=mkl_sparse_sypr(&
      &operation=sparse_operation_transpose,&
      &A=sa%handle,&
      &descrB=sbut%d,&
      &B=sbut%handle,&
      &req=SPARSE_STAGE_FULL_MULT,&
      &C=sc%handle)
  if(stat/=0) call se("err_6 "//err(stat))
  !!results are not different from first round
  write(*,"(*(g0:"",""))") sc%vp
  !!workaround may require destruction of B
contains
  subroutine se(c0e)
    character(*), intent(in) :: c0e
    write(*,*) c0e
    call exit(1)
  end subroutine se
end program test

Kirill_V_Intel · ‎10-09-2020

Hi @may_ka,

Let me try to address your questions.

1) "from trial and error I inferred that the inspector executor routines "MKL_SPARSE_?_CREATE_?" sets pointers to the supplied arrays instead of taking copies. To my knowledge this is not explicitly mentioned in the manual."
Right. I agree we should say it explicitly in the documentation.

2) "Therefore changing values in the original array directly (NOT via the update function) will affect related operation results (e.g. matrix multiplication). This holds for "mkl_sparse_sp2m" and "mkl_sparse_?_mm".

However, this does not seem to hold for "mkl_sparse_?_sypr" where updating values in array B has no effect on the result of "C=A'BA". It appears as if a copy of B is generated when calling "mkl_sparse_?_sypr". A workaround is the destruction of rebuilding of B, with the subsequent requirement to destruct and rebuild C. In some applications, where "C=A'BA" is performed repeatedly and only values in B change but not the non-zero pattern, destroying B and C is an unnecessary overhead.

The code below implements "mkl_sparse_?_sypr" and shows that updating values in B directly has no effect on "C=A'BA". As lined out"

Correct and this behavior is expected. Unlike sp2m and mm, the sypr routine creates an internal representation and uses it and hence the changes done for the original input arrays do not affect the result of the second sypr operation. As we recognize a need for this, we added mkl_sparse_?_update_values (https://software.intel.com/content/www/us/en/develop/documentation/mkl-developer-reference-c/top/blas-and-sparse-blas-routines/inspector-executor-sparse-blas-routines/matrix-manipulation-routines/mkl-sparse-update-values.html). I believe currently it might not interact with sypr properly but we can open a FR to support your case.

3) "Further, it appears as if "mkl_sparse_?_sypr" is not supporting symmetric matrices of storage type "sparse_matrix_type_symmetric". If the code below is run on array "sbfu" the multiplication call throws "SPARSE_STATUS_NOT_SUPPORTED"."

Not fully correct, the routine mkl_sparse_?_sypr does not support a combination of sparse_matrix_type_symmetric + SPARSE_FILL_MODE_FULL. It supports symmetric matrices only if they are provided with SPARSE_FILL_MODE_UPPER/LOWER. The documentation says that (in somewhat obscure form but still).
My question here is: do you want the result of the sypr operation also as a full matrix (not just the symmetric part of it)?

Hope this helps.

Best,
Kirill

may_ka · ‎10-11-2020

Hi @Kirill_V_Intel

thanks for the clarification.

With regard to "trial and error" I think till version 17 or 18, inspector-executor routines were taking copies, and with 19 it changed to setting pointers. So an explicit clarification in the manual would be much appreciated as it is pretty important for memory management responsibilities.

I would much appreciate if the "_update" routine would properly interact with with "_sypr". But updating means copying and therefore not taking an internal copy would be even better. However, for performance reason the latter might not be possible. As mentioned C=ABA' is an operation repeated several hundred thousand times, and any unnecessary memory reallocation is an avoidable overhead.

Are you saying that one should feed a full store matrix into the algorithm pretending it is upper/lower stored? If you are NOT saying that allowing only upper/lower storage means building an upper/lower array from a full store array for exactly this operation (assuming that other steps in the process need a full storage). Since some mkl facilities (like pardiso) and other libraries require specific storage formats, it would be good if "_sypr" would support all combinations of upper/lower and full for input and output (e.g. input=lower/upper, output=full etc.). Then the user can optimize the input and output formats with respect to the work flow requirements without unnecessary copying, creation and destruction.

Thank and best regards

Karl

Kirill_V_Intel · ‎10-18-2020

Hi @may_ka,

I'm saying that currently mkl_sparse_sypr does not support the case when descrB.type = SPARSE_MATRIX_TYPE_SYMMETRIC and descrB.mode = SPARSE_FILL_MODE_FULL. Only descrB.mode = SPARSE_FILL_MODE_LOWER and SPARSE_FILL_MODE_UPPER are supported (and the documentation says that).

Let us know ( = confirm) if what you need is that mkl_sparse_sypr works with SPARSE_FILL_MODE_FULL.

@Gennady_F_Intel, could you open a FR for extending support of mkl_sparse_update_values to interact nicely with mkl_sparse_sypr?

Best,
Kirill

Steveni · ‎10-11-2020

Hiii,

Thank you for posting in Intel mkl sparse_sypr value update in array B. It is really helpful.

Gennady_F_Intel · ‎10-19-2020

Ok. We will add this FR to our future plans.

Khang_N_Intel · ‎04-06-2022

Hi Karl,

Can you clarify whether you expect to have FULL C as well, not only upper triangular ( sypr: C=ABA^T)?

Thanks,

Khang

may_ka · ‎04-12-2022

Hi,

it would certainly be helpful if the the routine were accepting B as FULL, and would adjust the output to the type of C, which could be inferred from C's descriptor.

Further, it would be even more helpful if the original problem of the thread could be addressed .......... the memory management from repeated C=ABA' operations with unchanged non-zero patterns in A and B.

Thanks

MariaZh · ‎04-13-2022

Hi,
I agree that mkl_sparse_?_create_csr description should be more specific regarding ownership of input arrays (such as row_ptr, col_indx and values).
Recommended (and the only one that could guarantee the correct result afterwards) way of changing values in already created sparse handle would be to use mkl_sparse_?_update_values, the reason is that after creation of sparse handle some optimizations could have been made using original input data (incl. conversion to some other format) and this needs to be updated as well for further computations, otherwise you may end up with invalid handle and incorrect results.

As for your usage case, if I understood it correctly, if you simply want to change values in B (and not structure), two-stage approach has a SPARSE_STAGE_FINALIZE_MULT step, which would only re-calculate the values in C (so the same sparse handle for C every time, no additional allocations), so if you don't need to keep a previous copy of C, I believe this should work for you. Plus, switching to mkl_sparse_?_update_values means that you won't have to do a destroy/create on matrix B, so no additional allocations here as well.

mkl_sparse_sypr indeed doesn't work with FULL SYMMETRIC matrices, but 2 calls of sp2m should help in this case, which also supports two-stage approach.

Hope this helps,
Maria