Solved: Wrong results for SpMV

jaidotsh · ‎02-12-2012

Hello,

While doing SpMV using MKL routines, I got wrong (inaccurate) results for both COO and CSR formats. I checked this with several matrices, e.g. ck400. I took the 1-based index matrices from the .mtx files. The single-precision results are even worse, producing huge numbers.

Output of y = Ax (x=1) for COO and CSR routines

0.792798

3.828488

3.575541

0.787644

2.179928

0.935596

0.826411

2.150882

1.681057

1.230554

Output of my implementation

1.594622

3.375540

2.971066

1.472197

1.733999

0.933401

0.841957

1.671079

1.527866

Why is this happening?

Thanks in advance.

Regards,

Jay

Gennady_F_Intel · ‎02-12-2012

Could you please give us an example of the code that we can compile and check on our side?

There are many similar requests on the MKL forum, and in most cases the cause of the problem is incorrect usage of this sort of routine.

There are a number of example programs that use the MKL Sparse BLAS routines for matrices represented in different sparse formats, including COO and CSR.

I would recommend looking at these examples first.

You can find these examples (cspblas_ccoo.c, cspblas_ccsr.c) in <MKLROOT>\examples\spblasc\

mecej4 · ‎02-12-2012

The results that you showed under "Output of y = Ax (x=1) for COO and CSR routines" are correct. They are what I get from the MKL call

[bash]      call mkl_scsrgemv('T',NCOL,values,colptr,rowind,x,y)
[/bash]

after reading in the matrix "ck400" using the skeleton code from

http://math.nist.gov/MatrixMarket/src/hbcode1.f

You did not say anything about "my implementation", yet you ask us to tell you what is wrong with it. Well, there are a zillion ways of calling MKL (or any other library) incorrectly.

For reference, the complete code follows.

[fxfortran]      program jdotsh
      implicit none
      include 'mkl_spblas.fi'
      integer NMAX, NNZMAX,I, LUNIT
      parameter (NMAX=400, NNZMAX=2860, LUNIT=11)
!     ================================================================
!     ... SAMPLE CODE FOR READING A SPARSE MATRIX IN STANDARD FORMAT
!     ================================================================

      CHARACTER      TITLE*72 , KEY*8    , MXTYPE*3 ,  &
                     PTRFMT*16, INDFMT*16, VALFMT*20, RHSFMT*20

      INTEGER        TOTCRD, PTRCRD, INDCRD, VALCRD, RHSCRD, &
                     NROW  , NCOL  , NNZERO, NELTVL

      INTEGER        COLPTR (NMAX+1), ROWIND (NNZMAX)

      REAL           VALUES (NNZMAX), X(NMAX), Y(NMAX)

!    ------------------------
!     ... READ IN HEADER BLOCK
!     ------------------------
      open(unit=LUNIT,file='ck400.rua',status='old')

      READ ( LUNIT, 1000 ) TITLE , KEY   ,   &
                           TOTCRD, PTRCRD, INDCRD, VALCRD, RHSCRD,  &
                           MXTYPE, NROW  , NCOL  , NNZERO, NELTVL,  &
                           PTRFMT, INDFMT, VALFMT, RHSFMT
 1000 FORMAT ( A72, A8,/,5I14,/ A3, 11X, 4I14,/,2A16, 2A20 )

!     -------------------------
!     ... READ MATRIX STRUCTURE
!     -------------------------

      READ ( LUNIT, PTRFMT ) ( COLPTR (I), I = 1, NCOL+1 )

      READ ( LUNIT, INDFMT ) ( ROWIND (I), I = 1, NNZERO )

      IF  ( VALCRD .GT. 0 )  THEN

!         ----------------------
!         ... READ MATRIX VALUES
!         ----------------------

          READ ( LUNIT, VALFMT ) ( VALUES (I), I = 1, NNZERO )

      ENDIF
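      X(1:NCOL) = 1.0
!     COLPTR/ROWIND hold the matrix column by column, so passed as CSR
!     arrays they describe the transpose; 'T' then gives y = A*x.
      call mkl_scsrgemv('T',NCOL,values,colptr,rowind,x,y)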
      write(*,2000)y(1:10)
 2000 format(1x,5ES16.7)
      end program jdotsh
[/fxfortran]

jaidotsh · ‎02-12-2012

@mecej4

Thanks for the code. I'm not that good at Fortran; anyway, I have attached my MKL kernel call code to this post. Actually, I'm not supposed to disclose my implementation, but I'll show you that my results are right.

When I do SpMV with the ck400 matrix, I get these results.

E.g., first row:

[bash]1 1  4.5988460100000e-01
2 1  3.6284045000000e-02
5 1  1.0781816600000e+00
6 1  2.8797109600000e-02
203 1 -8.6315697400000e-03
204 1  1.0598980300000e-04[/bash]

[bash]0.459+0.036+1.07+0.028-0.0086+0.00010 = 1.594622[/bash]

Fourth row

[bash]3 4 -2.4700304700000e-02
4 4  4.4020096400000e-01
7 4 -1.2549445700000e-02
8 4  1.0782397800000e+00
201 4 -1.0598980300000e-04
202 4 -8.8883786800000e-03
[/bash]

[bash]-0.024+0.44-0.012+1.07-0.00010-0.0088 = 1.472197[/bash]

which is the same as the results of my implementation.

@Fedorov I'll have a look at it. I've attached the code that calls the routines.

mecej4 · ‎02-12-2012

The Harwell-Boeing format stores the matrix by packing column-1, followed by packing column-2, etc, which is the traditional storage order in Fortran.

By interpreting the columns as rows, in effect you are using the transpose of the matrix, and the matrix is not symmetric. If that is what you really desire, simply change the first subroutine argument from 'T' to 'N' in my listing and you will get the results in #3.

Or, in your C code, change the first argument to 'T'.
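
To make the point above concrete, here is a minimal C sketch (not the MKL implementation, and not the poster's code) of what happens when column-packed arrays are handed to a row-oriented CSR product. The function names `csr_matvec`, `csr_matvec_transposed` and `hb_transpose_demo`, and the 3-by-3 test matrix, are made up for illustration; the two products play the role of the 'N' and 'T' options.

```c
#include <assert.h>

/* y = A*x for an n-by-n matrix in CSR form with 1-based
   row pointers ia (n+1 entries) and column indices ja. */
void csr_matvec(int n, const double *a, const int *ia, const int *ja,
                const double *x, double *y)
{
    assert(n >= 0 && ia[0] == 1);
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int k = ia[i] - 1; k < ia[i + 1] - 1; ++k)
            sum += a[k] * x[ja[k] - 1];
        y[i] = sum;
    }
}

/* y = A^T*x from the same CSR arrays: each stored entry (i, ja[k])
   is scattered into y[ja[k]] instead of being gathered into y[i]. */
void csr_matvec_transposed(int n, const double *a, const int *ia,
                           const int *ja, const double *x, double *y)
{
    assert(n >= 0 && ia[0] == 1);
    for (int j = 0; j < n; ++j)
        y[j] = 0.0;
    for (int i = 0; i < n; ++i)
        for (int k = ia[i] - 1; k < ia[i + 1] - 1; ++k)
            y[ja[k] - 1] += a[k] * x[i];
}

/* Hypothetical demo matrix, stored column by column (Harwell-Boeing style):
       A = [ 1 2 0 ]
           [ 0 3 0 ]
           [ 4 0 6 ]
   The column-packed arrays are deliberately passed as if they were CSR. */
void hb_transpose_demo(double y_as_csr[3], double y_as_csr_t[3])
{
    const int    colptr[] = {1, 3, 5, 6};
    const int    rowind[] = {1, 3, 1, 2, 3};
    const double values[] = {1, 4, 2, 3, 6};
    const double x[]      = {1, 1, 1};

    csr_matvec(3, values, colptr, rowind, x, y_as_csr);              /* like 'N' */
    csr_matvec_transposed(3, values, colptr, rowind, x, y_as_csr_t); /* like 'T' */
}
```

With x = 1, the plain product over the column-packed arrays returns the column sums (5, 5, 6), which is A^T x, while the transposed product returns the row sums (3, 3, 10), which is A x. Because this matrix is not symmetric, the two results differ, just as in the ck400 case.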

jaidotsh · ‎02-12-2012

@mecej4 Thanks a lot, it worked. @Fedorov So true!

jaidotsh · ‎02-17-2012

I'm stuck with another problem! When I give matrices like ck400 and pwtk, it works. But when I give consph or rma10, I get a segmentation fault. I've thoroughly checked the indices from the file, and I still can't find the error!

Thanks in advance!
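
The thread does not identify the cause of this segmentation fault. As one possible sanity check before calling a CSR routine, the index arrays can be validated programmatically rather than by eye. The sketch below assumes a square n-by-n matrix with 1-based indices; the function name `check_csr_one_based` and its return codes are hypothetical, not part of MKL.

```c
#include <assert.h>

/* Returns 0 if the 1-based CSR structure of an n-by-n matrix with nnz
   stored entries is consistent, otherwise a nonzero code naming the
   first problem found. ia has n+1 entries, ja has nnz entries. */
int check_csr_one_based(int n, int nnz, const int *ia, const int *ja)
{
    assert(ia && ja);
    if (n < 0 || nnz < 0)
        return 1;                      /* negative sizes */
    if (ia[0] != 1)
        return 2;                      /* first row must start at entry 1 */
    if (ia[n] != nnz + 1)
        return 3;                      /* last pointer must be nnz + 1 */
    for (int i = 0; i < n; ++i)
        if (ia[i + 1] < ia[i])
            return 4;                  /* row pointers must not decrease */
    for (int k = 0; k < nnz; ++k)
        if (ja[k] < 1 || ja[k] > n)
            return 5;                  /* column index out of range */
    return 0;
}
```

A nonzero return narrows the search to the matrix reader; a zero return suggests looking elsewhere, for example at how the arrays are allocated and sized for the larger matrices.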