Community
cancel
Showing results for 
Search instead for 
Did you mean: 
kup__benny
Beginner
82 Views

Sparse BLAS CSR Matrix Storage Format

Jump to solution

After looking here, I can't understand the compressed row format.

Specifically, I don't understand pointerE (and OK with the rest): it should be the index in the valuesarray that is the last non zero element in each row, if I look on the example they give for zero based indexing, the last non zero element of the first row is -3 and it is the third element in the values array but in zero indexing it should be element #2. The last non zero element of the second row is 5 which is the fifth element in values but in zero indexing it should be #4 so pointerE should be [2,4,7,10,12] but in the example it shows [3,5,8,11,13]

what am I missing here?

Once again I am using zero based indexing so where does this bias come from?

0 Kudos

Accepted Solutions
Spencer_P_Intel
Employee
82 Views

The 4 array compressed sparse row (CSR) format uses the following definitions (0 based indexing):

  • pointerB = first element of row i in columns and values arrays
  • pointerE = one past the last element of row i in columns and values arrays
  • columns = column index value
  • values = value of matrix

If you are using the most simple case of CSR (which is the 3 array CSR format) with contiguous memory storage of all elements then you can think of pointerE as pointerB+1 (That is, pointerE = pointerB[i+1] ) using pointer arithmetic.

This is typically sufficient for most cases, but there are times when you want to do more complicated operations like pass in a sub-matrix or do a permutation of a matrix rows, in which case using the two row pointers (beginning (B) and one past ending (E)  ) makes this possible.  Notice that the Intel Math Kernel Library does support the sub-matrix usage, but currently it does guarantee correctness if the pointer arrays are used to pass in a row permutation of the matrix.

For me it helps to think of the pointerB/E arrays as defining half open half closed intervals for the rows:

CSR 3 array:

row i is found on [ pointerB, pointerB[i+1] ) in columns and values arrays

CSR 4 array: (more general)

row i is found on [ pointerB, pointerE ) in columns and values arrays

 

As you note, it does get a little more complicated with one based indexing.  If  we use the variable ind=0 or 1  to denote the indexing, then the general form is as follows:

  • pointerB-ind = first element of row i in columns and values arrays
  • pointerE-ind = one past the last element of row i in columns and values arrays
  • columns = ind-based column index value (columns-ind is 0 based column index value)
  • values = value of matrix

so that

CSR 3 array:

row i is found on [ pointerB-ind, pointerB[i+1]-ind ) in columns and values arrays (with columns stored in base ind fomat)

CSR 4 array: (more general)

row i is found on [ pointerB-ind, pointerE-ind ) in columns and values arrays (with columns stored in base ind fomat)

View solution in original post

2 Replies
Spencer_P_Intel
Employee
83 Views

The 4 array compressed sparse row (CSR) format uses the following definitions (0 based indexing):

  • pointerB = first element of row i in columns and values arrays
  • pointerE = one past the last element of row i in columns and values arrays
  • columns = column index value
  • values = value of matrix

If you are using the most simple case of CSR (which is the 3 array CSR format) with contiguous memory storage of all elements then you can think of pointerE as pointerB+1 (That is, pointerE = pointerB[i+1] ) using pointer arithmetic.

This is typically sufficient for most cases, but there are times when you want to do more complicated operations like pass in a sub-matrix or do a permutation of a matrix rows, in which case using the two row pointers (beginning (B) and one past ending (E)  ) makes this possible.  Notice that the Intel Math Kernel Library does support the sub-matrix usage, but currently it does guarantee correctness if the pointer arrays are used to pass in a row permutation of the matrix.

For me it helps to think of the pointerB/E arrays as defining half open half closed intervals for the rows:

CSR 3 array:

row i is found on [ pointerB, pointerB[i+1] ) in columns and values arrays

CSR 4 array: (more general)

row i is found on [ pointerB, pointerE ) in columns and values arrays

 

As you note, it does get a little more complicated with one based indexing.  If  we use the variable ind=0 or 1  to denote the indexing, then the general form is as follows:

  • pointerB-ind = first element of row i in columns and values arrays
  • pointerE-ind = one past the last element of row i in columns and values arrays
  • columns = ind-based column index value (columns-ind is 0 based column index value)
  • values = value of matrix

so that

CSR 3 array:

row i is found on [ pointerB-ind, pointerB[i+1]-ind ) in columns and values arrays (with columns stored in base ind fomat)

CSR 4 array: (more general)

row i is found on [ pointerB-ind, pointerE-ind ) in columns and values arrays (with columns stored in base ind fomat)

View solution in original post

kup__benny
Beginner
82 Views

Thank you Spencer,

Now everything is clear