Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Blocks of different sizes in ScaLAPACK?

Georgios_S_
New Contributor II

I am performing a Cholesky factorization with Intel MKL, which uses ScaLAPACK. I distributed the matrix based on this example, where the matrix is distributed in blocks of equal size (i.e. Nb x Mb). I tried to make it so that every block has its own size, depending on the process it belongs to, so that I can experiment more and maybe get better performance.

Check this question to get a better understanding of what I am saying. I won't post my code, since it's too big (yes, the minimal example is too big too, I checked) and the distribution seems to work well. However, *ScaLAPACK seems to assume that the matrix is distributed in blocks of equal size?*

For example, I am using this:

    int nrows = numroc_(&N, &Nb, &myrow, &iZERO, &procrows);
    int ncols = numroc_(&M, &Mb, &mycol, &iZERO, &proccols);

where (taken from the manual):

> NB        (global input) INTEGER
>            Block size, size of the blocks the distributed matrix is
>            split into.

So, ***does ScaLAPACK allow distributed matrices with non-equal block sizes?***

---

If I print information like this, for an 8x8 matrix:

    std::cout << Nb << " " << Mb << " " << nrows << " " << ncols << " " << myid << std::endl;
    
I am getting this:

    3 3 5 5 0
    1 1 4 4 1
    1 1 4 4 2
    1 1 4 4 3

and, by just swapping the first two block sizes, this:

    1 1 4 4 0
    3 3 5 3 1
    1 1 4 4 2
    1 1 4 4 3

which doesn't make sense for an 8x8 matrix.

1 Solution
Ying_H_Intel
Employee

Hi George, 

I am not sure I understand your question correctly: ***does ScaLAPACK allow distributed matrices with non-equal block sizes?***

Basically, no, we do not support variable block sizes in the ScaLAPACK functions. The question is at which step you change the block size, why you need to change it, and what the benefit would be.

As you can see, the ScaLAPACK functions use descA to pass the local matrix size, location, etc. The block size stays Nb x Mb throughout the following processing:

    int nrows = numroc_(&N, &Nb, &myrow, &iZERO, &procrows);
    int ncols = numroc_(&M, &Mb, &mycol, &iZERO, &proccols);

A_loc gets its size and values based on nrows and ncols:

    descinit_(descA, &M, &N, &Mb, &Nb, &i_zero, &i_zero, &ctxt, &lld, &info);
    pdpotrf_("L", &N, A_loc, &i_one, &i_one, descA, &info);

    descA[0]  // descriptor type
    descA[1]  // BLACS context
    descA[2]  // global number of rows
    descA[3]  // global number of columns
    descA[4]  // row block size
    descA[5]  // column block size (defined equal to the row block size)
    descA[6]  // first process row (defined 0)
    descA[7]  // first process column (defined 0)
    descA[8]  // leading dimension of the local array

Do you remember the distribution image I attached? The block size is not restricted to 1x1; it can be anything up to the total matrix size, for example mb x nb = 2x2 on a grid of 4 processes.

For example (see the attached Scalapck.png), the line

    3 3 5 5 0

means the blocks are 3x3, the process is at grid position (0, 0), and the local matrix size is 5x5.

The values in the local matrix are:

     1  1  2  4  4
     1  1  2  4  4
     6  6  7  9  9
    16 16 17 19 19
    16 16 17 19 19

 

    1 1 4 4 1

means the blocks are 1x1, the process is at grid position (0, 1), and the local matrix size is 4x4.

The values are:

    3 3 4 4
    3 3 4 4
    8 8 9 9
    8 8 9 9

So you should be able to work out what 3 3 5 3 1 means: varying block sizes cannot split the matrix correctly.

Best Regards,
Ying 

 

 
