Intel® oneAPI Math Kernel Library

## Blocks of different sizes in ScaLAPACK?

I am performing a Cholesky factorization with Intel MKL, which uses ScaLAPACK. I distributed the matrix based on this example, where the matrix is split into blocks of equal size (i.e. Nb x Mb). I tried to make every block have its own size, depending on which process it belongs to, so that I can experiment more and maybe get better performance.

See this question for a better understanding of what I mean. I won't post my code, since it's too big (yes, the minimal example is too big too, I checked) and the distribution seems to work well. However, *ScaLAPACK seems to assume that the matrix is distributed in blocks of equal size?*

For example, I am using this:

```c
int nrows = numroc_(&N, &Nb, &myrow, &iZERO, &procrows);
int ncols = numroc_(&M, &Mb, &mycol, &iZERO, &proccols);
```

where (taken from the manual):

> NB        (global input) INTEGER
>            Block size, size of the blocks the distributed matrix is
>            split into.

So, ***does ScaLAPACK allow distributed matrices with non-equal block sizes?***

---

If I print information like this, for an 8x8 matrix:

```cpp
std::cout << Nb << " " << Mb << " " << nrows << " " << ncols << " " << myid << std::endl;
```

I am getting this:

```
3 3 5 5 0
1 1 4 4 1
1 1 4 4 2
1 1 4 4 3
```

and, by just swapping the first two block sizes, this:

```
1 1 4 4 0
3 3 5 3 1
1 1 4 4 2
1 1 4 4 3
```

which doesn't make sense for an 8x8 matrix.

---

**1 Solution** (Intel Employee)

Hi George,

I am not sure if I understand your question correctly. You asked: ***does ScaLAPACK allow distributed matrices with non-equal block sizes?***

Basically, no: ScaLAPACK functions do not support a variable block size. The question is at which step you change the block size, why you need to change it, and what the benefit would be.

As you can see, ScaLAPACK functions use descA to pass the local matrix size, location, etc. The block size stays Nb x Mb throughout the processing below:

```c
int nrows = numroc_(&N, &Nb, &myrow, &iZERO, &procrows);
int ncols = numroc_(&M, &Mb, &mycol, &iZERO, &proccols);
```

A_loc gets its size and values based on nrows and ncols:

```c
descinit_(descA, &M, &N, &Mb, &Nb, &i_zero, &i_zero, &ctxt, &lld, &info);
pdpotrf_("L", &N, A_loc, &i_one, &i_one, descA, &info);
```

```c
descA[0] // descriptor type
descA[1] // BLACS context
descA[2] // global number of rows
descA[3] // global number of columns
descA[4] // row block size
descA[5] // column block size (defined equal to the row block size here)
descA[6] // initial process row (defined 0)
descA[7] // initial process column (defined 0)
descA[8] // leading dimension of the local array
```

Do you remember the distribution image I attached? The block size is not 1x1; it can be anything up to the total matrix size, for example mb x nb = 2x2 on a 4-process grid.

For example,

```
3 3 5 5 0
```

means the block size is 3x3, on grid position (0, 0), and the local matrix size is 5x5,

and the values in the local matrix are

```
 1  1  2  4  4
 1  1  2  4  4
 6  6  7  9  9
16 16 17 19 19
16 16 17 19 19
```

```
1 1 4 4 1
```

means the block size is 1x1, on grid position (0, 1), and the local matrix size is 4x4.

The values are

```
3 3 4 4
3 3 4 4
8 8 9 9
8 8 9 9
```

So you should be able to understand what `3 3 5 3 1` means, and why varying block sizes cannot split the matrix correctly.

Best Regards,
Ying