<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Blocks of different sizes in ScaLAPACK? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Blocks-of-different-sizes-in-ScaLAPACK/m-p/1023815#M19799</link>
    <description>&lt;P&gt;I am performing a Cholesky factorization with Intel MKL, which uses ScaLAPACK. I distributed the matrix following this &lt;A href="https://andyspiros.wordpress.com/2011/07/08/an-example-of-blacs-with-c/"&gt;example&lt;/A&gt;, where the matrix is distributed in blocks of equal size (i.e. Nb x Mb). I tried to give every block its own size, depending on which process it belongs to, so that I can experiment more and maybe get better performance.&lt;/P&gt;

</description>
    <pubDate>Thu, 23 Jul 2015 14:56:52 GMT</pubDate>
    <dc:creator>Georgios_S_</dc:creator>
    <dc:date>2015-07-23T14:56:52Z</dc:date>
    <item>
      <title>Blocks of different sizes in ScaLAPACK?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Blocks-of-different-sizes-in-ScaLAPACK/m-p/1023815#M19799</link>
      <description>&lt;P&gt;I am performing a Cholesky factorization with Intel MKL, which uses ScaLAPACK. I distributed the matrix following this &lt;A href="https://andyspiros.wordpress.com/2011/07/08/an-example-of-blacs-with-c/"&gt;example&lt;/A&gt;, where the matrix is distributed in blocks of equal size (i.e. Nb x Mb). I tried to give every block its own size, depending on which process it belongs to, so that I can experiment more and maybe get better performance.&lt;/P&gt;

&lt;P&gt;See this &lt;A href="http://stackoverflow.com/questions/29325513/scatter-matrix-blocks-of-different-sizes-using-mpi"&gt;question&lt;/A&gt; for a better understanding of what I mean. I won't post my code, since it is too big (even the minimal example is too big, I checked), and the distribution itself seems to work well. However, &lt;EM&gt;ScaLAPACK seems to assume that the matrix is distributed in blocks of equal size?&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;For example, I am using this:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; int nrows = numroc_(&amp;amp;N, &amp;amp;Nb, &amp;amp;myrow, &amp;amp;iZERO, &amp;amp;procrows);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; int ncols = numroc_(&amp;amp;M, &amp;amp;Mb, &amp;amp;mycol, &amp;amp;iZERO, &amp;amp;proccols);&lt;/P&gt;

&lt;P&gt;where (taken from the &lt;A href="https://computing.llnl.gov/tutorials/allineaDDT/programs/trisol/numroc.f"&gt;manual&lt;/A&gt;):&lt;/P&gt;

&lt;BLOCKQUOTE&gt;&lt;P&gt;NB &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(global input) INTEGER&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Block size, size of the blocks the distributed matrix is split into.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;

&lt;P&gt;So, &lt;STRONG&gt;does ScaLAPACK allow distributed matrices with non-equal block sizes?&lt;/STRONG&gt;&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;If I print information like this, for an 8x8 matrix:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; std::cout &amp;lt;&amp;lt; Nb &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; Mb &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; nrows &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; ncols &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; myid &amp;lt;&amp;lt; std::endl;&lt;/P&gt;

&lt;P&gt;I am getting this:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; 3 3 5 5 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 1 1 4 4 1&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 1 1 4 4 2&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 1 1 4 4 3&lt;/P&gt;

&lt;P&gt;and, just by swapping the block sizes of the first two processes, this:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; 1 1 4 4 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 3 3 5 3 1&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 1 1 4 4 2&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; 1 1 4 4 3&lt;/P&gt;

&lt;P&gt;which doesn't make sense for an 8x8 matrix.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jul 2015 14:56:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Blocks-of-different-sizes-in-ScaLAPACK/m-p/1023815#M19799</guid>
      <dc:creator>Georgios_S_</dc:creator>
      <dc:date>2015-07-23T14:56:52Z</dc:date>
    </item>
    <item>
      <title>Hi George, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Blocks-of-different-sizes-in-ScaLAPACK/m-p/1023816#M19800</link>
      <description>&lt;P&gt;Hi George,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I am not sure I understand your question correctly: &lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;&lt;STRONG&gt;does ScaLAPACK allow distributed matrices with non-equal block sizes?&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;Basically, no: we don't support variable block sizes in ScaLAPACK functions. The questions are at which step you change the block size, why you need to change it, and what benefit you expect.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;As you can see, the ScaLAPACK functions use descA to pass the matrix sizes, block sizes, location, etc. The block size stays Nb x Mb throughout the processing below:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; int nrows = numroc_(&amp;amp;N, &amp;amp;Nb, &amp;amp;myrow, &amp;amp;iZERO, &amp;amp;procrows);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; int ncols = numroc_(&amp;amp;M, &amp;amp;Mb, &amp;amp;mycol, &amp;amp;iZERO, &amp;amp;proccols);&lt;/P&gt;

&lt;P&gt;A_loc gets its size and values based on&amp;nbsp;&lt;SPAN style="font-size: 13.0080003738403px; line-height: 19.5120010375977px;"&gt;nrows&lt;/SPAN&gt;&amp;nbsp;and&amp;nbsp;&lt;SPAN style="font-size: 13.0080003738403px; line-height: 19.5120010375977px;"&gt;ncols&lt;/SPAN&gt;:&lt;/P&gt;

&lt;P&gt;descinit_(&lt;STRONG&gt;descA&lt;/STRONG&gt;, &amp;amp;M, &amp;amp;N, &amp;amp;Mb, &amp;amp;Nb, &amp;amp;i_zero, &amp;amp;i_zero,&amp;amp;ctxt,&amp;amp;lld, &amp;amp;info);&lt;BR /&gt;
	pdpotrf_("L", &amp;amp;N, A_loc, &amp;amp;i_one, &amp;amp;i_one, descA, &amp;amp;info);&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; descA[0] // descriptor type&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[1] // BLACS context&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[2] // global number of rows&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[3] // global number of columns&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[4] // row block size&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[5] // column block size (defined equal to the row block size)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[6] // initial process row (defined as 0)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[7] // initial process column (defined as 0)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; descA[8] // leading dimension of local array&lt;/P&gt;
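
&lt;P&gt;To make the role of those nine slots concrete, here is a minimal sketch in C of a helper that fills such a descriptor by hand. The name fill_desc_sketch and the enum constants are hypothetical and used only for illustration; the sketch skips the argument checking that a real descinit_ call performs, so real code should keep calling descinit_. The point is that the descriptor holds exactly one row block size and one column block size for the whole matrix, with no per-process entry.&lt;/P&gt;

&lt;PRE&gt;
```c
/* Hypothetical helper for illustration only; real code should call descinit_. */
enum { DTYPE = 0, CTXT = 1, M_GLB = 2, N_GLB = 3, MB = 4, NB = 5,
       RSRC = 6, CSRC = 7, LLD = 8 };

void fill_desc_sketch(int desc[9], int ctxt, int m, int n,
                      int mb, int nb, int lld)
{
    desc[DTYPE] = 1;     /* descriptor type: 1 marks a dense block-cyclic matrix */
    desc[CTXT]  = ctxt;  /* BLACS context */
    desc[M_GLB] = m;     /* global number of rows */
    desc[N_GLB] = n;     /* global number of columns */
    desc[MB]    = mb;    /* row block size, a single value for all processes */
    desc[NB]    = nb;    /* column block size, a single value for all processes */
    desc[RSRC]  = 0;     /* initial process row */
    desc[CSRC]  = 0;     /* initial process column */
    desc[LLD]   = lld;   /* leading dimension of the local array */
}
```
&lt;/PRE&gt;

&lt;P&gt;Every process is expected to pass the same values for the global sizes and block sizes; only the leading dimension may differ from process to process.&lt;/P&gt;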

&lt;P&gt;Do you remember the distribution image I attached? The block size does not have to be 1x1; it can be any size smaller than the total matrix size, for example mb x nb = 2x2 on a grid of 4 processes.&lt;/P&gt;

&lt;P&gt;For example:&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Scalapck.png"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/7759iECCC02080C127ADD/image-size/large?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Scalapck.png" alt="Scalapck.png" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;3 3 5 5 0&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;means the block is 3x3 on grid position (0, 0), and the local matrix size is 5x5. The values in the local matrix are:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;1 &amp;nbsp;1 &amp;nbsp;2 &amp;nbsp;4 &amp;nbsp;4&lt;BR /&gt;
	1 &amp;nbsp;1 &amp;nbsp;2 &amp;nbsp;4 &amp;nbsp;4&lt;BR /&gt;
	6 &amp;nbsp;6 &amp;nbsp;7 &amp;nbsp;9 &amp;nbsp;9&lt;BR /&gt;
	16 &amp;nbsp;16 &amp;nbsp;17 &amp;nbsp;19 &amp;nbsp;19&lt;BR /&gt;
	16 &amp;nbsp;16 &amp;nbsp;17 &amp;nbsp;19 &amp;nbsp;19&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;1 1 4 4 1&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;means the block is 1x1 on grid position (0, 1), and the local matrix size is 4x4. The values are:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;3 &amp;nbsp;3 &amp;nbsp;4 &amp;nbsp;4&lt;BR /&gt;
	3 &amp;nbsp;3 &amp;nbsp;4 &amp;nbsp;4&lt;BR /&gt;
	8 &amp;nbsp;8 &amp;nbsp;9 &amp;nbsp;9&lt;BR /&gt;
	8 &amp;nbsp;8 &amp;nbsp;9 &amp;nbsp;9&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;So you should now be able to work out what 3 3 5 3 1 means, and see that a block size that varies from process to process cannot split the matrix correctly.&lt;/SPAN&gt;&lt;/P&gt;
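
&lt;P&gt;The counts in those printouts can be reproduced with a brute-force sketch of the block-cyclic ownership rule. This is not the numroc_ source, only an illustration that should return the same count when the first block sits on process 0 (as with iZERO above); the name local_count_sketch is hypothetical.&lt;/P&gt;

&lt;PRE&gt;
```c
/* Counts how many of the n global rows (or columns) land on process iproc
   when blocks of nb consecutive indices are dealt out round-robin over
   nprocs processes, starting at process 0. Brute force, for illustration. */
int local_count_sketch(int n, int nb, int iproc, int nprocs)
{
    int count = 0;
    for (int g = 0; g != n; ++g) {
        int block = g / nb;              /* global block that holds index g */
        if (block % nprocs == iproc)     /* process that owns that block */
            ++count;
    }
    return count;
}
```
&lt;/PRE&gt;

&lt;P&gt;For the 8x8 matrix on a 2x2 grid, local_count_sketch(8, 3, 0, 2) gives 5 and local_count_sketch(8, 3, 1, 2) gives 3, while a block size of 1 gives 4 for either coordinate. So in the second printout process 0, using block size 1, claims 4 columns and process 1, using block size 3, claims 3 columns: 7 in total instead of 8. The ownership rule only partitions the matrix when every process plugs in the same block size.&lt;/P&gt;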

&lt;P&gt;Best Regards,&lt;BR /&gt;
	Ying&amp;nbsp;&lt;/P&gt;

</description>
      <pubDate>Fri, 24 Jul 2015 07:29:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Blocks-of-different-sizes-in-ScaLAPACK/m-p/1023816#M19800</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2015-07-24T07:29:39Z</dc:date>
    </item>
  </channel>
</rss>

