<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850042#M6491</link>
    <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/296294"&gt;amolins@mit.edu&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
I tried with the latest MVAPICH (MPICH is WAY slower than that), and did manage to get it to fail with the latest release, 1.1. It does work with the nightly build of 2009-10-09, so I will claim it was the MVAPICH library's fault.
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;Now, do you have access to any benchmark for MKL's ScaLAPACK? I am getting an efficiency of roughly 40% when benchmarking DGEMM, DGETRF/DGETRI, and DPOTRF/DPOTRI. Is that normal?&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;A&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;
&lt;DIV&gt;My link line is: &lt;SPAN style="font-family: Menlo, sans-serif;"&gt;MKL_LNK = $(MKLPATH)/libmkl_scalapack_lp64.a -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_sequential.a $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_lp64.a -Wl,--end-group -lpthread&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Menlo, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;This is 64-bit Linux.&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;I just made the same program crash again by exploring the parameter space carefully. Matrix side 10000 makes the thing crash for 36 cores.&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;Are you saying that MVAPICH is not supported? How can I use InfiniBand then?&lt;/DIV&gt;</description>
    <pubDate>Wed, 14 Oct 2009 22:19:19 GMT</pubDate>
    <dc:creator>amolins</dc:creator>
    <dc:date>2009-10-14T22:19:19Z</dc:date>
    <item>
      <title>MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850039#M6488</link>
      <description>The following code can generate a crash using MKL 10.2 update 2 (both the sequential and threaded versions) and the latest revision of MVAPICH, on two different clusters. Can anybody tell me what the problem is here? It does not always crash, but it does crash when the right number of MPI processes and matrix size are selected.
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;A&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;/*&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;* crash.cpp - crashes with ICC 11.1, MKL 10.2, MVAPICH 1.0 on linux 64-bit&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;*&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;both linked with the serial or threaded libraries&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;*&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;doing mpirun -np 36 crash 5000 10&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;*/&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;#include &lt;SPAN style="color: #e81d16;"&gt;&lt;STDIO.H&gt;&lt;/STDIO.H&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #787996;"&gt;#include &lt;/SPAN&gt;&lt;STDLIB.H&gt;&lt;/STDLIB.H&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #787996;"&gt;#include &lt;/SPAN&gt;&lt;STRING.H&gt;&lt;/STRING.H&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;#include &lt;SPAN style="color: #e81d16;"&gt;&lt;MATH.H&gt;&lt;/MATH.H&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;#include &lt;SPAN style="color: #e81d16;"&gt;"mpi.h"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #787996;"&gt;#include &lt;/SPAN&gt;"mkl_scalapack.h"&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #365687;"&gt;extern&lt;SPAN style="color: #000000;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #e81d16;"&gt;"C"&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt; {&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;/* BLACS C interface */&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;void&lt;/SPAN&gt; Cblacs_get( &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; context, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; request, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* value);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; Cblacs_gridinit( &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* context, &lt;SPAN style="color: #365687;"&gt;char&lt;/SPAN&gt; * order, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; np_row, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; np_col);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;void&lt;/SPAN&gt; Cblacs_gridinfo( &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; context, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* np_row, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* np_col, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* my_row, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;* my_col);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; numroc_( &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *n, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *nb, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *iproc, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *isrcproc, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *nprocs);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;/* PBLAS */&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;void&lt;/SPAN&gt; pdgemm_(&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;char&lt;/SPAN&gt; *TRANSA, &lt;SPAN style="color: #365687;"&gt;char&lt;/SPAN&gt; *TRANSB, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * M, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * N, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * K, &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; * ALPHA,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; * A, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * IA, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * JA, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * DESCA, &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; * B, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * IB, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * JB, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * DESCB,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; * BETA, &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; * C, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * IC, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * JC, &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; * DESCC );&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;}&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;#define BLOCK_SIZE &lt;SPAN style="color: #365687;"&gt;65&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; main( &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; argc, &lt;SPAN style="color: #365687;"&gt;char&lt;/SPAN&gt;* argv[] )&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;{&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; iam, nprocs;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;MPI_Init&lt;/SPAN&gt;(&amp;amp;argc,&amp;amp;argv);    &lt;SPAN style="color: #cf8635;"&gt;/* starts MPI */&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;MPI_Comm_rank&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;MPI_COMM_WORLD&lt;SPAN style="color: #000000;"&gt;, &amp;amp;iam);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;MPI_Comm_size&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;MPI_COMM_WORLD&lt;SPAN style="color: #000000;"&gt;, &amp;amp;nprocs);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// get done with the ones that are not part of the grid&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; blacs_pgrid_size = &lt;SPAN style="color: #587fa5;"&gt;floor&lt;/SPAN&gt;(&lt;SPAN style="color: #587fa5;"&gt;sqrt&lt;/SPAN&gt;(nprocs));&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt; (iam&amp;gt;=blacs_pgrid_size*blacs_pgrid_size) {&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;"Bye bye world from process %d of %d. BLACS had no place for me...\n"&lt;SPAN style="color: #000000;"&gt;,iam,nprocs);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #587fa5;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;MPI_Finalize&lt;SPAN style="color: #000000;"&gt;();&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;}&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// start BLACS with square processor grid&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt;(iam==&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;"starting BLACS..."&lt;SPAN style="color: #000000;"&gt;);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; ictxt,nprow,npcol,myrow,mycol;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;Cblacs_get&lt;/SPAN&gt;( -&lt;SPAN style="color: #365687;"&gt;1&lt;/SPAN&gt;, &lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;, &amp;amp;ictxt );&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;Cblacs_gridinit&lt;/SPAN&gt;( &amp;amp;ictxt, &lt;SPAN style="color: #e81d16;"&gt;"C"&lt;/SPAN&gt;, blacs_pgrid_size, blacs_pgrid_size );&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;Cblacs_gridinfo&lt;/SPAN&gt;( ictxt, &amp;amp;nprow, &amp;amp;npcol, &amp;amp;myrow, &amp;amp;mycol );&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt;(iam==&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;"done.\n"&lt;SPAN style="color: #000000;"&gt;);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; timing;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; m,n,k,lm,ln,nbm,nbn,rounds;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; myzero=&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;,myone=&lt;SPAN style="color: #365687;"&gt;1&lt;/SPAN&gt;;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;sscanf&lt;/SPAN&gt;(argv[&lt;SPAN style="color: #365687;"&gt;1&lt;/SPAN&gt;],&lt;SPAN style="color: #e81d16;"&gt;"%d"&lt;/SPAN&gt;,&amp;amp;m);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;n=m;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;k=m;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;sscanf&lt;/SPAN&gt;(argv[&lt;SPAN style="color: #365687;"&gt;2&lt;/SPAN&gt;],&lt;SPAN style="color: #e81d16;"&gt;"%d"&lt;/SPAN&gt;,&amp;amp;rounds);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;nbm = &lt;/SPAN&gt;BLOCK_SIZE&lt;SPAN style="color: #000000;"&gt;;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #787996;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;nbn = &lt;/SPAN&gt;BLOCK_SIZE&lt;SPAN style="color: #000000;"&gt;;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;lm = &lt;SPAN style="color: #587fa5;"&gt;numroc_&lt;/SPAN&gt;(&amp;amp;m, &amp;amp;nbm, &amp;amp;myrow, &amp;amp;myzero, &amp;amp;nprow);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;ln = &lt;SPAN style="color: #587fa5;"&gt;numroc_&lt;/SPAN&gt;(&amp;amp;n, &amp;amp;nbn, &amp;amp;mycol, &amp;amp;myzero, &amp;amp;npcol);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; info;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt; *ipiv = &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;[lm+nbm+&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;10000000&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;]; &lt;/SPAN&gt;//adding a "little" bit of extra space just in case&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;char&lt;/SPAN&gt; ta = &lt;SPAN style="color: #365687;"&gt;'N'&lt;/SPAN&gt;,tb = &lt;SPAN style="color: #365687;"&gt;'T'&lt;/SPAN&gt;;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt; alpha = &lt;SPAN style="color: #365687;"&gt;1.0&lt;/SPAN&gt;, beta = &lt;SPAN style="color: #365687;"&gt;0.0&lt;/SPAN&gt;;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;* test1data = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;[lm*ln];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;* test2data = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;[lm*ln];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;* test3data = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;[lm*ln];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;for&lt;/SPAN&gt;(&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; i=&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;;i&lt;LM&gt;
&lt;/LM&gt;&lt;/P&gt;&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1data&lt;I&gt;=(&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;)(&lt;SPAN style="color: #587fa5;"&gt;rand&lt;/SPAN&gt;()%&lt;SPAN style="color: #365687;"&gt;100&lt;/SPAN&gt;)/&lt;SPAN style="color: #365687;"&gt;10000.0&lt;/SPAN&gt;;&lt;/I&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *test1desc = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;[&lt;SPAN style="color: #365687;"&gt;9&lt;/SPAN&gt;];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *test2desc = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;[&lt;SPAN style="color: #365687;"&gt;9&lt;/SPAN&gt;];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; *test3desc = &lt;SPAN style="color: #365687;"&gt;new&lt;/SPAN&gt; &lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;[&lt;SPAN style="color: #365687;"&gt;9&lt;/SPAN&gt;];&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;] = &lt;SPAN style="color: #365687;"&gt;1&lt;/SPAN&gt;;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #cf8635;"&gt;// descriptor type &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;SPAN style="color: #365687;"&gt;1&lt;/SPAN&gt;] = ictxt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #cf8635;"&gt;// blacs context &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;2&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = m;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// global number of rows&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;3&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = n;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// global number of columns&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;SPAN style="color: #365687;"&gt;4&lt;/SPAN&gt;] = nbm;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #cf8635;"&gt;// row block size &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;5&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = nbn;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// column block size (DEFINED EQUAL THAN ROW BLOCK SIZE)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;6&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// initial process row(DEFINED 0)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;7&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// initial process column (DEFINED 0)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1desc[&lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;8&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;] = lm;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;// leading dimension of local array&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;memcpy&lt;/SPAN&gt;(test2desc,test1desc,&lt;SPAN style="color: #365687;"&gt;9&lt;/SPAN&gt;*&lt;SPAN style="color: #365687;"&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;));&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;memcpy&lt;/SPAN&gt;(test3desc,test1desc,&lt;SPAN style="color: #365687;"&gt;9&lt;/SPAN&gt;*&lt;SPAN style="color: #365687;"&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt;));&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;for&lt;/SPAN&gt;(&lt;SPAN style="color: #365687;"&gt;int&lt;/SPAN&gt; iter=&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;;iter&lt;ROUNDS&gt;
&lt;/ROUNDS&gt;&lt;/P&gt;&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;{&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt;(iam==&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;"iter %i - "&lt;SPAN style="color: #000000;"&gt;,iter);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;//test2 = test1&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;memcpy&lt;/SPAN&gt;(test2data,test1data,lm*ln*&lt;SPAN style="color: #365687;"&gt;sizeof&lt;/SPAN&gt;(&lt;SPAN style="color: #365687;"&gt;double&lt;/SPAN&gt;));&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;//test3 = test1*test2&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;timing=&lt;SPAN style="color: #587fa5;"&gt;MPI_Wtime&lt;/SPAN&gt;();&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;pdgemm_&lt;/SPAN&gt;(&amp;amp;ta,&amp;amp;tb,&amp;amp;m,&amp;amp;n,&amp;amp;k,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;α,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test1data,&amp;amp;myone,&amp;amp;myone,test1desc,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test2data,&amp;amp;myone,&amp;amp;myone, test2desc,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;β,&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;test3data,&amp;amp;myone,&amp;amp;myone, test3desc);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt;(iam==&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;(&lt;SPAN style="color: #e81d16;"&gt;" PDGEMM = %f |"&lt;/SPAN&gt;,&lt;SPAN style="color: #587fa5;"&gt;MPI_Wtime&lt;/SPAN&gt;()-timing);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #cf8635;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;//test3 = LU(test3)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;timing=&lt;SPAN style="color: #587fa5;"&gt;MPI_Wtime&lt;/SPAN&gt;();&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;pdgetrf_&lt;/SPAN&gt;(&amp;amp;m, &amp;amp;n, test3data, &amp;amp;myone, &amp;amp;myone, test3desc, ipiv, &amp;amp;info);&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;if&lt;/SPAN&gt;(iam==&lt;SPAN style="color: #365687;"&gt;0&lt;/SPAN&gt;)&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #e81d16;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;printf&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;(&lt;/SPAN&gt;" PDGETRF = %f.\n"&lt;SPAN style="color: #000000;"&gt;,&lt;/SPAN&gt;&lt;SPAN style="color: #587fa5;"&gt;MPI_Wtime&lt;/SPAN&gt;&lt;SPAN style="color: #000000;"&gt;()-timing);&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;}&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;delete&lt;/SPAN&gt;[] ipiv;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;delete&lt;/SPAN&gt;[] test1data, test2data, test3data;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;SPAN style="color: #365687;"&gt;delete&lt;/SPAN&gt;[] test1desc, test2desc, test3desc;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; min-height: 13.0px;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #587fa5;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;MPI_Finalize&lt;SPAN style="color: #000000;"&gt;();&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #365687;"&gt;&lt;SPAN style="color: #000000;"&gt;&lt;SPAN style="white-space: pre;"&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;return&lt;SPAN style="color: #000000;"&gt; &lt;/SPAN&gt;0&lt;SPAN style="color: #000000;"&gt;;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo;"&gt;}&lt;/P&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 09 Oct 2009 22:57:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850039#M6488</guid>
      <dc:creator>amolins</dc:creator>
      <dc:date>2009-10-09T22:57:14Z</dc:date>
    </item>
    <item>
      <title>Re: MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850040#M6489</link>
      <description>&lt;BR /&gt;
&lt;P&gt;A,&lt;BR /&gt;Can you check the problem with another MPI, such as MPICH 1.2.x or MPICH2 1.1.x, which are officially validated with Intel MKL?&lt;BR /&gt;What is your link line? Please check the Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;BR /&gt;--Gennady&lt;/P&gt;</description>
      <pubDate>Sun, 11 Oct 2009 07:16:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850040#M6489</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2009-10-11T07:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850041#M6490</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/334681"&gt;Gennady Fedorov (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;&lt;BR /&gt;
&lt;P&gt;A,&lt;BR /&gt;Can you check the problem with another MPI, such as MPICH 1.2.x or MPICH2 1.1.x, which are officially validated with Intel MKL?&lt;BR /&gt;What is your link line? Please check the Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;BR /&gt;--Gennady&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
I tried with the latest MVAPICH (MPICH is WAY slower than that), and did manage to get it to fail with the latest release, 1.1. It does work with the nightly build of 2009-10-09, so I will claim it was the MVAPICH library's fault.
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;Now, do you have access to any MKL ScaLAPACK benchmark? I am getting an efficiency of roughly 40% when benchmarking DGEMM, DGETRF/DGETRI, and DPOTRF/DPOTRI. Is that normal?&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;A&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Oct 2009 21:31:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850041#M6490</guid>
      <dc:creator>amolins</dc:creator>
      <dc:date>2009-10-14T21:31:53Z</dc:date>
    </item>
    <item>
      <title>Re: MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850042#M6491</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/296294"&gt;amolins@mit.edu&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
I tried with the latest MVAPICH (MPICH is WAY slower than that), and did manage to get it to fail with the latest release, 1.1. It does work with the nightly build of 2009-10-09, so I will claim it was the MVAPICH library's fault.
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;Now, do you have access to any MKL ScaLAPACK benchmark? I am getting an efficiency of roughly 40% when benchmarking DGEMM, DGETRF/DGETRI, and DPOTRF/DPOTRI. Is that normal?&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;A&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;
&lt;DIV&gt;My link line is: &lt;SPAN style="font-family: Menlo, sans-serif;"&gt;MKL_LNK = $(MKLPATH)/libmkl_scalapack_lp64.a -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_sequential.a $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_lp64.a -Wl,--end-group -lpthread&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Menlo, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;This is 64-bit Linux.&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;I just made the same program crash again by exploring the parameter space carefully. A matrix side of 10000 makes it crash on 36 cores.&lt;/DIV&gt;
&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV&gt;Are you saying that MVAPICH is not supported? How can I use InfiniBand then?&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Oct 2009 22:19:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850042#M6491</guid>
      <dc:creator>amolins</dc:creator>
      <dc:date>2009-10-14T22:19:19Z</dc:date>
    </item>
    <item>
      <title>Re: MKL ScaLAPACK + MVAPICH + 100 lines of code = CRASH</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850043#M6492</link>
      <description>&lt;BR /&gt;
&lt;P&gt;&lt;STRONG&gt;Now, do you have access to any MKL ScaLAPACK benchmark? I am getting an efficiency of roughly 40% when benchmarking DGEMM, DGETRF/DGETRI, and DPOTRF/DPOTRI. Is that normal?&lt;BR /&gt;&lt;/STRONG&gt;No, we don't provide access to this benchmark.&lt;BR /&gt;You are getting 40% efficiency on 36 cores with a matrix size of 10000. This is an expected result. Could you check the efficiency on 8 or 16 cores? It should be better.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Are you saying that MVAPICH is not supported? How can I use InfiniBand then?&lt;BR /&gt;&lt;/STRONG&gt;Other MPI implementations, such as Open MPI and Intel MPI, support InfiniBand.&lt;/P&gt;
&lt;P&gt;--Gennady&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2009 08:31:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-ScaLAPACK-MVAPICH-100-lines-of-code-CRASH/m-p/850043#M6492</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2009-10-16T08:31:41Z</dc:date>
    </item>
  </channel>
</rss>

