<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic LAPACK vs ScaLAPACK in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/LAPACK-vs-ScaLAPACK/m-p/858406#M1575</link>
    <description>Hello,&lt;BR /&gt;&lt;BR /&gt;I would like to ask which approach, in your opinion, would be more efficient:&lt;BR /&gt;&lt;BR /&gt;1) Use a grid tool to create a virtual supercomputer from networked desktops, with LAPACK functions from MKL (does LAPACK automatically scale code to n processors/cores?)&lt;BR /&gt;&lt;BR /&gt;2) Use a cluster built from networked desktops, with ScaLAPACK functions and MPI&lt;BR /&gt;&lt;BR /&gt;Thank you for your answer, and best wishes</description>
    <pubDate>Tue, 30 Mar 2010 07:13:02 GMT</pubDate>
    <dc:creator>rabbitsoft</dc:creator>
    <dc:date>2010-03-30T07:13:02Z</dc:date>
    <item>
      <title>LAPACK vs ScaLAPACK</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/LAPACK-vs-ScaLAPACK/m-p/858406#M1575</link>
      <description>Hello,&lt;BR /&gt;&lt;BR /&gt;I would like to ask which approach, in your opinion, would be more efficient:&lt;BR /&gt;&lt;BR /&gt;1) Use a grid tool to create a virtual supercomputer from networked desktops, with LAPACK functions from MKL (does LAPACK automatically scale code to n processors/cores?)&lt;BR /&gt;&lt;BR /&gt;2) Use a cluster built from networked desktops, with ScaLAPACK functions and MPI&lt;BR /&gt;&lt;BR /&gt;Thank you for your answer, and best wishes</description>
      <pubDate>Tue, 30 Mar 2010 07:13:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/LAPACK-vs-ScaLAPACK/m-p/858406#M1575</guid>
      <dc:creator>rabbitsoft</dc:creator>
      <dc:date>2010-03-30T07:13:02Z</dc:date>
    </item>
    <item>
      <title>LAPACK vs ScaLAPACK</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/LAPACK-vs-ScaLAPACK/m-p/858407#M1576</link>
      <description>&lt;DIV id="tiny_quote"&gt;
                &lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=469098" class="basic" href="https://community.intel.com/en-us/profile/469098/"&gt;rabbitsoft&lt;/A&gt;&lt;/DIV&gt;
                &lt;DIV style="background-color: #e5e5e5; padding: 5px; border: 1px inset; margin-left: 2px; margin-right: 2px;"&gt;&lt;I&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I would like to ask which approach, in your opinion, would be more efficient:&lt;BR /&gt;&lt;BR /&gt;1) Use a grid tool to create a virtual supercomputer from networked desktops, with LAPACK functions from MKL (does LAPACK automatically scale code to n processors/cores?)&lt;BR /&gt;&lt;BR /&gt;2) Use a cluster built from networked desktops, with ScaLAPACK functions and MPI&lt;BR /&gt;&lt;BR /&gt;Thank you for your answer, and best wishes&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For Option (1)&lt;/P&gt;&lt;P&gt; MKL parallelizes LAPACK with threads, which share memory within a single machine. If the machines are SMP, the job will scale to the number of cores that one machine has. Ultimately your job will run on one machine with number of threads = number of cores, so clustering does not improve performance here. &lt;/P&gt;&lt;P&gt;For Option (2)&lt;/P&gt;&lt;P&gt; Because you are using ScaLAPACK and MPI here, the job is spread across multiple machines of the cluster. If the machines are SMP boxes, you can combine MPI with threading to get the best performance. For example, if the cluster has 5 machines and each is a quad-core SMP, you can run the job with&lt;/P&gt;&lt;P&gt;(a) 5 MPI processes on 5 machines and 4 MKL threads per MPI process. Each machine will have 1 MPI process and 4 MKL threads.&lt;/P&gt;&lt;P&gt; OR&lt;/P&gt;&lt;P&gt;(b) 10 MPI processes on 5 machines and 2 MKL threads per MPI process. Each machine will have 2 MPI processes and 4 MKL threads in total.&lt;/P&gt;&lt;P&gt;Use the -machinefile option of mpirun/mpiexec to control the number of MPI processes on each machine, and MKL_NUM_THREADS=&amp;lt;4 or 2&amp;gt; to control the number of MKL threads per process.&lt;/P&gt;&lt;P&gt;So I suggest you go with Option (2) and evaluate the performance of both (a) and (b).&lt;/P&gt;</description>
      <pubDate>Mon, 05 Apr 2010 05:40:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/LAPACK-vs-ScaLAPACK/m-p/858407#M1576</guid>
      <dc:creator>Sangamesh_B_</dc:creator>
      <dc:date>2010-04-05T05:40:20Z</dc:date>
    </item>
  </channel>
</rss>

