<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to choose the block size and process number? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-choose-the-block-size-and-process-number/m-p/1063888#M21831</link>
<description>&lt;P&gt;I want to diagonalize a large matrix of size about 40000 × 40000.&lt;/P&gt;

&lt;P&gt;Our supercomputer has 80 nodes, each with two eight-core CPUs.&lt;/P&gt;

&lt;P&gt;I think it would be very hard to diagonalize such a large matrix using only the multithreaded LAPACK routines in MKL, so I plan to use ScaLAPACK instead.&lt;/P&gt;

&lt;P&gt;My understanding is that ScaLAPACK in MKL can use both multithreading and multiple processes to speed up the diagonalization. Is that correct?&lt;/P&gt;

&lt;P&gt;Could you please advise how many nodes, and how many cores per node, I should use?&lt;/P&gt;

&lt;P&gt;What block sizes Mb and Nb are appropriate for this problem?&lt;/P&gt;</description>
    <pubDate>Mon, 05 Sep 2016 06:18:59 GMT</pubDate>
    <dc:creator>Ye_C_1</dc:creator>
    <dc:date>2016-09-05T06:18:59Z</dc:date>
    <item>
      <title>How to choose the block size and process number?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-choose-the-block-size-and-process-number/m-p/1063888#M21831</link>
<description>&lt;P&gt;I want to diagonalize a large matrix of size about 40000 × 40000.&lt;/P&gt;

&lt;P&gt;Our supercomputer has 80 nodes, each with two eight-core CPUs.&lt;/P&gt;

&lt;P&gt;I think it would be very hard to diagonalize such a large matrix using only the multithreaded LAPACK routines in MKL, so I plan to use ScaLAPACK instead.&lt;/P&gt;

&lt;P&gt;My understanding is that ScaLAPACK in MKL can use both multithreading and multiple processes to speed up the diagonalization. Is that correct?&lt;/P&gt;

&lt;P&gt;Could you please advise how many nodes, and how many cores per node, I should use?&lt;/P&gt;

&lt;P&gt;What block sizes Mb and Nb are appropriate for this problem?&lt;/P&gt;</description>
      <pubDate>Mon, 05 Sep 2016 06:18:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-choose-the-block-size-and-process-number/m-p/1063888#M21831</guid>
      <dc:creator>Ye_C_1</dc:creator>
      <dc:date>2016-09-05T06:18:59Z</dc:date>
    </item>
    <item>
      <title>Hi,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-choose-the-block-size-and-process-number/m-p/1063889#M21832</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Just a few questions first. Why do you need this functionality? Usually it is needed for an eigensolver problem. What type of matrix do you have (symmetric or nonsymmetric)? And which routines would you like to use? This information will help us give better answers.&lt;/P&gt;

&lt;P&gt;The best approach is to run a few experiments in different modes: a pure MPI version, a few OpenMP threads per process, different NB values, and so on. As far as I know, multithreading is not very efficient for this type of routine in ScaLAPACK. An NB in the range 32-128 is usually a good choice. Another suggestion is to compare the cluster results against a single-node run with LAPACK, since LAPACK is currently more heavily optimized than ScaLAPACK.&lt;/P&gt;
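
&lt;P&gt;As a rough sketch of how such an experiment matrix might be laid out, the following script lists a few candidate process counts, picks a near-square process grid for each, and estimates an upper bound on the local matrix storage per process for several NB values. The helper names process_grid and local_gib are hypothetical, not MKL or ScaLAPACK functions, and the script assumes double-precision real data (8 bytes per element) and square NB x NB blocks. ScaLAPACK's own NUMROC routine gives the exact local dimensions; this is only an estimate for planning runs.&lt;/P&gt;

&lt;PRE&gt;
```python
import math


def process_grid(nprocs):
    """Return the most nearly square (nprow, npcol) with nprow * npcol == nprocs."""
    nprow = math.isqrt(nprocs)
    while nprocs % nprow != 0:
        nprow -= 1
    return nprow, nprocs // nprow


def local_gib(n, nb, nprow, npcol, bytes_per_elem=8):
    """Upper bound, in GiB, on one process's share of an n x n matrix
    under a 2D block-cyclic distribution with square nb x nb blocks."""
    blocks = math.ceil(n / nb)               # blocks along one dimension
    rows = math.ceil(blocks / nprow) * nb    # most rows any process can hold
    cols = math.ceil(blocks / npcol) * nb    # most columns any process can hold
    return rows * cols * bytes_per_elem / 2**30


n = 40000  # matrix order from the question
for nprocs in (16, 64, 256):          # assumed candidate MPI process counts
    nprow, npcol = process_grid(nprocs)
    for nb in (32, 64, 128):          # NB values from the suggested range
        gib = local_gib(n, nb, nprow, npcol)
        print(f"{nprocs:4d} procs  grid {nprow}x{npcol}  NB={nb:3d}  "
              f"local matrix up to {gib:.2f} GiB")
```
&lt;/PRE&gt;

&lt;P&gt;The script prints one line per combination of process count and NB. It only sizes the distributed matrix itself; eigensolver workspace and eigenvector storage come on top of that, and each combination still has to be timed on the actual cluster.&lt;/P&gt;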

&lt;P&gt;Regards,&lt;/P&gt;

&lt;P&gt;Konstantin&lt;/P&gt;</description>
      <pubDate>Thu, 08 Sep 2016 02:42:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-choose-the-block-size-and-process-number/m-p/1063889#M21832</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2016-09-08T02:42:07Z</dc:date>
    </item>
  </channel>
</rss>

