<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to run HPL on one node with 1 process on 1 core? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1357603#M32706</link>
    <description>&lt;P&gt;Hello all!&lt;/P&gt;
&lt;P&gt;My testing platform is a single node with two sockets. Each socket has an&amp;nbsp;Intel Xeon CPU E5-2697 v2 @ 2.70GHz which has 12 cores and 24 threads. Below is my software environment.&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="屏幕截图 2022-02-05 092323.png" style="width: 786px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/26314i2803A88FF85804E4/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="屏幕截图 2022-02-05 092323.png" alt="屏幕截图 2022-02-05 092323.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I know that one MPI process per socket is recommended, i.e. P x Q = 2. But I want to try and see the performance of 1 process on 1 core, which means P x Q = 24. Below is the HPL test setup on my server.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="屏幕截图 2022-02-05 092606.png" style="width: 999px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/26315i3EE006E999FCB67F/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="屏幕截图 2022-02-05 092606.png" alt="屏幕截图 2022-02-05 092606.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I usually run "runme_intel64_dynamic" with MPI_PROC_NUM = 2 and MPI_PER_NODE = 2. I want to know how to set environment variables to ensure 1 process on 1 core.&lt;/P&gt;
&lt;P&gt;By the way, I find that when running HPL, only physical cores are used. Does that mean I don't need to care about HT? And will&amp;nbsp;&lt;SPAN&gt;MKL HPL itself spawn threads to the max 24 cores?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 05 Feb 2022 01:34:53 GMT</pubDate>
    <dc:creator>PRE_MITER</dc:creator>
    <dc:date>2022-02-05T01:34:53Z</dc:date>
    <item>
      <title>How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1357603#M32706</link>
      <description>&lt;P&gt;Hello all!&lt;/P&gt;
&lt;P&gt;My testing platform is a single node with two sockets. Each socket has an&amp;nbsp;Intel Xeon CPU E5-2697 v2 @ 2.70GHz which has 12 cores and 24 threads. Below is my software environment.&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="屏幕截图 2022-02-05 092323.png" style="width: 786px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/26314i2803A88FF85804E4/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="屏幕截图 2022-02-05 092323.png" alt="屏幕截图 2022-02-05 092323.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I know that one MPI process per socket is recommended, i.e. P x Q = 2. But I want to try and see the performance of 1 process on 1 core, which means P x Q = 24. Below is the HPL test setup on my server.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="屏幕截图 2022-02-05 092606.png" style="width: 999px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/26315i3EE006E999FCB67F/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="屏幕截图 2022-02-05 092606.png" alt="屏幕截图 2022-02-05 092606.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I usually run "runme_intel64_dynamic" with MPI_PROC_NUM = 2 and MPI_PER_NODE = 2. I want to know how to set environment variables to ensure 1 process on 1 core.&lt;/P&gt;
&lt;P&gt;By the way, I find that when running HPL, only physical cores are used. Does that mean I don't need to care about HT? And will&amp;nbsp;&lt;SPAN&gt;MKL HPL itself spawn threads to the max 24 cores?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 05 Feb 2022 01:34:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1357603#M32706</guid>
      <dc:creator>PRE_MITER</dc:creator>
      <dc:date>2022-02-05T01:34:53Z</dc:date>
    </item>
    <item>
      <title>Re: How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1358028#M32710</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please try the command below to run the HPL benchmark on 1 core.&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;HPL_HOST_CORE=0 ./runme_intel64_dynamic&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For more information, refer to the link below:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/intel-oneapi-math-kernel-library-benchmarks/intel-distribution-for-linpack-benchmark-1/environment-variables.html" target="_blank" rel="noopener"&gt;https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/intel-oneapi-math-kernel-library-benchmarks/intel-distribution-for-linpack-benchmark-1/environment-variables.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since you want to run the Intel® Distribution for LINPACK* Benchmark binary "runme_intel64_dynamic" using 1 process on 1 core, could you please explain the use case or intention behind it? That will help us understand your scenario and provide better support.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;gt;"And will MKL HPL itself spawn threads to the max 24 cores?"&lt;/P&gt;
&lt;P&gt;Could you please elaborate on the above statement?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards&lt;/P&gt;
&lt;P&gt;Hemanth.&lt;/P&gt;
</description>
      <pubDate>Mon, 07 Feb 2022 12:49:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1358028#M32710</guid>
      <dc:creator>HemanthCH_Intel</dc:creator>
      <dc:date>2022-02-07T12:49:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1358392#M32714</link>
      <description>&lt;P&gt;Thanks for the response! Maybe I haven't expressed it clear what I mean. Sorry about that.&lt;/P&gt;
&lt;P&gt;I want to compare the performance of HPL between the following two running ways. One is to run 2 MPI processes, i.e. one process on one socket. The other is to run 24 MPI processes, i.e. one process on one core. Though I know the later way is not recommended, I just want to know how much performance loss there will be.&lt;/P&gt;
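&lt;P&gt;For reference, the two configurations differ in HPL.dat only in the process-grid lines. A minimal sketch, assuming the standard HPL.dat layout and that MPI_PROC_NUM and MPI_PER_NODE in the run script are both set to 24 to match; the 4 x 6 grid is just one illustrative factorization of 24, not a recommendation:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;1            # of process grids (P x Q)
4            Ps
6            Qs&lt;/LI-CODE&gt;
&lt;P&gt;For the 2-process case the same lines would read Ps = 1 and Qs = 2, so that P x Q matches the number of MPI processes.&lt;/P&gt;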
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for "&lt;SPAN&gt;&amp;gt;&amp;gt;And will MKL HPL itself spawn threads to the max 24 cores?", I found this statement in another topic, "Need Help w/ MKL HPL scores and HPL.dat" (link:&amp;nbsp;&lt;A href="https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Need-Help-w-MKL-HPL-scores-and-HPL-dat/m-p/1093958#M23432%3Fwapkw=HPL" target="_blank"&gt;Need Help w/ MKL HPL scores and HPL.dat - Intel Communities&lt;/A&gt;). I have also attached the relevant screenshot below:&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="屏幕截图 2022-02-08 155142_LI.jpg" style="width: 999px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/26402i161761DBDE6D1475/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="屏幕截图 2022-02-08 155142_LI.jpg" alt="屏幕截图 2022-02-08 155142_LI.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Regarding this question, I want to know how to check and change the number of HPL threads in each process. Or will MKL HPL itself spawn as many threads as possible?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thanks,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt; John&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Feb 2022 08:06:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1358392#M32714</guid>
      <dc:creator>PRE_MITER</dc:creator>
      <dc:date>2022-02-08T08:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1360061#M32740</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;gt;"I want to know how to set environment variables to ensure 1 process on 1 core."&lt;/P&gt;
&lt;P&gt;If you want to run 1 process on 1 core, could you please follow the steps below?&lt;/P&gt;
&lt;P&gt;1) Create a file called sample with the following contents:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;#!/bin/bash
export HPL_HOST_CORE=${PMI_RANK}
./runme_intel64_dynamic&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2) Add the executable permission to that file:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;chmod +x sample&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3) Run the command below:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;mpirun -n &amp;lt;number of processes&amp;gt; ./sample&lt;/LI-CODE&gt;
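&lt;P&gt;The wrapper above gives each rank the core whose ID equals its rank. A minimal dry-run sketch of that mapping, which prints the assignment without launching HPL; the helper name core_for_rank and the wrap-around at 24 cores are illustrative assumptions, not part of the Intel package:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;#!/bin/bash
# Illustrative helper (hypothetical, not shipped with the benchmark):
# map an MPI rank to a core ID, wrapping around at the core count.
core_for_rank() {
    local rank="$1" cores="$2"
    echo $(( rank % cores ))
}

# Dry run: show the value HPL_HOST_CORE would get for a few ranks
# on a node that is assumed to have 24 cores.
for rank in 0 1 23; do
    echo "rank ${rank} uses HPL_HOST_CORE=$(core_for_rank "${rank}" 24)"
done&lt;/LI-CODE&gt;
&lt;P&gt;Logical CPU numbering varies between systems, so it is worth checking with lscpu that IDs 0 to 23 are distinct physical cores before relying on this mapping.&lt;/P&gt;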
&lt;P&gt;&amp;gt;&amp;gt;"&lt;SPAN&gt;Does that mean I don't need to care about HT"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;In multi-processor systems, best performance will be obtained with the Intel® Hyper-Threading Technology turned off, which ensures that the operating system assigns threads to physical processors only. For more information, refer to the link below:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/intel-oneapi-math-kernel-library-benchmarks/intel-optimized-linpack-benchmark-for-linux/limits-of-the-intel-optimized-linpack-benchmark.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/intel-oneapi-math-kernel-library-benchmarks/intel-optimized-linpack-benchmark-for-linux/limits-of-the-intel-optimized-linpack-benchmark.html&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;gt;"Or will MKL HPL itself spawn as many threads as possible?"&lt;BR /&gt;MKL HPL spawns multiple threads to exploit multi-core and many-core processors, and it does not require running 1 MPI process per core.&lt;/P&gt;
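&lt;P&gt;To check how many threads a running process actually has, one option on Linux is to read the Threads field from /proc. A minimal sketch, assuming a Linux system with /proc mounted; the helper name thread_count is hypothetical:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;#!/bin/bash
# Print the number of threads of the process with the given PID,
# taken from the "Threads:" line of /proc/PID/status (Linux only).
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Example: report the thread count of the current shell.
echo "PID $$ has $(thread_count $$) thread(s)"&lt;/LI-CODE&gt;
&lt;P&gt;For an HPL run, the PIDs could be taken from something like pgrep -f xhpl, assuming the benchmark binary's name contains xhpl.&lt;/P&gt;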
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Hemanth&lt;/P&gt;
</description>
      <pubDate>Sat, 05 Mar 2022 12:01:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1360061#M32740</guid>
      <dc:creator>HemanthCH_Intel</dc:creator>
      <dc:date>2022-03-05T12:01:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1361114#M32751</link>
      <description>&lt;P&gt;Thanks a lot for the response!&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2022 05:51:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1361114#M32751</guid>
      <dc:creator>PRE_MITER</dc:creator>
      <dc:date>2022-02-17T05:51:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to run HPL on one node with 1 process on 1 core?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1361157#M32752</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Hemanth&lt;/P&gt;
</description>
      <pubDate>Sat, 05 Mar 2022 13:13:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-run-HPL-on-one-node-with-1-process-on-1-core/m-p/1361157#M32752</guid>
      <dc:creator>HemanthCH_Intel</dc:creator>
      <dc:date>2022-03-05T13:13:15Z</dc:date>
    </item>
  </channel>
</rss>

