<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Measure performance MPI + openMP in Analyzers</title>
    <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1287810#M20630</link>
    <description>&lt;P&gt;I have a program written with MPI and OpenMP, and I want to measure the performance of its MPI interfaces.&lt;/P&gt;
&lt;P&gt;I tried the VTune hotspots analysis to profile it:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;export OMP_NUM_THREAD=6

cat &amp;gt;vtune.conf &amp;lt;&amp;lt;EOF
0-34     ./app
35       amplxe-cl -collect hotspots -no-follow-child  -trace-mpi -r result -- ./app
EOF

srun -N 6 -n 36 --multi-prog vtune.conf&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the profiling results, the hotspot functions are:&lt;/P&gt;
&lt;P&gt;opal_timer_base_get_usec_sys_timer&lt;/P&gt;
&lt;P&gt;func@xxxx&lt;/P&gt;
&lt;P&gt;_pthread&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Those results are not what I intended. I want to see results for the MPI interfaces. What should I tweak to get the right hotspots? Should I use a different tool?&lt;/P&gt;</description>
    <pubDate>Mon, 07 Jun 2021 19:28:00 GMT</pubDate>
    <dc:creator>i9</dc:creator>
    <dc:date>2021-06-07T19:28:00Z</dc:date>
    <item>
      <title>Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1287810#M20630</link>
      <description>&lt;P&gt;I have a program written with MPI and OpenMP, and I want to measure the performance of its MPI interfaces.&lt;/P&gt;
&lt;P&gt;I tried the VTune hotspots analysis to profile it:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;export OMP_NUM_THREAD=6

cat &amp;gt;vtune.conf &amp;lt;&amp;lt;EOF
0-34     ./app
35       amplxe-cl -collect hotspots -no-follow-child  -trace-mpi -r result -- ./app
EOF

srun -N 6 -n 36 --multi-prog vtune.conf&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the profiling results, the hotspot functions are:&lt;/P&gt;
&lt;P&gt;opal_timer_base_get_usec_sys_timer&lt;/P&gt;
&lt;P&gt;func@xxxx&lt;/P&gt;
&lt;P&gt;_pthread&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Those results are not what I intended. I want to see results for the MPI interfaces. What should I tweak to get the right hotspots? Should I use a different tool?&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jun 2021 19:28:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1287810#M20630</guid>
      <dc:creator>i9</dc:creator>
      <dc:date>2021-06-07T19:28:00Z</dc:date>
    </item>
    <item>
      <title>Re:Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1288080#M20635</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;Hi,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;Thank you for posting in the Intel Forums. You could make the following corrections to your commands:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px; font-family: Consolas, &amp;quot;Bitstream Vera Sans Mono&amp;quot;, &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;export OMP_NUM_THREADS=12&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px; font-family: Consolas, &amp;quot;Bitstream Vera Sans Mono&amp;quot;, &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;mpirun -n 16 -ppn 4 -l vtune -collect hotspots -k sampling-mode=hw -trace-mpi -result-dir &amp;lt;result directory path&amp;gt; -- ./app&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;If an MPI application is launched on multiple nodes, VTune Profiler creates a number of result directories per compute node in the current directory, encapsulating the data for all the ranks running on a node in the same directory.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;Please refer to the link below to learn how to use both tools, Intel Advisor and Intel VTune Profiler, with MPI &lt;/SPAN&gt;&lt;SPAN style="font-size: 14px; font-family: intel-clear, tahoma, Helvetica, helvetica, Arial, sans-serif;"&gt;to collect performance data at the node and core level:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;&lt;A href="https://software.intel.com/content/www/us/en/develop/articles/using-intel-advisor-and-vtune-amplifier-with-mpi.html" target="_blank"&gt;https://software.intel.com/content/www/us/en/develop/articles/using-intel-advisor-and-vtune-amplifier-with-mpi.html&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;A separate tool, the Intel® Trace Analyzer and Collector, records the details of the communication patterns and communication costs of an MPI application. The information provided by VTune™ Amplifier and Intel® Advisor is focused on core and node performance, and complements the specific MPI communication details provided by the &lt;/SPAN&gt;&lt;A href="https://software.intel.com/en-us/intel-trace-analyzer" rel="noopener noreferrer" target="_blank" style="font-size: 14px;"&gt;Intel® Trace Analyzer and Collector&lt;/A&gt;&lt;SPAN style="font-size: 14px;"&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Please check this and let us know if it works.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Alekhya&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 08 Jun 2021 12:22:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1288080#M20635</guid>
      <dc:creator>AlekhyaV_Intel</dc:creator>
      <dc:date>2021-06-08T12:22:15Z</dc:date>
    </item>
    <item>
      <title>Re: Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1288083#M20636</link>
      <description>&lt;P&gt;Thank you very much for your reply.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The version of VTune is `&lt;SPAN&gt;Intel(R) VTune(TM) Amplifier 2018 Update 4 (build 574913) Command Line Tool`, so I will use amplxe-cl instead.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let me give more details about this experiment. Six MPI processes are launched on each node, and on the last node one MPI process is used to perform parallel I/O. I want to focus on the performance of this parallel I/O process. Is it necessary to profile all nodes?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best regards&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jun 2021 12:41:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1288083#M20636</guid>
      <dc:creator>i9</dc:creator>
      <dc:date>2021-06-08T12:41:04Z</dc:date>
    </item>
    <item>
      <title>Re:Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1289180#M20658</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;Hi,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;A profiling run in VTune can cover all the MPI processes on a node, or a single MPI process on each node. You could use selective profiling to reduce the size of the results collected by VTune Profiler. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;Here is a command to profile a single MPI process after building the application. Of the 64 processes, 16 are allocated per node. Using the "-gtool" option, you can launch tools such as Intel® VTune Profiler, Intel Advisor, and GNU Debugger (GDB) for the specified processes through the mpiexec.hydra and mpirun commands, and profile only the required process.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;mpirun -n 64 -ppn 16 -gtool "vtune -collect &amp;lt;analysis-type&amp;gt; -r &amp;lt;result-dir&amp;gt; :0-15" ./a.out "vtune -collect &amp;lt;analysis-type&amp;gt; -r &amp;lt;result-dir&amp;gt; :5" ./a.out "vtune -collect &amp;lt;analysis-type&amp;gt; -r &amp;lt;result-dir&amp;gt; :0-15" ./a.out -gtool "vtune -collect &amp;lt;analysis-type&amp;gt; -r &amp;lt;result-dir&amp;gt; :7" ./a.out&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Please try this and let us know if it works.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Alekhya&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 11 Jun 2021 19:53:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1289180#M20658</guid>
      <dc:creator>AlekhyaV_Intel</dc:creator>
      <dc:date>2021-06-11T19:53:05Z</dc:date>
    </item>
    <item>
      <title>Re:Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1291591#M20776</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Is your issue resolved? Could you give us an update?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Alekhya&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 21 Jun 2021 08:29:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1291591#M20776</guid>
      <dc:creator>AlekhyaV_Intel</dc:creator>
      <dc:date>2021-06-21T08:29:53Z</dc:date>
    </item>
    <item>
      <title>Re:Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1293912#M20834</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We assume that your issue is resolved. If you need any further assistance, please post a new question as this thread will no longer be monitored.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Alekhya&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 28 Jun 2021 12:20:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1293912#M20834</guid>
      <dc:creator>AlekhyaV_Intel</dc:creator>
      <dc:date>2021-06-28T12:20:37Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Measure performance MPI + openMP</title>
      <link>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1293940#M20836</link>
      <description>&lt;P&gt;Hello,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you need exact timing for MPI functions, it makes sense to use tools based on MPI instrumentation, such as APS (&lt;VTune_install_dir/bin64/&gt;aps) or ITAC. VTune's sampling approach usually does not work well for this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards, Dmitry&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jun 2021 13:17:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Measure-performance-MPI-openMP/m-p/1293940#M20836</guid>
      <dc:creator>Dmitry_P_Intel1</dc:creator>
      <dc:date>2021-06-28T13:17:47Z</dc:date>
    </item>
  </channel>
</rss>

