Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

compute - communication breakup for multinode MPMD jobs

psing51
New Contributor I

Hi,
I have an MPMD job (mpirun -np 200 ./app1 : -np 120 ./app2 : -np 80 ./app3) and I plan to compare the MPI behaviour of the job over a 10G Ethernet network and an InfiniBand network. Here are a few queries:

1) APS is one tool that comes to my mind which is capable of generating an MPI summary without recompiling the executables (with -g). Is this statement correct?

2) I also plan to collect the data via the APS tool as:
mpirun -np 200 aps ./app1 : -np 120 aps ./app2 : -np 80 aps ./app3
Is this the correct way to use the APS tool for MPMD runs?

3) The application takes ~4 hours (without APS). Will the runtime increase due to overhead from APS? Also, is there a way to customize the data collection time and the size of the collected data? For example: profile the entire duration of the run, profile only the first hour of the run, or profile until the profiling data reaches 100 GB (due to a disk space limitation).

 

4) Unrelated to this post: I have uploaded a few files for this query: https://community.intel.com/t5/Intel-oneAPI-HPC-Toolkit/intel-mpi-error-line-1334-cma-read-nbytes-size/td-p/1329220
Is it possible to reactivate that thread, or do I need to create a new post on the same topic?

8 Replies
SantoshY_Intel
Moderator

Hi,

 

Thank you for posting in Intel Communities.

 

>>"APS is one tool which comes to my mind which is capable of generating summary of MPI without needing recompilation of executables (with -g). is this statement correct?"

Yes, we need not recompile the executables with "-g" option in order to use aps tool.

For more information, refer to the link below:

https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-application-performance-snapshot/top.html#top_GUID-60971A4A-D94C-47F1-8E20-B8D51AE8E0A7

 

>>"Is this correct way to use aps tool for MPMD runs?"

Yes, it is the correct way to use the APS tool for MPMD job runs.
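For completeness, here is a minimal sketch of the full workflow. It assumes the oneAPI environment script is at its default location and that the run produces a result directory with the default aps_result_<date> naming; please verify the actual folder name your run creates:

source /opt/intel/oneapi/setvars.sh

# Same MPMD launch as yours, with every section wrapped by aps:
mpirun -np 200 aps ./app1 : -np 120 aps ./app2 : -np 80 aps ./app3

# The run leaves an aps_result_<date> directory in the working directory;
# generate the summary from it afterwards (substitute your folder name):
aps --report=aps_result_20220301

Because all three executables belong to the same MPI job, APS should aggregate their statistics into a single result directory and a single summary.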

 

>>"will the runtime increase due to overhead by aps ?"

Yes, we can expect an increase in runtime as the APS tool takes time to collect the data.

 

>>" is there a way to customize the data collection time and file size of the collection?"

We can control the amount of collected data which enables you to reduce profiling overhead and focus on relevant application sections.

Please refer to the link below for more information:

https://www.intel.com/content/www/us/en/develop/documentation/application-snapshot-user-guide/top/analyzing-applications/controlling-amount-of-collected-data.html
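In addition to the environment variables described on that page, one common way to limit collection to a region or time window is the standard MPI_Pcontrol call: MPI_Pcontrol(0) pauses statistics collection and MPI_Pcontrol(1) resumes it. Whether your APS version honours MPI_Pcontrol should be checked against the linked guide; the call itself is standard MPI, so profilers that do not support it simply ignore it. A minimal sketch:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Pcontrol(0);   /* pause collection during setup we do not want to profile */
    /* ... warm-up / initialization communication ... */

    MPI_Pcontrol(1);   /* resume collection for the region of interest */
    /* ... main compute/communicate loop ... */

    MPI_Pcontrol(0);   /* stop collecting before teardown */

    MPI_Finalize();
    return 0;
}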

 

>>"Is it possible to activate this thread or i need to create new post on same topic?"

As the thread was closed, it will no longer be monitored by Intel. For further investigation, please post a new question referring to the URL of the old query.

 

Thanks & Regards,

Santosh
psing51
New Contributor I

Thank you for the reply. Here are a few more queries:
a) Do we need sudo permissions/privileges to profile a job using Intel APS/VTune?
b) Do we need to load the SEP driver on all the compute nodes (insmod-sep -r) for multi-node profiling to work correctly?
c) We have an InfiniBand interconnect on the cluster. Is there a way to see the aggregated MPI throughput (data transfer rate) and the per-node throughput using the APS tool?


SantoshY_Intel
Moderator

Hi,


To answer your queries, we need more information from your side, so could you please provide the following details?

  1. Operating system being used.
  2. The version of Intel oneAPI HPC Toolkit & Intel oneAPI Base Toolkit.


Thanks & Regards,

Santosh


psing51
New Contributor I

1. RHEL 7.9
2. inteloneapi/2021.3.0
I was able to work out the answers to my previous queries on my own.

 


I have both IB and Ethernet (1 Gig) interfaces available on our cluster, and while trying the MPMD profiling, here are the overheads I noticed:

1 node: with profiling ~5 hours (50% of elapsed time in MPI); without profiling ~5 hours.

2 nodes: with profiling ~10 hours (85% of elapsed time in MPI); without profiling ~3 hours.

For 2 nodes, is a slowdown of this magnitude expected? If this is expected with APS + Ethernet, could you please share some recommendations for reducing the APS overheads?

SantoshY_Intel
Moderator

Hi,

 

Thanks for reporting this to us.

 

We have reported this issue to the concerned development team, and they are looking into it.

 

Thanks & Regards,

Santosh
SantoshY_Intel
Moderator

Hi,

 

>>" Is the slowdown of this magnitude expected ? If this is expected with aps + ethernet then could you please share some recommendations with which the aps overheads can be reduced"

To debug and investigate more on your issue, could you please provide us with the sample reproducer codes for app1, app2 & app3 applications that you are using for launching the MPMD job in the below command? 

mpirun -np 200 aps ./app1 : -np 120 aps ./app2 : -np 80 aps ./app3
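As a side note, if the colon-separated command line becomes unwieldy, Intel MPI's mpirun also accepts a -configfile option, where each line of the file holds one section of the MPMD command (the file name below is our own choice):

# mpmd.cfg
-np 200 aps ./app1
-np 120 aps ./app2
-np 80 aps ./app3

mpirun -configfile mpmd.cfg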

 

Thanks & Regards,

Santosh
SantoshY_Intel
Moderator

Hi,


We haven't heard back from you. For further investigation of your issue, could you please provide us with the sample reproducer codes for app1, app2 & app3 applications that you are using for launching the MPMD job?


Thanks & Regards,

Santosh


SantoshY_Intel
Moderator

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards,

Santosh

