Compute-communication breakdown for multinode MPMD jobs

psing51 · 12-28-2021

Hi,
I have an MPMD job (mpirun -np 200 ./app1 : -np 120 ./app2 : -np 80 ./app3) and I plan to compare the MPI behaviour of the job over a 10G Ethernet network and an InfiniBand network. Here are a few queries:

1)
APS is one tool that comes to mind that can generate an MPI summary without requiring the executables to be recompiled (with -g). Is this statement correct?

2)
Also, I plan to collect the data with the aps tool as follows:
mpirun -np 200 aps ./app1 : -np 120 aps ./app2 : -np 80 aps ./app3
Is this the correct way to use the aps tool for MPMD runs?

3) The application takes ~4 hours without APS. Will the runtime increase due to APS overhead?
Also, is there a way to limit the data collection time and the size of the collected data? For example:
profile the entire run, profile only the first hour of the run, or profile until the profiling data reaches 100G (due to disk space limitations).

4) Unrelated to this post: I have uploaded a few files for this query: https://community.intel.com/t5/Intel-oneAPI-HPC-Toolkit/intel-mpi-error-line-1334-cma-read-nbytes-size/td-p/1329220
Is it possible to reactivate that thread, or do I need to create a new post on the same topic?

SantoshY_Intel · 12-29-2021

Hi,

Thank you for posting in Intel Communities.

>>"APS is one tool which comes to my mind which is capable of generating summary of MPI without needing recompilation of executables (with -g). is this statement correct?"

Yes, you do not need to recompile the executables with the "-g" option in order to use the APS tool.

For more information, refer to the link below:

https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-application-performance-snapshot/top.html#top_GUID-60971A4A-D94C-47F1-8E20-B8D51AE8E0A7

>>"Is this correct way to use aps tool for MPMD runs?"

Yes, it is the correct way to use the APS tool for MPMD job runs.
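
As an illustration of the launch line discussed above, here is a minimal sketch that assembles the MPMD command with each executable wrapped by aps. The rank counts and binary names are the ones quoted in this thread; the `RUN` switch and the `segments` array are hypothetical helpers added for the example, and the script only prints the command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Build the MPMD launch line with every executable wrapped by aps.
# Rank counts and binary names are taken from the thread; adjust as needed.
segments=("200 ./app1" "120 ./app2" "80 ./app3")

cmd=(mpirun)
for i in "${!segments[@]}"; do
  read -r np exe <<< "${segments[$i]}"
  # Keep ":" as its own space-delimited argument between segments.
  (( i > 0 )) && cmd+=(":")
  cmd+=(-np "$np" aps "$exe")
done

# Dry run by default: print the command. Set RUN=1 to actually launch it.
echo "${cmd[*]}"
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
fi
```

Printing the command first makes it easy to check the segment separators before committing to a multi-hour run.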

>>"will the runtime increase due to overhead by aps ?"

Yes, you can expect some increase in runtime, because the APS tool takes time to collect the data.

>>" is there a way to customize the data collection time and file size of the collection?"

You can control the amount of collected data, which lets you reduce profiling overhead and focus on the relevant application sections.

Please refer to the link below for more information:

https://www.intel.com/content/www/us/en/develop/documentation/application-snapshot-user-guide/top/analyzing-applications/controlling-amount-of-collected-data.html
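
One way this might look for the MPMD job in this thread is sketched below. It assumes that the installed aps version accepts `--collection-mode=mpi` to restrict collection to MPI metrics, as described in the guide linked above; please confirm with `aps --help` before relying on it. The `RUN` switch is a hypothetical helper for the example, and the script only prints the command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Assumption: this aps version supports --collection-mode=mpi (MPI metrics only).
# Verify with `aps --help` on your installation.
aps_cmd=(aps --collection-mode=mpi)

cmd=(mpirun
     -np 200 "${aps_cmd[@]}" ./app1 :
     -np 120 "${aps_cmd[@]}" ./app2 :
     -np 80  "${aps_cmd[@]}" ./app3)

# Dry run by default: print the command. Set RUN=1 to actually launch it.
echo "${cmd[*]}"
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
fi
```

How much this reduces overhead or result size for a ~4 hour run has not been measured here.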

>>"Is it possible to activate this thread or i need to create new post on same topic?"

As the thread was closed, it will no longer be monitored by Intel. For further investigation, please post a new question referring to the URL of the old query.

Thanks & Regards,

Santosh

psing51 · 01-04-2022

Thank you for the reply. Here are a few more queries:
a) Do we need sudo permissions/privileges to profile a job using Intel APS/VTune?
b) Do we need to load the sep driver on all the compute nodes (insmod-sep -r) for multinode profiling to work correctly?
c) We have an InfiniBand interconnect on the cluster. Is there a way to see the aggregated MPI throughput (data transfer rate) and the per-node throughput using the aps tool?

SantoshY_Intel · 01-06-2022

Hi,

To answer your queries, we need more information from your side. So, could you please provide the below details?

Operating system being used.
The version of Intel oneAPI HPC Toolkit & Intel oneAPI Base Toolkit.

Thanks & Regards,

Santosh

psing51 · 01-11-2022

1. RHEL 7.9
2. inteloneapi/2021.3.0
I was able to find the answers to my previous queries.

I have both IB and Ethernet (1 gig) interfaces available on our cluster. While trying the MPMD profiling, here is the overhead I noticed:
1 node: with profiling ~5 hours (50% of elapsed time in MPI), without profiling ~5 hours

2 nodes: with profiling ~10 hours (85% of elapsed time in MPI), without profiling ~3 hours

For 2 nodes, is a slowdown of this magnitude expected? If it is expected with APS + Ethernet, could you please share some recommendations for reducing the APS overhead?
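
For reference, the figures above can be split into MPI and non-MPI hours. The following is a minimal sketch using awk that takes the approximate values quoted above at face value; the `breakdown` function is a hypothetical helper written for this example.

```shell
#!/usr/bin/env bash
# Split a profiled run into MPI and non-MPI hours and compare with the unprofiled run.
# Arguments: label, profiled hours, MPI share of elapsed time (%), unprofiled hours.
breakdown() {
  awk -v label="$1" -v prof="$2" -v pct="$3" -v base="$4" 'BEGIN {
    mpi = prof * pct / 100      # hours spent inside MPI calls
    other = prof - mpi          # hours spent outside MPI
    printf "%s: MPI %.1f h, non-MPI %.1f h, slowdown %.2fx\n", label, mpi, other, prof / base
  }'
}

breakdown "1 node" 5 50 5
breakdown "2 nodes" 10 85 3
```

This prints 2.5 h MPI and 2.5 h non-MPI for one node (slowdown 1.00x), and 8.5 h MPI and 1.5 h non-MPI for two nodes (slowdown 3.33x). The extra time in the 2-node profiled run is therefore almost entirely inside MPI calls, although these numbers alone cannot separate profiling overhead from network cost.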

SantoshY_Intel · 01-19-2022

Hi,

Thanks for reporting this to us.

We have forwarded this issue to the relevant development team, and they are looking into it.

Thanks & Regards,

Santosh

SantoshY_Intel · 01-27-2022

Hi,

>>" Is the slowdown of this magnitude expected ? If this is expected with aps + ethernet then could you please share some recommendations with which the aps overheads can be reduced"

To debug and investigate your issue further, could you please provide us with sample reproducer code for the app1, app2 and app3 applications that you launch as an MPMD job with the command below?

mpirun -np 200 aps ./app1 : -np 120 aps ./app2 : -np 80 aps ./app3

Thanks & Regards,

Santosh

SantoshY_Intel · 02-02-2022

Hi,

We haven't heard back from you. For further investigation of your issue, could you please provide us with sample reproducer code for the app1, app2 and app3 applications that you launch as an MPMD job?

Thanks & Regards,

Santosh

SantoshY_Intel · 03-02-2022

Hi,

We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.

Thanks & Regards,

Santosh