I am using the Trace Analyzer for an MPI job running on 4 nodes (80 physical cores total, 80 MPI ranks).
When I run 'mpirun -trace ...' the job takes roughly 10 times longer than the same job running without tracing because processes are being suspended when trace routines dump data from memory to disk.
My goal is to identify the relative share of time spent in the most active MPI routines, but in this situation, how much can one trust the timing statistics displayed by the Trace Analyzer? Isn't it possible that a process starts an MPI_SEND while the target process is suspended by the tracer, so that the transfer does not start until the tracer on the target process has finished its dump, and all this artificial wait time is then attributed by the Trace Analyzer to the MPI_SEND call?
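To make the concern concrete, here is a minimal two-rank sketch (hypothetical C code, not taken from my application): assuming the message is large enough that the library uses the rendezvous protocol, the blocking send does not complete until the receiver posts its matching receive, so a delay on the receiving side, such as a tracer flushing its buffers to disk, shows up directly in the sender's measured MPI_Send time.

```c
/* Hypothetical sketch: rank 1 delays posting its receive, rank 0 times its
 * blocking MPI_Send.  For messages above the eager threshold the send
 * typically blocks until the receive is posted, so the receiver's delay
 * (here a sleep standing in for a tracer flush) appears in the send time. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    enum { N = 131072 };          /* 1 MiB of doubles: usually rendezvous */
    static double buf[N];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double t0 = MPI_Wtime();
        MPI_Send(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        printf("MPI_Send took %.3f s (includes receiver's delay)\n",
               MPI_Wtime() - t0);
    } else if (rank == 1) {
        sleep(2);                 /* stand-in for a tracer dumping to disk */
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

Run with two ranks, this typically reports a send time close to the receiver's 2-second delay, whereas a small (eager) message would return almost immediately.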
Thank you for your attention.
Hey Michael,
If all you're looking for is the most active MPI routines and the time spent in them, I would recommend using the MPI Performance Snapshot, which is much more lightweight. Here's a quick getting-started guide. Just make sure you're using the latest versions of our tools.
Once the info is collected, running with "mps -f <stats_file> <app_stats_file>" will give you all the "hotspot" MPI routines in the code.
Once you've narrowed down where the problem areas are, I would use the Intel Trace Analyzer and Collector with some filtering applied to reduce the amount of data collected and any potential performance impacts.
Regards,
~Gergana
Thank you, Gergana, I'll try MPS.
Hi Gergana,
I am trying to use MPS, but I have run into a problem. Here is how I run the job:
source /opt/lic/intel16U2/vtune_amplifier_xe/amplxe-vars.sh
source /opt/lic/intel16U2/itac/9.1.2.024/intel64/bin/mpsvars.sh
source /opt/lic/intel16U2/parallel_studio_xe_2016.2.062/bin/psxevars.sh
source /opt/lic/intel16U2/bin/compilervars.sh intel64
source /opt/lic/intel16U2/compilers_and_libraries_2016.2.181/linux/mpi/intel64/bin/mpivars.sh
mpirun -mps \
    -genv I_MPI_STATS 10 \
    -genv I_MPI_STATS_SCOPE all \
    -genv I_MPI_STATS_FILE mpi_stats_file_2_1.txt \
    -genv I_MPI_PIN 1 \
    -genv I_MPI_PIN_PROCS allcores:map=scatter \
    -genv I_MPI_PIN_MODE mpd \
    -genv I_MPI_DEBUG 0 \
    -np 80 \
    --rsh=ssh \
    --file=mpd.hosts \
    /home/ashv/draco_revs/draco_wiindigoo_blizzard_maxMagXBTeffect/draco_cl5_16
When the program tries to execute MPI_Allgatherv(), I receive multiple error messages like the following:
Fatal error in PMPI_Allgatherv: Invalid datatype, error stack:
PMPI_Allgatherv(1483): MPI_Allgatherv(sbuf=0x1040d44, scount=0, INVALID DATATYPE, rbuf=0x7fff8e4b2010, rcounts=0x7fff8e5163b0, displs=0x7fff8e5164f0, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD) failed
PMPI_Allgatherv(1393): Invalid datatype
When I run the same job without the -mps parameter, the job runs without errors and produces correct results.
The program is compiled in exactly the same environment shown above. All compilers and tools are the latest available versions.
Thank you for your attention.
Hey Michael,
It seems like MPS doesn't like the datatype you're passing through your MPI_Allgatherv(). Can you tell me what it is? I don't know of a particular problem with datatypes but I'll check with the team.
How large is this application? Might be good to have a local reproducer so I can run it on one of our machines. Would it be possible for you to send the code over? Or a small sample that exhibits the problem?
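In case it helps, here is the kind of small standalone sample I have in mind (purely a hypothetical C sketch; your code looks like Fortran given MPI_DOUBLE_PRECISION in the error stack): every rank contributes zero elements on the send side, and the send datatype is one an argument-checking layer could legitimately flag, for example a derived type that was never committed. Whether this particular pattern reproduces your error is only a guess, based on the invalid send datatype appearing together with scount=0 in the message.

```c
/* Hypothetical reproducer sketch (a guess, not the user's code): the send
 * count is zero, but the send datatype is a derived type that was never
 * committed.  Some code paths skip checking the send type when the count is
 * zero, while an argument-checking layer may still reject it. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double sendbuf[1], recvbuf[1];            /* nothing is actually gathered */
    int *recvcounts = calloc(size, sizeof *recvcounts);
    int *displs     = calloc(size, sizeof *displs);

    MPI_Datatype uncommitted;                 /* created but never committed */
    MPI_Type_contiguous(1, MPI_DOUBLE, &uncommitted);

    MPI_Allgatherv(sendbuf, 0, uncommitted,   /* scount = 0, suspect type */
                   recvbuf, recvcounts, displs, MPI_DOUBLE,
                   MPI_COMM_WORLD);

    MPI_Type_free(&uncommitted);
    free(recvcounts);
    free(displs);
    MPI_Finalize();
    return 0;
}
```

Something along these lines, trimmed down from your actual call site, would let us check quickly whether the statistics collection is the trigger.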
Thanks,
~Gergana