I want to generate a timing log for MPI functions. I am using "export I_MPI_STATS=20" to enable logging, but this captures timing information on only one node. How can I get similar information from all the nodes used in the run?
When I use this feature, I get a single file (presumably created on the node running the master MPI task) with information for all MPI ranks in the job. You will need external information (such as hostfiles) to know which node each MPI rank actually ran on.
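In case it helps, here is a rough sketch of how the rank-to-node mapping could be reconstructed from a machine file. It rests on assumptions that are not confirmed anywhere in this thread: that the file lists one host per line, optionally as "host:slots", and that ranks are placed in file order, filling each host's slots before moving to the next and wrapping around if there are more ranks than slots. The host names and the helper names `parse_machinefile` and `rank_to_node` are made up for illustration. Check the mapping against your actual launcher settings before trusting it.

```python
def parse_machinefile(text):
    """Parse lines of the form 'host' or 'host:slots' into (host, slots) pairs.

    Assumed format (not taken from this thread): one entry per line,
    '#' starts a comment, and a missing slot count means 1.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        host, sep, slots = line.partition(":")
        entries.append((host.strip(), int(slots) if sep else 1))
    return entries


def rank_to_node(entries, nranks):
    """Map rank -> host, assuming ranks fill hosts in file order and wrap around."""
    if sum(slots for _, slots in entries) <= 0:
        raise ValueError("machine file provides no slots")
    mapping = {}
    rank = 0
    while rank < nranks:
        for host, slots in entries:
            for _ in range(slots):
                if rank >= nranks:
                    break
                mapping[rank] = host
                rank += 1
    return mapping


if __name__ == "__main__":
    # Hypothetical machine file contents; replace with your own file's text.
    example = "node01:2\nnode02:2\n"
    for rank, host in sorted(rank_to_node(parse_machinefile(example), 4).items()):
        print(f"rank {rank} -> {host}")
```

If you would rather not depend on placement assumptions, each rank can also report its own host from inside the program by calling MPI_Get_processor_name and printing the result next to its rank.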
It seems that whenever stats are enabled, the MPI_Finalize call fails, and because of that the log information is not generated for all nodes. Any idea why MPI_Finalize fails only when stats are enabled?
MPI is mostly a mystery to me.... Maybe an Intel person can comment....
What sort of information are you getting in the stats.txt file? The output I see has one section per MPI rank, but each section includes information about the communication of that rank with all the other ranks, so I don't see how the output can even get started unless all the ranks have communicated their stats....