These options are used to get a code that my user has running on a Cray XT. I know there
are issues with the code, but if equivalent options exist in Intel MPI, they would get us
going faster. I have looked through the Intel MPI 4.0 Beta manual and didn't
find any equivalents, but I am asking in case I missed the option
or there are some undocumented options that could help:
1) MPICH_MAX_SHORT_MSG_SIZE - Determines when to use the Eager vs. Rendezvous protocol;
does this just sound like I_MPI_RDMA_EAGER_THRESHOLD?
2) MPICH_PTL_UNEX_EVENTS - Define the total number of unexpected events allowed
3) MPICH_UNEX_BUFFER_SIZE - Set the buffer size for the unexpected events
4) MPICH_ENV_DISPLAY - Display all settings used by the MPI
Options 2 and 3 are the most important, as I believe the code path that hangs
when run with 2000+ cores is sending too many unexpected events.
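For context, here is a rough, hypothetical sketch (not the user's actual code) of the kind of pattern I believe is involved - many ranks sending to one rank before it has posted matching receives, so at scale most of the messages land in the library's unexpected-message buffers:

```c
/* Hypothetical sketch only: every rank sends a result to rank 0, but rank 0
 * posts its receives one at a time afterwards.  With 2000+ ranks, many of
 * these messages arrive before a matching receive exists and must be held
 * as "unexpected" by the MPI library. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double chunk[1024];                      /* 8 KB payload per rank */

    if (rank != 0) {
        /* Senders fire immediately; small messages complete eagerly even
         * though rank 0 may not have posted a receive for them yet. */
        MPI_Send(chunk, 1024, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD);
    } else {
        /* Rank 0 drains the messages in rank order; anything that arrived
         * before its MPI_Recv was posted sits in the unexpected buffers. */
        for (int src = 1; src < nprocs; src++)
            MPI_Recv(chunk, 1024, MPI_DOUBLE, src, 99,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```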
Thanks,
Craig
Hi Craig,
"does this just sound like I_MPI_RDMA_EAGER_THRESHOLD?"
Indeed, you're exactly right. I_MPI_EAGER_THRESHOLD
(without the RDMA in the name) sets the cutoff value between using the eager or rendezvous protocols for all devices. The default is ~260KB: any message shorter than or equal to that will use eager, any message larger will use rendezvous.
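Just to illustrate the difference (a generic sketch, not tied to your application, and assuming the default threshold):

```c
/* Minimal sketch of eager vs. rendezvous behaviour under the default
 * I_MPI_EAGER_THRESHOLD (~260KB).  Message sizes are illustrative only. */
#include <mpi.h>

#define SMALL_COUNT   1024        /* 8 KB of doubles: below the threshold -> eager      */
#define LARGE_COUNT 131072        /* 1 MB of doubles: above the threshold -> rendezvous */

static double small_buf[SMALL_COUNT];
static double large_buf[LARGE_COUNT];

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Eager: the payload is pushed to rank 1 right away and may sit in
         * its unexpected-message buffer until the receive is posted. */
        MPI_Send(small_buf, SMALL_COUNT, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

        /* Rendezvous: the bulk data only moves once rank 1 has posted the
         * matching receive, so the send may wait until then. */
        MPI_Send(large_buf, LARGE_COUNT, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(small_buf, SMALL_COUNT, MPI_DOUBLE, 0, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(large_buf, LARGE_COUNT, MPI_DOUBLE, 0, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```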
"3) MPICH_UNEX_BUFFER_SIZE - Set the buffer size for the unexpected events"
You can take a look at the description for the I_MPI_DAPL_CONN_EVD_SIZE
env variable. This is used to define the size of the event queue. The default value is [2*(#procs) + 32] (e.g. 4032 entries at 2000 ranks), but you can go ahead and try increasing it. Reading the description for MPICH_PTL_UNEX_EVENTS, it seems to be the most closely related.
Alternatively, when you say "unexpected events", it makes me think you have some issue scaling out using OFED - is that correct? In this case, simply updating to the latest DAPL drivers should help. What OFED and/or DAPL versions do you have installed?
If you've upgraded to OFED 1.4.1, it contains the new Socket CM (scm) provider instead of the existing cma one (e.g. OpenIB-cma). The new one handles scalability a lot better, so you can give that a try. Again, this is just speculation on my part, since I'm not sure what errors you're really getting.
Set I_MPI_DEBUG=1001 - this is the highest value possible for the library. At the startup of the job, the Intel MPI Library will print out all the env variables it's using.
I hope this helps. Let us know how it goes or if you have further questions (or if I misunderstood any of your questions).
Regards,
~Gergana
Sorry for the delay in responding. This information should help.
As for #3, it isn't a scalability issue. The MPI code does not post its receives before the sends, and, as I have been told by the experts, this causes MPI to store the messages in the unexpected buffers. If there are too many, things go bad. The user solved the problem on the Cray by increasing that buffer (#3) to 126MB and the number of events (#2) to 81920.
Really, the code is broken, but there are other problems in the code that these settings have solved. I will pass the information on for testing.
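For what it's worth, the eventual code-level fix would presumably look something like the sketch below - pre-posting the receives so incoming messages have somewhere to land instead of going into the unexpected buffers. This is just a rough illustration, not the user's code:

```c
/* Rough sketch of the eventual fix: rank 0 pre-posts non-blocking receives
 * for every sender before the sends arrive, so each message matches a
 * posted receive instead of filling the unexpected-message buffers. */
#include <mpi.h>
#include <stdlib.h>

void gather_results(double *all, double *mine, int count,
                    int rank, int nprocs)
{
    if (rank == 0) {
        MPI_Request *reqs = malloc((nprocs - 1) * sizeof(MPI_Request));

        /* Post all receives first ... */
        for (int src = 1; src < nprocs; src++)
            MPI_Irecv(all + (size_t)src * count, count, MPI_DOUBLE,
                      src, 99, MPI_COMM_WORLD, &reqs[src - 1]);

        /* ... then wait for them; nothing needs to be buffered as
         * "unexpected" while the other ranks send. */
        MPI_Waitall(nprocs - 1, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    } else {
        MPI_Send(mine, count, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD);
    }
}
```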
Craig