Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Intel MPI for Phi tuning tips?

Ron_Green
Moderator

Does setting

    I_MPI_MIC=enable

change other MPI environment variables, particularly any that would tune MPI for the MIC system architecture?  

As a side question, has anyone written a Tuning and Tweaking guide for IMPI for Phi?  For example, what I_MPI variables could one use to help tune an app targeting 480 ranks across 8 Phis?

Thanks

Ron

jimdempseyatthecove
Honored Contributor III

Are the MPI processes single threaded?

If so, note that 480 ranks over 8 Phis works out to 60 ranks per card, which is one thread per core on a 60-core card (assuming you pin one process per core).

With Xeon Phi, a second hardware thread running within a core is almost free. Therefore, consider using 960 ranks over 8 Phis (two per core), and also try 1440 (three per core).

While I haven't done this, you might try I_MPI_PIN_DOMAIN=core or I_MPI_PIN_DOMAIN=480:scatter.

I am not sure how this applies when you have multiple MICs (that is, whether each MIC also counts as a separate node).

What you asked for (480 ranks) is one process per core across 8 MICs.
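To make the rank arithmetic above concrete, here is a minimal sketch. It assumes 60 usable cores per coprocessor (which is what 480 ranks over 8 cards at one rank per core implies) and 4 hardware threads per core; neither figure is stated in the thread, so adjust both for your cards. The helper name `ranks_per_core` is hypothetical.

```python
# Assumptions (not stated in the thread): 60 usable cores per coprocessor
# and 4 hardware threads per core. Adjust for the actual card model.
CORES_PER_PHI = 60
HW_THREADS_PER_CORE = 4


def ranks_per_core(total_ranks, num_phis, cores_per_phi=CORES_PER_PHI):
    """Return how many single-threaded ranks share each core, assuming an even spread."""
    total_cores = num_phis * cores_per_phi
    if total_ranks % total_cores != 0:
        raise ValueError("ranks do not divide evenly across the cores")
    return total_ranks // total_cores


# The three rank counts mentioned in the thread, spread over 8 cards.
for total in (480, 960, 1440):
    per_core = ranks_per_core(total, num_phis=8)
    print(f"{total} ranks on 8 Phis: {total // 8} per card, "
          f"{per_core} per core (of {HW_THREADS_PER_CORE} hardware threads)")
```

This only checks how the ranks divide up; whether two or three ranks per core actually runs faster has to be measured.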

Tim Prince may be able to answer this better.

Jim Dempsey

TimP
Honored Contributor III

Intel MPI defaults are generally effective. A second or third thread per core is well worthwhile if it is done with OpenMP or an equivalent inside each MPI process, with attention to cache locality.
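As a rough sketch of the hybrid layout described here (one MPI rank per core, with OpenMP threads inside each rank), the following helper builds a set of environment settings. It reuses `I_MPI_MIC=enable` and `I_MPI_PIN_DOMAIN=core` from earlier in the thread and adds `OMP_NUM_THREADS`. The function name `hybrid_settings`, the 60-core count, and the limit of 4 hardware threads per core are assumptions, and the values are a starting point to experiment with, not tested tuning advice.

```python
# Assumption: 4 hardware threads per core; not stated in the thread.
HW_THREADS_PER_CORE = 4


def hybrid_settings(num_phis, cores_per_phi, threads_per_rank):
    """Return (total_ranks, env) for one MPI rank per core with OpenMP threads inside."""
    if not 1 <= threads_per_rank <= HW_THREADS_PER_CORE:
        raise ValueError("threads_per_rank must fit within one core")
    total_ranks = num_phis * cores_per_phi  # one rank per core
    env = {
        "I_MPI_MIC": "enable",         # from the original question
        "I_MPI_PIN_DOMAIN": "core",    # pin each rank to one core, as suggested above
        "OMP_NUM_THREADS": str(threads_per_rank),
    }
    return total_ranks, env


# Example: 8 cards, 60 cores each (assumed), 3 OpenMP threads per rank.
ranks, env = hybrid_settings(num_phis=8, cores_per_phi=60, threads_per_rank=3)
print(f"total ranks: {ranks}")
for name, value in env.items():
    print(f"export {name}={value}")
```

With these example numbers the rank count stays at 480, but each core runs three threads instead of one.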

Ron_Green
Moderator

Good point. On Phi you need two or more threads per core to get peak performance out of a core. If the MPI processes are single-threaded, performance on Phi may be disappointing.

ron
