Intel® MPI Library

Slowdown in p2p MPI calls

unrue
Beginner

Dear MPI users,

I'm using Intel MPI cs-2011. My code (OpenMP + MPI) performs a number of MPI send and receive calls at each time step, after a kernel computation. The MPI calls are used for ghost-cell exchange (a few kilobytes).
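To give a rough idea of the structure, the exchange looks something like this (a minimal sketch, not my real code; the 1-D decomposition, buffer names, and halo size are only illustrative):

```c
/* Minimal sketch of the per-time-step ghost-cell exchange
 * (illustrative only; a non-periodic 1-D decomposition with
 * left/right neighbours and a few-kB halo is assumed). */
#include <mpi.h>

#define HALO_COUNT 512   /* 512 doubles = 4 kB per halo message */
#define NSTEPS     100

int main(int argc, char **argv)
{
    int rank, size;
    double send_left[HALO_COUNT], send_right[HALO_COUNT];
    double recv_left[HALO_COUNT], recv_right[HALO_COUNT];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Boundary ranks exchange with MPI_PROC_NULL (a no-op). */
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    for (int i = 0; i < HALO_COUNT; i++)
        send_left[i] = send_right[i] = (double)rank;

    for (int step = 0; step < NSTEPS; step++) {
        /* ... OpenMP kernel computation on the local domain ... */

        /* Blocking point-to-point halo exchange. This ordering
         * relies on the eager protocol for small messages;
         * MPI_Sendrecv would be the deadlock-safe equivalent. */
        MPI_Send(send_right, HALO_COUNT, MPI_DOUBLE, right, 0, MPI_COMM_WORLD);
        MPI_Recv(recv_left,  HALO_COUNT, MPI_DOUBLE, left,  0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(send_left,  HALO_COUNT, MPI_DOUBLE, left,  1, MPI_COMM_WORLD);
        MPI_Recv(recv_right, HALO_COUNT, MPI_DOUBLE, right, 1, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```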

I've noticed a significant slowdown during the computation. I suspect the problem is some low-level MPI setting, because the problem disappears when I use OpenMPI. I'm using InfiniBand and 12 cores on 1 node, so only intranode communication is involved.

I disabled shared memory inside the node, used DAPL for intranode communication, decreased I_MPI_INTRANODE_THRESHOLD, and set I_MPI_DAPL_TRANSLATION_CACHE to 0, without any noticeable improvement.

Do you have any idea why the p2p calls slow down during the run?

Thanks a lot. 

Gergana_S_Intel
Employee

Hey unrue,

Thanks for posting.  Unfortunately, performance issues are notoriously hard to track down.  Based on your original post, can I assume you're mostly using p2p messages in your application?

The first thing I would suggest is grabbing the latest Intel® MPI Library and giving that a try.  We have Intel MPI Library 4.1 Update 1 that was released not too long ago.  The beauty of it is you can install the new runtimes and re-run your application without having to recompile (Intel MPI 4.0 - which is probably what you have - is binary compatible with Intel MPI 4.1 - the latest).

You can grab the latest package from the Intel® Registration Center - just login using your e-mail address and the password you created when you originally downloaded the library.

Ideally, we'd like to have a reproducer that we can test out locally.  If that's not possible, can you provide some debug output (I_MPI_DEBUG=5) when running your application, as well as the full set of env variables you're setting?  What's the nature of the performance slowdown between Intel MPI and OpenMPI - 10% or 90% slowdown?

Looking forward to hearing back soon.

Regards,
~Gergana

unrue
Beginner

Hi Gergana,

Thanks for your reply. Mostly I use MPI_Send and MPI_Recv. I've attached a profile of one of the 12 processes. The performance slowdown compared to OpenMPI is about 50%.

unrue
Beginner

Gergana Slavova (Intel) wrote:

The first thing I would suggest is grabbing the latest Intel® MPI Library and giving that a try. [...]


Dear Gergana,

I tried the latest Intel MPI version as you suggested, but the problem still remains... :(
