
Hybrid MPI/OpenMP showing poor performance when using all cores on a socket vs. N-1 cores per socket

Gandharv_K_
Beginner

Hi,

I'm running a hybrid MPI/OpenMP application on Intel® Xeon® E5-2600 v3 (Haswell) series processors and I notice a roughly 40% drop in performance when using all cores (N) on a socket vs. N-1 cores per socket. The behavior is more pronounced at higher total core counts (>= 160 cores). The cluster has 2 CPUs (sockets) per node. As a test case I ran a similar job on an Intel® Xeon® E5-2600 (Sandy Bridge) cluster and I don't see this behavior there; performance with N and N-1 cores per socket is comparable.

I'm using Intel MPI 5.0. Both clusters use the same InfiniBand hardware. Profiling revealed that MPI time accounts for the performance drop. The application performs MPI communication only outside OpenMP regions. Any help would be appreciated.
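For context, here is a stripped-down sketch of the structure I described above (this is not my actual code, just an illustration): compute happens in OpenMP parallel regions, MPI collectives are called only between them, and MPI_Wtime timers are one simple way to split compute time from MPI time when comparing the N and N-1 cores-per-socket runs.

```c
/* Minimal sketch (not the real application) of the hybrid structure:
 * OpenMP threads do the compute, MPI is called only outside the
 * parallel regions. MPI_Wtime timers separate compute time from
 * MPI time so the two runs can be compared. Build: mpicc -fopenmp */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int n = 1 << 20;
    static double a[1 << 20];           /* zero-initialized work array  */
    double t_comp = 0.0, t_mpi = 0.0, global = 0.0;

    for (int step = 0; step < 100; ++step) {
        double t0 = MPI_Wtime();

        /* Compute phase: all OpenMP threads busy, no MPI calls here. */
        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            a[i] = a[i] * 0.5 + (double)(i + step);

        double t1 = MPI_Wtime();

        /* Communication phase: MPI only outside the OpenMP region. */
        double local = a[rank % n];
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        double t2 = MPI_Wtime();
        t_comp += t1 - t0;
        t_mpi  += t2 - t1;
    }

    if (rank == 0)
        printf("compute time %.3f s, MPI time %.3f s (checksum %g)\n",
               t_comp, t_mpi, global);

    MPI_Finalize();
    return 0;
}
```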

Thanks,

GK

Barry_T_Intel
Employee

You should probably ask that question in the HPC forum: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology. They deal with MPI issues.

I'd start with the Intel MPI Library Troubleshooting Guide: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/563559.

Dmitry_P_Intel1
Employee

Hello,

We have posted the new Application Performance Snapshot 2018 Beta, which can give some insight into MPI, OpenMP, and memory access efficiency.

It should be very easy to deploy: just unzip and run.

It would be interesting to see the reports from these two runs side by side; the statistics might give a clue about the reason for the performance drop.

Thanks & Regards, Dmitry

 

Gandharv_K_
Beginner

Thanks, Dmitry. I will report back.

- GK
