Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

What could cause an MPI program to run slower on a multi-node cluster?

ArthurRatz
Novice
388 Views

Hello everyone,

I've got a question about MPI program performance. I've developed an MPI program that processes a large amount of data (about 10^9 elements), and I've noticed that the more processes I create with the mpiexec utility, the longer the program takes to run. What could be the cause of this? When I run the program on a single compute node, it finishes faster than when I run it on two compute nodes. Please help.

Regards, Arthur.

5 Replies
ArthurRatz
Novice

I normally use the following platform: 2 x Intel Core i7-4970 4.00 GHz, 32 GB RAM, network: 1 Gbps.
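As a rough back-of-the-envelope check rather than a diagnosis, the time needed just to move data over a 1 Gbps link can be estimated. The sketch below assumes 4-byte elements (the element type is not stated in this thread), that half of the 10^9 elements must be sent to the second node, and that the link sustains its full nominal rate with no protocol overhead. The function name `transfer_seconds` is hypothetical.

```cpp
#include <cassert>

// Estimate the seconds needed to push `elements` items of `bytes_per_element`
// bytes each over a link with a nominal rate of `link_gbps` gigabits per second.
// Assumption: the link runs at full nominal speed with no protocol overhead.
double transfer_seconds(double elements, double bytes_per_element, double link_gbps) {
    assert(link_gbps > 0.0);
    const double bits = elements * bytes_per_element * 8.0;  // payload size in bits
    return bits / (link_gbps * 1e9);                         // 1 Gbps = 1e9 bits per second
}
```

Under these assumptions, `transfer_seconds(5e8, 4, 1.0)` gives 16 seconds in one direction, and the sorted data would need a similar time to come back. If the single-node run completes in a comparable time, network transfer alone could plausibly outweigh the gain from the extra node.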

ArthurRatz
Novice

My MPI program sorts a huge array of 10^9 elements by splitting it into chunks, with each process created by the mpiexec utility sorting one chunk. The sorting itself is done with the tbb::parallel_sort routine, which is part of Threading Building Blocks (TBB).
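The split-sort-merge scheme described above can be sketched in plain C++ as follows. This is a minimal single-process illustration, not the actual program: each chunk stands in for the part one MPI process would own, `std::sort` is swapped in for `tbb::parallel_sort`, and the function name `chunked_sort` is hypothetical. In a real MPI run, the chunks would also have to be sent to and collected from the other processes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sort `data` by splitting it into `num_chunks` contiguous chunks, sorting each
// chunk on its own, then merging the sorted chunks back together.
// Assumption: each chunk plays the role of one MPI process's share of the array.
std::vector<int> chunked_sort(std::vector<int> data, std::size_t num_chunks) {
    assert(num_chunks > 0);
    const std::size_t n = data.size();

    // Chunk i covers the half-open range [bounds[i], bounds[i + 1]).
    std::vector<std::size_t> bounds(num_chunks + 1);
    for (std::size_t i = 0; i <= num_chunks; ++i) {
        bounds[i] = n * i / num_chunks;
    }

    // Local step: sort every chunk independently (std::sort stands in for
    // tbb::parallel_sort here).
    for (std::size_t i = 0; i < num_chunks; ++i) {
        std::sort(data.begin() + bounds[i], data.begin() + bounds[i + 1]);
    }

    // Merge step: fold each sorted chunk into the already merged prefix.
    for (std::size_t i = 1; i < num_chunks; ++i) {
        std::inplace_merge(data.begin(),
                           data.begin() + bounds[i],
                           data.begin() + bounds[i + 1]);
    }
    return data;
}
```

For example, `chunked_sort({5, 3, 9, 1, 7, 2}, 3)` sorts three chunks of two elements each and then merges them. The merge step is sequential in this sketch, which is one place where adding more chunks does not reduce the total work.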

Barry_T_Intel
Employee

You should probably ask that question in the HPC forum: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology . They deal with MPI issues.

I'd start with the Intel MPI Library Troubleshooting Guide: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/563559 .

There's also a TBB forum: https://software.intel.com/en-us/forums/intel-threading-building-blocks

ArthurRatz
Novice

If anyone who is going to answer my question needs an executable to test on their side, I'm ready to provide one.
