MPI looks faster than OpenMP on Xeon Phi

Michał_N_
Beginner

Hi,

I ran into some strange behavior on Xeon Phi. When I timed my program (a simple Mandelbrot image generator), the MPI version was about 10 times faster than the OpenMP version. That seemed very strange to me, so I tested a simple program containing only one empty loop: it took about 700 seconds with OpenMP versus 900 seconds with MPI. But when I added any math calculation inside the loop, OpenMP was only as fast as the MPI implementation, or even slower. So right now I don't know what to think, because OpenMP should be faster than MPI, or at least as fast.

I don't think it's a transfer problem, because the transfer is also included in the times.

Does anybody have an idea what is wrong here?

3 Replies
Bernard
Valued Contributor I

Did you perform any VTune analysis? It is very hard to properly assess what really happened during your tests.

Dmitry_P_Intel1
Employee

Hello,

Yes, we need more details: is it a native run or offload (you mentioned transfers; are they data transfers?), and how many ranks/OpenMP threads do you use? Did you run it under VTune, or is this info from the original runs?

Thanks & Regards, Dmitry

Michał_N_
Beginner

Hi,

Yes, I performed a VTune analysis, and it is an offload run. I found that it was caused by several problems:

1. KMP_BLOCKTIME variable.

2. MPIRUN

The solution to my problem was to set the KMP_BLOCKTIME variable to 0 and to invoke the MPI program via mpiexec.hydra instead of mpirun, because mpirun had a problem finding mpiexec.hydra.
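
For anyone who wants to reproduce the fix, here is a minimal sketch of a launcher that applies both changes. It is an illustration under assumptions, not the exact commands from this thread: the binary name `./mandelbrot_mpi`, the rank count, and the helper `build_launch` are hypothetical, and it assumes `mpiexec.hydra` is on the PATH. KMP_BLOCKTIME is the Intel OpenMP runtime setting for how long (in milliseconds) a thread keeps waiting after a parallel region before going to sleep; setting it to 0 makes threads sleep immediately.

```python
import os
import shutil
import subprocess


def build_launch(app, ranks, blocktime="0"):
    """Return (command, env) for starting `app` through mpiexec.hydra.

    `app` and `ranks` are placeholders chosen by the caller; nothing here
    is taken from the original poster's setup.
    """
    # Copy the current environment and override only KMP_BLOCKTIME.
    env = dict(os.environ)
    env["KMP_BLOCKTIME"] = blocktime
    # Call mpiexec.hydra directly instead of going through mpirun.
    cmd = ["mpiexec.hydra", "-n", str(ranks), app]
    return cmd, env


if __name__ == "__main__":
    # Hypothetical binary name and rank count, for illustration only.
    cmd, env = build_launch("./mandelbrot_mpi", ranks=4)
    if shutil.which(cmd[0]) is None:
        print("mpiexec.hydra not found on PATH; would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, env=env, check=True)
```

The same effect can be had by exporting KMP_BLOCKTIME=0 in the shell before calling mpiexec.hydra; the wrapper just keeps both settings in one place.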

Thanks & Regards

Michał
