yi_w_1
Beginner

MPI within parallel_studio_xe_2016_update2 not working under certain conditions

Hi,


We just migrated from XE 2013 to XE 2016 Update 2. We use the compiler suite and the MPI library to build the MPI environment for ab initio software such as PWscf (Quantum ESPRESSO) and OpenMX.


Before migrating to the new compiler environment, we cleaned up the old compiler environment variables. We set mpiifort as the compiler and linker, and the compilation completes normally.


On one of our workstations, serial runs appear to be okay, but MPI runs behave very strangely, and MPI parallelization does not appear to work properly.


For example, "mpirun -np 1 pw.x < pw.in" runs properly; however, with "mpirun -np 2 pw.x < pw.in", pw.x responds as if it never saw the input and keeps waiting on standard input, even though the code itself reports that it is running in parallel ("Parallel version (MPI), running on     2 processors"). If we oversubscribe the same workstation (4 physical cores) with "mpirun -np 8 pw.x < pw.in", the code does read its input from pw.in and does not wait on standard input, but the run is extremely slow, as if it were not running at all.
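For what it's worth, a common way to sidestep stdin problems under MPI launchers is to avoid stdin entirely: pw.x accepts its input file via a command-line flag, and Intel MPI's Hydra launcher has an option to control which ranks receive stdin. This is a sketch, not a verified fix; check `mpirun -help` for the options your version actually supports:

```shell
# pw.x can read its input directly, bypassing stdin forwarding
# through the MPI launcher:
mpirun -np 2 pw.x -input pw.in > pw.out

# If stdin redirection is required, Intel MPI's launcher accepts
# -s to choose which ranks receive stdin (e.g. all of them):
mpirun -s all -np 2 pw.x < pw.in > pw.out
```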


The problem is not specific to PWscf: we see the same behavior with OpenMX. The compilation is fine and the serial version runs properly; only the MPI version is affected. All the codes run at an extremely slow speed, and after producing some initial output, stdout stops printing anything further.

 

8 Replies
James_T_Intel
Moderator

Hi Yi,

What happens if you run the test program provided with the Intel® MPI Library?  Does it exhibit similar behavior?

James.
Intel® Developer Support

yi_w_1
Beginner

Dear James,

Thank you for the reply.

After some unsuccessful attempts, we switched to OpenMPI, which appears to work fine.

Just now, I unset the OpenMPI environment variables and sourced mpivars.sh. I then went to INTEL_ROOT/compilers_and_libraries/linux/samples/en/mpi and compiled the samples with mpiicc, mpiicpc, and mpiifort accordingly.
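For reference, the sequence looked roughly like this (the install path is a placeholder for your system; the sample file names are the ones typically shipped with the library):

```shell
# Switch to the Intel MPI environment (path is a placeholder):
source /path/to/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh

# Build and run the bundled sample programs:
cd /path/to/intel/compilers_and_libraries/linux/samples/en/mpi
mpiicc   -o test_c test.c
mpiifort -o test_f test.f90
mpirun -np 4 ./test_c
```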

The compiled codes gave similar results: when I run mpirun -np N ./test, the output is as follows

Hello world: rank 0 of N running on simbox

Hello world: rank 1 of N running on simbox

Hello world: rank 2 of N running on simbox

...

Hello world: rank N-1 of N running on simbox

 

I guess that means the tests passed?

 

I know the information I provided is far from complete. How can I provide better information?

James_T_Intel
Moderator

That does indicate passing the basic tests.  Can you run your application with I_MPI_HYDRA_DEBUG=1 and I_MPI_DEBUG=5?  Redirect the output to a file and attach it here.
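For anyone following along, the requested run would look something like this (the pw.x invocation is shown only as an example):

```shell
# Launcher- and library-level diagnostics, captured in one file:
export I_MPI_HYDRA_DEBUG=1   # verbose Hydra process-manager trace
export I_MPI_DEBUG=5         # rank placement, pinning, fabric selection
mpirun -np 2 pw.x -input pw.in > debug.txt 2>&1
```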

yi_w_1
Beginner

James T. (Intel) wrote:

That does indicate passing the basic tests.  Can you run your application with I_MPI_HYDRA_DEBUG=1 and I_MPI_DEBUG=5?  Redirect the output to a file and attach it here.

Hi James,

I have run the PWscf code compiled by Intel MPI environment.

I used two kinds of commands

1. mpirun -np 2 pw.x -input pwscff.rx.in     (the result is in x.txt)

2. mpirun -np 2 pw.x < pwscff.rx.in          (the result is in y.txt)

Both should run on this single-node workstation, but currently they do not. I used Ctrl+C to kill the process once there was no more output.

The compilation is similar to the OpenMPI build, except that the code is linked against mkl_blacs_intelmpi_lp64 when using Intel MPI and against mkl_blacs_openmpi_lp64 when using OpenMPI. (Of course, the environment variables and PATH are set up accordingly; we use mpif90 for OpenMPI (wrapping icc and ifort) and mpiifort for Intel MPI.)
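In other words, the intended difference between the two builds is only the BLACS interface layer. The library names below come from the thread; everything else on the link line is a hypothetical sketch of a typical MKL/ScaLAPACK link, not our exact build:

```shell
# Intel MPI build:
mpiifort -o pw.x *.o -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_core

# OpenMPI build (wrappers around icc/ifort):
mpif90 -o pw.x *.o -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
```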


 

James_T_Intel
Moderator

Hmm, everything here looks fine.  Please send the ldd output for both the OpenMPI linked version and the Intel® MPI Library linked version.
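To narrow the ldd output to the relevant libraries, something like this works (the binary name is the one from the thread):

```shell
# Show which MPI and BLACS libraries each binary actually resolves:
ldd ./pw.x | grep -Ei 'mpi|blacs'
```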

yi_w_1
Beginner

Dear James,

I uploaded the ldd information. The ldd for the code compiled with OpenMPI is in lddx.txt, with IntelMPI in lddy.txt.

Thank you for continuing to put effort into this issue.
 

James_T_Intel
Moderator

We have a very similar issue reported by another customer; I'll go ahead and attach this thread to the same issue.

James_T_Intel
Moderator

This should be resolved in Intel® MPI Library 2017.
