Hi,
We just migrated from XE 2013 to XE 2016 Update 2. We use the compiler suite and the MPI library to build the MPI environment for ab initio software such as PWscf (Quantum ESPRESSO) and OpenMX.
Before migrating to the new compiler environment, we cleaned up the old compiler environment variables. We set mpiifort as the compiler and linker, and the compilation completed normally.
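For reference, the environment switch and build check look roughly like the following; the install prefix is an assumption rather than our exact path:
# Load the Intel compilers and the Intel MPI Library (install prefix is an assumption)
source /opt/intel/compilers_and_libraries_2016/linux/bin/compilervars.sh intel64
source /opt/intel/compilers_and_libraries_2016/linux/mpi/intel64/bin/mpivars.sh
# Confirm the wrappers now resolve to the new installation
which mpiifort
mpiifort -v
mpirun -V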
On one of our workstations, serial runs appear to be okay, but MPI runs behave very strangely and MPI parallelization does not seem to work properly.
For example, "mpirun -np 1 pw.x < pw.in" runs properly; however, with "mpirun -np 2 pw.x < pw.in", pw.x behaves as if it never saw the input and keeps waiting on standard input, even though the code itself reports that it knows it is running in parallel ("Parallel version (MPI), running on 2 processors"). If we oversubscribe the same workstation (4 physical cores) with "mpirun -np 8 pw.x < pw.in", the code does read the input from pw.in and no longer waits on standard input, but the run is extremely slow, as if it is not progressing at all.
The problem is not specific to PWscf; we see the same behavior with OpenMX: the compilation is fine and the serial version runs properly, just not the MPI version. All the codes run extremely slowly, and after producing some initial output, stdout stalls and nothing more is printed.
Hi Yi,
What happens if you run the test program provided with the Intel® MPI Library? Does it exhibit similar behavior?
James.
Intel® Developer Support
Dear James,
Thank you for the reply.
After some unsuccessful attempts, we are now using OpenMPI, which appears to work fine.
Just now, I cleared the OpenMPI environment variables and sourced mpivars.sh. I went to INTEL_ROOT/compilers_and_libraries/linux/samples/en/mpi and compiled the samples with mpiicc, mpiicpc, and mpiifort accordingly.
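Roughly, what I did was the following; the sample file names are whatever ships in that directory, so treat them as placeholders:
source /opt/intel/compilers_and_libraries_2016/linux/mpi/intel64/bin/mpivars.sh
cd $INTEL_ROOT/compilers_and_libraries/linux/samples/en/mpi
# Compile the C, C++ and Fortran samples with the Intel MPI wrappers
mpiicc   test.c   -o test_c
mpiicpc  test.cpp -o test_cpp
mpiifort test.f90 -o test_f90
# Run on the local workstation with a few ranks
mpirun -np 4 ./test_c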
The compiled codes all gave similar results: when I run mpirun -np N ./test, the output is as follows
Hello world: rank 0 of N running on simbox
Hello world: rank 1 of N running on simbox
Hello world: rank 2 of N running on simbox
...
Hello world: rank N-1 of N running on simbox
I guess that means the tests passed?
I know the information I have provided is far from complete. How can I provide better information?
That does indicate passing the basic tests. Can you run your application with I_MPI_HYDRA_DEBUG=1 and I_MPI_DEBUG=5? Redirect the output to a file and attach it here.
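For example, something along these lines (the input and output file names are just placeholders):
# Enable launcher (Hydra) and library debug output, then capture everything to a file
export I_MPI_HYDRA_DEBUG=1
export I_MPI_DEBUG=5
mpirun -np 2 pw.x -input pw.in > pw_debug.log 2>&1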
James T. (Intel) wrote:
That does indicate passing the basic tests. Can you run your application with I_MPI_HYDRA_DEBUG=1 and I_MPI_DEBUG=5? Redirect the output to a file and attach it here.
Hi James,
I have run the PWscf code compiled with the Intel MPI environment.
I used two kinds of commands:
1. mpirun -np 2 pw.x -input pwscff.rx.in (the result is in x.txt)
2. mpirun -np 2 pw.x < pwscff.rx.in (the result is in y.txt)
Both should run on a single-node workstation, but currently they do not. I used Ctrl+C to kill the process once I saw there was no more output.
The compilation is similar to the OpenMPI build, except that the code is linked against mkl_blacs_intelmpi_lp64 when using Intel MPI and against mkl_blacs_openmpi_lp64 when using OpenMPI. (Of course, the environment variables and PATH are set up accordingly; we use mpif90 for OpenMPI, built with icc and ifort, and mpiifort for Intel MPI.)
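To be concrete, the BLACS part of the link line differs roughly as follows; the variable name is just illustrative and the rest of the MKL link line is omitted:
# Intel MPI build (linked with mpiifort)
SCALAPACK_LIBS = -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64
# OpenMPI build (linked with mpif90 wrapping ifort)
SCALAPACK_LIBS = -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64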
Hmm, everything here looks fine. Please send the ldd output for both the OpenMPI-linked version and the Intel® MPI Library-linked version.
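For example (the binary paths below are placeholders):
# Capture the dynamic library resolution for each build
ldd /path/to/intelmpi_build/pw.x > ldd_intelmpi.txt
ldd /path/to/openmpi_build/pw.x  > ldd_openmpi.txt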
We have a very similar issue being reported by another customer; I'll go ahead and attach this to the same issue.
This should be resolved in Intel® MPI Library 2017.