Dear Colleagues,
Recently, I developed an MPI program that sorts an array of N = 10^6 data items of type __int64. The sorting work is shared among multiple (more than 10) processes. To share data between the processes, I created an MPI window using the MPI_Win_allocate_shared function. When sorting the array of N = 10^6 items with 10 or more processes, the program hangs (i.e., the sorting never finishes). It sorts correctly only when run with no more than 2 processes.
Can you help me figure out why this program cannot be executed with more than 2 processes (see attachment)?
I've compiled and run the program as follows:
mpiicpc -o sortmpi_shared.exe sortmpi_shared.cpp
mpiexec -np 10 sortmpi_shared.exe
Thanks a lot. Waiting for your reply.
Cheers, Arthur.
Hello Arthur,
Can you provide the source code? I found only executable and cfg files in your zip file.
Thanks,
Mark
This problem is already solved. Thank you.
Actually, I have another question: for some unknown reason, my MPI program hangs when its processes are launched on different nodes (hosts). In the program I use the MPI_Win_allocate_shared function to allocate shared memory through an RMA window, and I'm wondering what the possible cause might be. Do I actually need to implement intercommunicators for that purpose?
I'm sorry, but I can't provide any sources yet.
Waiting for your reply.
Cheers, Arthur.
You do not need to implement intercommunicators. This paper
http://goparallel.sourceforge.net/wp-content/uploads/2015/06/PUM21-2-An_Introduction_to_MPI-3.pdf
contains links to downloadable sources illustrating the MPI-3 shared memory programming model in a multi-node setting, e.g.:
http://tinyurl.com/MPI-SHM-example
Could you try to run this first example from the paper on your cluster (and share the results)?
Here is another quote from the paper that might help: "The function MPI_Comm_split_type enables programmers to determine the maximum groups of MPI ranks that allow such memory sharing. This function has a powerful capability to create “islands” of processes on each node that belong to the output communicator shmcomm." Do you use this function?
You'd also need to distinguish between ranks on the same node and ranks belonging to different nodes. As you can see, we used MPI_Group_translate_ranks for this purpose; a minimal sketch of the pattern follows below.
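Here is a minimal sketch of that MPI_Comm_split_type / MPI_Group_translate_ranks pattern, assuming a plain MPI_COMM_WORLD job; the variable names are mine, not taken from the paper's sources:

#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Split MPI_COMM_WORLD into per-node "islands" whose ranks can
    // share memory.
    MPI_Comm shmcomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &shmcomm);

    // Translate world ranks into shmcomm ranks; any world rank that
    // translates to MPI_UNDEFINED lives on a different node.
    MPI_Group world_group, shm_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Comm_group(shmcomm, &shm_group);

    std::vector<int> world_ranks(world_size), shm_ranks(world_size);
    for (int i = 0; i < world_size; ++i) world_ranks[i] = i;
    MPI_Group_translate_ranks(world_group, world_size, world_ranks.data(),
                              shm_group, shm_ranks.data());

    if (world_rank == 0)
        for (int i = 0; i < world_size; ++i)
            std::printf("world rank %d is %s\n", i,
                        shm_ranks[i] == MPI_UNDEFINED ? "remote" : "on my node");

    MPI_Group_free(&world_group);
    MPI_Group_free(&shm_group);
    MPI_Comm_free(&shmcomm);
    MPI_Finalize();
    return 0;
}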
Cheers,
Mark
Hello, Mark.
I've already tested the example you provided on my cluster. Here are the results:
E:\>mpiexec -n 4 -ppn 2 -hosts 2 192.168.0.100 1 192.168.0.150 1 1.exe
Fatal error in MPI_Win_lock_all: Invalid MPI_Win, error stack:
MPI_Win_lock_all(158): MPI_Win_lock_all(MPI_MODE_NOCHECK, win=0x0) failed
MPI_Win_lock_all(103): Invalid MPI_Win
Fatal error in MPI_Win_lock_all: Invalid MPI_Win, error stack:
MPI_Win_lock_all(158): MPI_Win_lock_all(MPI_MODE_NOCHECK, win=0x0) failed
MPI_Win_lock_all(103): Invalid MPI_Win
Fatal error in MPI_Win_lock_all: Invalid MPI_Win, error stack:
MPI_Win_lock_all(158): MPI_Win_lock_all(MPI_MODE_NOCHECK, win=0x5f) failed
MPI_Win_lock_all(103): Invalid MPI_Win
Fatal error in MPI_Win_lock_all: Invalid MPI_Win, error stack:
MPI_Win_lock_all(158): MPI_Win_lock_all(MPI_MODE_NOCHECK, win=0x98) failed
MPI_Win_lock_all(103): Invalid MPI_Win
E:\>mpiexec -n 4 1.exe
i'm rank 2 with 2 intranode partners, 1 (1), 3 (3)
load MPI/SHM values from neighbour: rank 1, numtasks 4 on COMP-PC.MYHOME.NET
load MPI/SHM values from neighbour: rank 3, numtasks 4 on COMP-PC.MYHOME.NET
i'm rank 3 with 2 intranode partners, 2 (2), 0 (0)
load MPI/SHM values from neighbour: rank 2, numtasks 4 on COMP-PC.MYHOME.NET
load MPI/SHM values from neighbour: rank 0, numtasks 4 on COMP-PC.MYHOME.NET
i'm rank 1 with 2 intranode partners, 0 (0), 2 (2)
load MPI/SHM values from neighbour: rank 0, numtasks 4 on COMP-PC.MYHOME.NET
load MPI/SHM values from neighbour: rank 2, numtasks 4 on COMP-PC.MYHOME.NET
i'm rank 0 with 2 intranode partners, 3 (3), 1 (1)
load MPI/SHM values from neighbour: rank 3, numtasks 4 on COMP-PC.MYHOME.NET
load MPI/SHM values from neighbour: rank 1, numtasks 4 on COMP-PC.MYHOME.NET
*BUT* I still can't figure out how this sample can be used to solve the problem I stated.
My goal is to solve it without using MPI_Send/MPI_Recv between processes on different nodes.
Normally I would need to use the MPI_Comm_split_type, MPI_Win_allocate_shared, and MPI_Win_shared_query functions.
In your recent post, you said that MPI_Comm_split_type has a powerful capability to create process islands on different nodes (hosts). Can you tell me, or provide a sample of, how to do that?
Thanks in advance. Waiting for your reply.
Cheers, Arthur.
I'd need to reproduce this error.
Quick comments regarding your questions.
MPI-3 SHM should not be confused with PGAS (with its global address space) or with one-sided/RMA, even though it relies on the MPI-3 RMA framework. The MPI-3 SHM programming model enables MPI ranks within a shared memory domain (typically processes on the same node) to allocate shared memory for direct load/store access. In this sense, it is exactly like the hybrid MPI + OpenMP (or threads) model. So, when you say that you do not want to use MPI_Send/MPI_Recv between the nodes, what mechanism/functions do you want to use instead?
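To make the load/store point concrete, here is a minimal sketch of the on-node pattern, assuming long long stands in for your __int64 and one element per rank; the sizes and the lock_all/sync/barrier epoch are illustrative choices, not taken from your program:

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    // Keep only the ranks that can actually share memory (same node).
    MPI_Comm shmcomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &shmcomm);

    int shm_rank, shm_size;
    MPI_Comm_rank(shmcomm, &shm_rank);
    MPI_Comm_size(shmcomm, &shm_size);

    // Each rank contributes one long long to the shared segment.
    long long* my_slot = nullptr;
    MPI_Win win;
    MPI_Win_allocate_shared(sizeof(long long), sizeof(long long),
                            MPI_INFO_NULL, shmcomm, &my_slot, &win);

    // Query rank 0's base address; the allocation is contiguous by
    // default, so base[i] is rank i's slot.
    MPI_Aint seg_size;
    int disp_unit;
    long long* base = nullptr;
    MPI_Win_shared_query(win, 0, &seg_size, &disp_unit, &base);

    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);
    my_slot[0] = 100 + shm_rank;   // direct store: no MPI call involved
    MPI_Win_sync(win);             // flush my store to memory
    MPI_Barrier(shmcomm);          // wait until every rank has written
    MPI_Win_sync(win);             // make the other ranks' stores visible

    // Direct load of a neighbour's slot, exactly like reading shared
    // memory written by another thread.
    long long neighbour = base[(shm_rank + 1) % shm_size];
    std::printf("rank %d sees %lld\n", shm_rank, neighbour);
    MPI_Win_unlock_all(win);

    MPI_Win_free(&win);
    MPI_Comm_free(&shmcomm);
    MPI_Finalize();
    return 0;
}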
The sample and paper (referenced in my previous post) already contain all the API functions you mentioned, including the recommended usage model for MPI_Comm_split_type. Figure 2 in the paper should hopefully help too. That said, please do not hesitate to ask additional questions.
Mark
Mark, thanks a lot for your answer. I really appreciate it.
And one more question: is it possible to implement a global address space shared between multiple nodes (hosts) using MPI, without using PGAS? Can you point me to a particular framework, such as MPI-3 RMA, that can be used for that purpose? Thanks in advance.
And the last question: how can PGAS be used along with the MPI library? Can you post an example, if possible?
And one more thing: recently I tried to allocate shared memory on multiple nodes through an RMA window, using the MPI_Win_create, MPI_Get, and MPI_Put functions, and it worked for me much as if I had used MPI_Send and MPI_Recv. Can you explain why it doesn't work when I use the MPI_Win_allocate_shared and MPI_Comm_split_type functions instead?
Yes, PGAS can be implemented using MPI-3 RMA; see, for example (and references therein):
DART: http://arxiv.org/pdf/1507.01773.pdf
OpenSHMEM: http://www.csm.ornl.gov/workshops/openshmem2013/documents/ImplementingOpenSHMEM%20UsingMPI-3.pdf
http://mug.mvapich.cse.ohio-state.edu/static/media/mug/presentations/2014/hammond.pdf
These two preprints from ANL are also excellent:
http://www.mcs.anl.gov/uploads/cels/papers/P4014-0113.pdf
http://www.mcs.anl.gov/papers/P4062-0413_1.pdf
Yes, PGAS can be used along with MPI; e.g., the MVAPICH team at OSU supports such a hybrid MPI/PGAS model through its MVAPICH2-X offering:
http://mvapich.cse.ohio-state.edu/
This is a good presentation from this group on the subject:
http://mvapich.cse.ohio-state.edu/static/media/talks/slide/osc_theater-PGAS.pdf
On your last question:
"I tried to allocate shared memory on multiple nodes through an RMA window using the MPI_Win_create, MPI_Get, and MPI_Put functions, and it worked much as if I had used MPI_Send and MPI_Recv. Can you explain why it doesn't work when I use the MPI_Win_allocate_shared and MPI_Comm_split_type functions instead?"
As I said above, the MPI-3 SHM model (using MPI_Win_allocate_shared, MPI_Comm_split_type, etc.) is closer to the hybrid MPI + OpenMP model than to RMA, even though it relies on RMA. If you look under the hood, MPI-3 SHM provides direct load/store memory access, exactly as with threads (and, by the way, with all the well-known pitfalls of threads, such as data races).
Citing http://www.mcs.anl.gov/~thakur/papers/shmem-win.pdf: while in the
"one-sided communication interface, the user allocates memory and then exposes it in a window. This model of window creation is not compatible with the inter-process shared-memory support provided by most operating systems",
in MPI-3 SHM, through the mechanism described in that paper, we end up with a truly shared memory environment, so that, for example,
"Load/store operations do not pass through the MPI library; and, as a result, MPI is unaware of which locations were accessed and whether data was updated".
Best,
Mark
Thanks for the reference links, Mark. I'm going to read through this documentation.
Can you give me an example of using OpenSHMEM along with the MPI library?
