My MPI program hangs when I launch its processes on different nodes (hosts). The program uses the MPI_Win_allocate_shared function to allocate shared memory through an RMA window. What could be causing this? Do I actually need to implement intercommunicators for this purpose? Here's the code:
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, proc_rank, MPI_INFO_NULL, &comm_sm);
MPI_Comm_rank(comm_sm, &rank_sm);
MPI_Comm_size(comm_sm, &numprocs_sm);

MPI_Info info_noncontig;
MPI_Info_create(&info_noncontig);
MPI_Info_set(info_noncontig, "alloc_shared_noncontig", "true");

int disp_size = sizeof(ullong);
MPI_Aint array_size = number_of_items * disp_size;
MPI_Win_allocate_shared(array_size, disp_size, info_noncontig, comm_sm, &array, &win_sm);
MPI_Win_shared_query(win_sm, 0, &array_size, &disp_size, &array);
MPI_Barrier(comm_sm);

ullong i_start = proc_rank * number_of_items / (ullong)numprocs;
ullong i_end = (proc_rank + 1) * number_of_items / (ullong)numprocs;

MPI_Win_lock_all(MPI_MODE_NOCHECK, win_sm);
if (proc_rank == 0)
{
    ullong value = number_of_items - 1;
    srand((unsigned)time(NULL) + proc_rank * numprocs + namelen);
    for (ullong index = 0; index < number_of_items; index++, value--)
        array[index] = (rand_mode == 1) ? rand() % rand_seed + 1 : value;
}
MPI_Barrier(comm_sm);

for (ullong index = i_start; index <= i_end; index++)
    fprintf(stdout, "%llu ", array[index]);
fprintf(stdout, "\n\n");
fflush(stdout);
MPI_Barrier(comm_sm);
Output:
[COMP-PC.MYHOME.NET@mpiexec] Process 0 of 2
71 81 12 56 66 49 70 39 100 90 27 57 46 66 6 13 39 20 70 4 6 13 16 5
56 60 90 44 97 5 87 51 44 12 7 54 70 5 29 65 95 69 70 44 45 38 87 1 9 80 54 78
67 77 68 13 16 78 79 40 98 50 74 6 52
[WIN-9MFH3O78GLQ.MYHOME.NET@mpiexec] Process 1 of 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As you can see, process 1 doesn't receive the array buffer address: it prints only zeros.
Hi,
As far as I can see, your code contains this condition:
if (proc_rank == 0)
It seems proc_rank is the rank number in MPI_COMM_WORLD, right? In that case only one rank in MPI_COMM_WORLD fills the array, and it fills the array on its own node only. But as I can see from the output, there are 2 different hosts, COMP-PC.MYHOME.NET and WIN-9MFH3O78GLQ.MYHOME.NET (BTW, which OS is used? Windows or Linux?), so the array on the host that does not run MPI_COMM_WORLD rank 0 will not be updated.
It seems you should change that condition to:
if (rank_sm == 0)
Thank you for the bug report.
--Sergey
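To make the reasoning in this reply concrete, here is a small plain-C model of the two conditions. It is only a sketch and does not call MPI: `host_of`, `node_local_rank` and `count_fillers` are hypothetical names introduced for illustration. The model assumes what the posted code does, namely that the split key passed to MPI_Comm_split_type is the world rank, so ranks inside each node-local communicator are ordered by world rank.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not an MPI call: host_of[r] is the node that world
   rank r runs on. Returns the rank that r would get in its node-local
   communicator when the split key is the world rank. */
static int node_local_rank(const int *host_of, int nprocs, int world_rank)
{
    assert(host_of != NULL && world_rank >= 0 && world_rank < nprocs);
    int local = 0;
    for (int r = 0; r < world_rank; r++)
        if (host_of[r] == host_of[world_rank])
            local++;   /* lower world ranks on the same node come first */
    return local;
}

/* Counts how many processes pass the fill condition.
   use_local != 0 models `rank_sm == 0`; use_local == 0 models `proc_rank == 0`. */
static int count_fillers(const int *host_of, int nprocs, int use_local)
{
    int count = 0;
    for (int r = 0; r < nprocs; r++) {
        int rank = use_local ? node_local_rank(host_of, nprocs, r) : r;
        if (rank == 0)
            count++;
    }
    return count;
}
```

With two processes on two different hosts, as in the output above, the `proc_rank == 0` condition selects a single process in the whole job, so the second host's window is never written. The `rank_sm == 0` condition selects one process per host, so each node-local window gets filled by a process that can actually reach it.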
Hello, Sergey. Thank you very much for your reply. I'm going to check this.
This question is outdated. I've already solved this problem.
One more question: is there any difference between contiguous and non-contiguous memory?
Hi,
Could you clarify your question?
In general: contiguous means the memory is all in one chunk, so from start to end there is nothing else in it. Non-contiguous is the opposite: the memory is fragmented into one or more sections that are allocated separately.
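For the shared window in this thread, the difference shows up in where each rank's segment starts. The plain-C sketch below models that; it does not call MPI, and `contiguous_offset` and `padded_offset` are hypothetical helpers. The padded layout is only one possible arrangement, assumed here for illustration, since with "alloc_shared_noncontig" set to "true" the MPI library is free to place each segment wherever it likes.

```c
#include <assert.h>
#include <stddef.h>

/* Contiguous layout: segment `rank` starts right after segment `rank - 1`,
   so its offset is the sum of the sizes of all lower ranks. */
static size_t contiguous_offset(const size_t *sizes, int rank)
{
    assert(sizes != NULL && rank >= 0);
    size_t offset = 0;
    for (int r = 0; r < rank; r++)
        offset += sizes[r];
    return offset;
}

/* One possible non-contiguous layout (an assumption, not MPI's rule):
   every segment is rounded up to a multiple of `align`, which leaves
   gaps between the segments. */
static size_t padded_offset(const size_t *sizes, int rank, size_t align)
{
    assert(sizes != NULL && rank >= 0 && align > 0);
    size_t offset = 0;
    for (int r = 0; r < rank; r++) {
        size_t padded = (sizes[r] + align - 1) / align * align; /* round up */
        offset += padded;
    }
    return offset;
}
```

In the contiguous case a process can reach another rank's data by plain pointer arithmetic from the first segment's base address. In the non-contiguous case that arithmetic is not reliable, so each segment's base address should be obtained with MPI_Win_shared_query for the rank in question.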
Thanks for the reply.
