Hi,
I'm trying to free a communicator created with this call:
int MPI_Comm_spawn(char *command, char *argv[], int maxprocs,
MPI_Info info, int root, MPI_Comm comm, MPI_Comm *intercomm, int array_of_errcodes[])
The communicator it creates is intercomm.
As far as I know, according to the standard, MPI_Comm_free is a collective operation, although the standard suggests implementing it locally. On Intel MPI, however, it behaves as a collective operation (in my own experience and according to http://software.intel.com/sites/products/documentation/hpc/ics/itac/81/ITC_Reference_Guide/Freeing_Communicators.htm ).
This gives me a problem: the parent (spawning) processes hold a communicator that contains their children, while the spawned (child) processes hold a communicator that contains the parents.
How can I free the parents' communicator with this layout? I know I can create a new communicator containing both children and parents and free that one, but it would not be the communicator I actually want to free.
Thanks in advance,
Hi Florentino,
If you want to free the spawned communicator, simply call MPI_Comm_free from all of the spawning ranks and from all of the spawned ranks. You can call MPI_Comm_free from fewer ranks, but this will only remove the reference in those ranks. The communicator will exist until all references are freed, and it remains usable as long as all necessary references remain.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
Hi,
Okay, I'll test that. In my previous test, however, I called the free in only one of the masters and it would hang. I'll check it tomorrow, freeing from all the nodes. I understand that in the "sons" I have to free the communicator returned by MPI_Comm_get_parent (I'll try this tomorrow).
Thanks,
Interesting. I was able to launch 2 ranks that spawned 2 new ranks. I could free the communicator from parent rank 1 and still send data from parent rank 0 to both child ranks. I was then able to free the remaining communicator from parent rank 0 and from both child ranks. I did not encounter a hang in any scenario. That behavior might differ on Linux*; I was testing on Windows*.
Hi again, thanks for testing that.
You are right, it works perfectly. I also had a few tests in which it worked, but my "real app" (which calls MPI_Comm_spawn beneath a few layers of my own middleware) uses multi-level spawns (a spawned process itself calls MPI_Comm_spawn), so I thought the problem was related to that. In the end I built a more complex test app, and it worked correctly too.
After a few more tests I realised I had a "small" bug in my software: I was passing a NULL argument to MPI_Comm_free, i.e. MPI_Comm_free(NULL), in the second level of spawns (that is, Master --> Spawned LV1 --> Spawn LV2, with the bug in LV2).
I've tested that behaviour in my test app and there is something "very" strange:
1- If I use "export I_MPI_DEBUG=3", MPI_Comm_free(0) HANGS (and leaves quite a few un-killable zombie processes),
2- If I don't use "export I_MPI_DEBUG=3", MPI_Comm_free(0) crashes, execution fails and no zombie processes are left.
You may want to look into this behaviour for correctness, although I understand that the problem was in my (user) code.
Sorry for the inconvenience. After fixing my bug so that the user code is "correct", everything works as expected.
If you want to investigate these issues, I have attached my test file.
mpiicc helloworld_x86.c -o hello.out.x86_64 -mt_mpi (mt_mpi is optional and not needed to reproduce the problem)
export I_MPI_DEBUG=3
export I_MPI_PIN_MODE=mpd
mpirun -n 1 -host $HOST_ADDRESS ./hello.out.x86_64
Regards (all these tests are done on Linux, but bugs here are minor).