Hi all,
I am trying to call MPI from within OpenMP regions, but I cannot get it to work properly. My program compiles fine with mpiicc (4.1.1.036) and icc (13.1.2 20130514), and I checked that it is linked against the thread-safe library (libmpi_mt.so shows up in the ldd output).
But when I run it (2 Ivy Bridge nodes x 2 MPI tasks x 12 OpenMP threads), I get a SIGSEGV without any backtrace:
/opt/softs/intel/impi/4.1.1.036/intel64/bin/mpirun -np 4 -ppn 2 ./mpitest.x
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
Or with the debug level set to 5:
/opt/softs/intel/impi/4.1.1.036/intel64/bin/mpirun -np 4 -ppn 2 ./mpitest.x
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[0] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): shm and dapl data transfer modes
[2] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[3] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[2] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[2] MPI startup(): shm and dapl data transfer modes
[3] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[3] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 90871 beaufix522 {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 1 90872 beaufix522 {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): 2 37690 beaufix523 {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 3 37691 beaufix523 {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_DIST=10,15,15,10
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=mlx4_0:0
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_PIN_MAPPING=2:0 0,1 12
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
Of course, if I use a single OpenMP thread, everything works fine. I also tried wrapping the MPI calls in critical regions; that works, but it is not what I want.
My program is just a small test case to figure out whether I can use this pattern inside a bigger program: for each MPI task, all OpenMP threads are used to send messages to the other tasks, and afterwards all OpenMP threads are used to receive messages from the other tasks.
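To make the pattern concrete, here is a stripped-down sketch of what the test case does (illustrative only, not the exact mpitest.x source; the message contents and the way work is split across threads are simplified):

/* Illustrative sketch of the hybrid pattern, not the actual test case. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int provided, rank, size;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not provided (got %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Attach a buffer large enough for one small message per destination
       per thread, since the sends use MPI_Bsend. */
    int nthreads = omp_get_max_threads();
    int bufsize = (size - 1) * nthreads * (2 * (int)sizeof(int) + MPI_BSEND_OVERHEAD);
    void *bsend_buf = malloc(bufsize);
    MPI_Buffer_attach(bsend_buf, bufsize);

    /* Phase 1: every OpenMP thread sends a small message to every other rank. */
    #pragma omp parallel
    {
        int msg[2] = { rank, omp_get_thread_num() };
        for (int dest = 0; dest < size; dest++)
            if (dest != rank)
                MPI_Bsend(msg, 2, MPI_INT, dest, 0, MPI_COMM_WORLD);
    }

    /* Phase 2: every OpenMP thread receives one message from every other
       rank, so the total number of receives matches the sends. */
    #pragma omp parallel
    {
        int msg[2];
        for (int src = 0; src < size; src++)
            if (src != rank)
                MPI_Recv(msg, 2, MPI_INT, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Buffer_detach(&bsend_buf, &bufsize);
    free(bsend_buf);
    MPI_Finalize();
    return 0;
}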
My questions are:
- does my program conform to the thread support level MPI_THREAD_MULTIPLE (which, by the way, is the level returned by MPI_Init_thread)?
- is Intel MPI supposed to run it correctly?
- if not, will it work someday?
- what can I do now (extra tests, etc.)?
Best regards,
Philippe
Among the simpler possibilities: you may need to allow more stack, either in the shell or via KMP_STACKSIZE, or both.
Hello Tim,
I already had OMP_STACKSIZE=20000M and ulimit -s unlimited; I added KMP_STACKSIZE=20000M and got this:
Fatal error in MPI_Bsend: Internal MPI error!, error stack:
MPI_Bsend(195)..............: MPI_Bsend(buf=0x2ae0f1fff3ec, count=2, MPI_INT, dest=2, tag=0, MPI_COMM_WORLD) failed
MPIR_Bsend_isend(226).......:
MPIR_Bsend_check_active(456):
MPIR_Test_impl(63)..........:
MPIR_Request_complete(227)..: INTERNAL ERROR: unexpected value in case statement (value=0)
APPLICATION TERMINATED WITH THE EXIT STRING: Interrupt (signal 2)
Regards,
Philippe
Hi Philippe,
One solution is to use MPI_Send or MPI_Isend instead of MPI_Bsend. Will either of these work in your program?
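For example, the send phase of the sketch above could be rewritten with MPI_Isend along these lines (just a sketch, reusing the rank, size, and headers from that example):

    /* Sketch only: send phase with MPI_Isend instead of MPI_Bsend, so no
       attached buffer is needed. */
    #pragma omp parallel
    {
        int msg[2] = { rank, omp_get_thread_num() };
        MPI_Request *reqs = malloc((size - 1) * sizeof(MPI_Request));
        int nreq = 0;
        for (int dest = 0; dest < size; dest++)
            if (dest != rank)
                MPI_Isend(msg, 2, MPI_INT, dest, 0, MPI_COMM_WORLD, &reqs[nreq++]);
        /* Each thread waits for completion of its own sends. */
        MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    }

Note that with plain MPI_Send and this send-everything-then-receive-everything structure, larger messages could block until the matching receives are posted, so MPI_Isend is likely the safer drop-in here.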
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
I have never seen a successful setting of KMP_STACKSIZE greater than 40M. OMP_STACKSIZE would be preferable, but it means the same thing here. With 24 threads each given a 20 GB stack, you would need 480 GB per node just for the thread stacks. I haven't seen a system where ulimit -s unlimited could give you that much.