Dear customer,
Your question is more relevant to MPI than MKL, so I will transfer your thread to the MPI forum. Thank you.
Best regards,
Fiona
Fiona Z. (Intel) wrote:
Dear customer,
Your question is more relevant to MPI than MKL, so I will transfer your thread to the MPI forum. Thank you.
Best regards,
Fiona
Thank you!
Hi Dong,
What OS and Intel MPI version are you using? Could you please send me the output of your MPI environment, and the debug results when exporting I_MPI_DEBUG=6? Thanks.
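(For reference, one portable way to capture the exact library version from inside a program, in case the command-line tools differ between the two machines, is the standard MPI-3 call MPI_Get_library_version. A minimal sketch:

#include <mpi.h>
#include <cstdio>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    char version[MPI_MAX_LIBRARY_VERSION_STRING];
    int len = 0;
    MPI_Get_library_version(version, &len);  // standard MPI-3 query
    std::printf("%s\n", version);            // prints the library's own version string
    MPI_Finalize();
    return 0;
}

Compiling this with mpiicpc and running it on each workstation confirms that both really load the same library.)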
Best Regards,
Zhuowei
Si, Zhuowei wrote:
Hi Dong,
What OS and Intel MPI version are you using? Could you please send me the output of your MPI environment, and the debug results when exporting I_MPI_DEBUG=6? Thanks.
Best Regards,
Zhuowei
Hello Zhuowei,
Thanks for your help!
I tested my code on two workstations: one runs Ubuntu 16.04 LTS and the other runs Debian GNU/Linux 8.
Both workstations use Intel(R) MPI Library 2017 Update 2 for Linux.
My MPI environment is set in .bashrc like this:
export PATH=$PATH:/opt/intel/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/mkl/lib/intel64:/opt/intel/lib/intel64
source /opt/intel/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64
export INTEL_LICENSE_FILE=/opt/intel/licenses
This is my C++ code:

#include <mpi.h>
#include <iostream>
#include <unistd.h>
#include <cstdio>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int processor_id_temp;
    MPI_Comm_rank(MPI_COMM_WORLD, &processor_id_temp);
    const int processor_id = processor_id_temp;

    // Fill the broadcast buffer with a message identifying this rank.
    char *const buf = new char[BCAST_SIZE];
    sprintf(buf, "Hello! (from processor id %d)", processor_id);

    // Split COMM_WORLD: rank 0 alone in color 0, all other ranks in color 1.
    const int color = (processor_id > 0 ? 1 : 0);
    MPI_Comm MPI_COMM_TEST;
    MPI_Comm_split(MPI_COMM_WORLD, color, processor_id, &MPI_COMM_TEST);

    // Broadcast the buffer from the root of each sub-communicator.
    MPI_Bcast(buf, BCAST_SIZE, MPI_CHAR, 0, MPI_COMM_TEST);

    usleep(processor_id * 10000);  // stagger the output by rank
    std::cout << "processor id " << processor_id
              << ", color " << color << ": " << buf << std::endl;

    delete[] buf;
    MPI_Finalize();
    return 0;
}
This is the result on the workstation with Ubuntu:

$ export I_MPI_FABRICS=shm
$ export I_MPI_DEBUG=6
$ for size in 32768 131072; do mpiicpc -DBCAST_SIZE=${size} mpi_comm_split.cpp; mpirun -n 3 ./a.out; echo; done
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2 Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation. All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[2] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 3: 0-0 & 0-2147483647
[0] MPI startup(): Allgather: 1: 1-6459 & 0-2147483647
[0] MPI startup(): Allgather: 5: 6460-14628 & 0-2147483647
[0] MPI startup(): Allgather: 1: 14629-25466 & 0-2147483647
[0] MPI startup(): Allgather: 3: 25467-36131 & 0-2147483647
[0] MPI startup(): Allgather: 5: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 1: 0-7199 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-4 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 5-8 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 9-32 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 33-64 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 65-341 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 342-6656 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 6657-8192 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 8193-113595 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 113596-132320 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 132321-1318322 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-25 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 26-37 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 38-1024 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 1025-4096 & 0-2147483647
[0] MPI startup(): Alltoall: 2: 4097-70577 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 1: 0-0 & 0-2147483647
[0] MPI startup(): Bcast: 8: 1-12746 & 0-2147483647
[0] MPI startup(): Bcast: 1: 12747-42366 & 0-2147483647
[0] MPI startup(): Bcast: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 1: 0-0 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 6-128 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 3: 129-89367 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-0 & 0-2147483647
[0] MPI startup(): Reduce: 7: 1-39679 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 1: 0-0 & 0-2147483647
[0] MPI startup(): Scatter: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[1] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[2] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       4440     yd-ws1     {0,4}
[0] MPI startup(): 1       4441     yd-ws1     {1,5}
[0] MPI startup(): 2       4442     yd-ws1     {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
processor id 1, color 1: Hello! (from processor id 1)
processor id 2, color 1: Hello! (from processor id 1)

[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2 Build 20170125 (id: 16752)
[... startup output of the second run is identical to the first, apart from the PIDs ...]
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       4468     yd-ws1     {0,4}
[0] MPI startup(): 1       4469     yd-ws1     {1,5}
[0] MPI startup(): 2       4470     yd-ws1     {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
When BCAST_SIZE=131072, processors 1 and 2 never produce their output (the std::cout line in the code above), and the run had to be killed with Ctrl+C.
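(For reference, one way to test whether the hang really is tied to the message size is to split the large broadcast into pieces no bigger than the size that last worked. This is only a diagnostic sketch, not a fix; chunked_bcast is a hypothetical helper, and the 32768 chunk size is simply the last BCAST_SIZE that completed above:

#include <mpi.h>
#include <algorithm>

// Hypothetical helper: broadcast 'count' chars in fixed-size pieces
// instead of one large MPI_Bcast. If this completes where the single
// large call hangs, the problem is size-dependent.
static void chunked_bcast(char *buf, int count, int root, MPI_Comm comm)
{
    const int CHUNK = 32768;  // last BCAST_SIZE that completed above
    for (int offset = 0; offset < count; offset += CHUNK) {
        const int n = std::min(CHUNK, count - offset);
        MPI_Bcast(buf + offset, n, MPI_CHAR, root, comm);
    }
}

Replacing the single MPI_Bcast call in the program above with chunked_bcast(buf, BCAST_SIZE, 0, MPI_COMM_TEST) keeps the semantics identical while staying under the suspect size.)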
Hi Dong,
Could you please try setting I_MPI_SHM_FBOX/I_MPI_SHM_LMT (https://software.intel.com/en-us/node/528902?language=es) and see whether that helps with the hang?
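(If tuning those variables does not resolve it, it may also help to bisect the message size at which the hang starts. A minimal sketch, assuming the same three-rank shm setup and the same communicator split as the failing program; because of the barrier, the last size printed is the last one that completed on all ranks:

#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Same split as the failing program: rank 0 alone, ranks 1+ together.
    MPI_Comm sub;
    MPI_Comm_split(MPI_COMM_WORLD, rank > 0 ? 1 : 0, rank, &sub);

    // Double the message size until the hang reproduces.
    for (int size = 1024; size <= (1 << 20); size *= 2) {
        std::vector<char> buf(size, 'x');
        MPI_Bcast(&buf[0], size, MPI_CHAR, 0, sub);
        MPI_Barrier(MPI_COMM_WORLD);  // everyone finished this size
        if (rank == 0) {
            std::printf("size %d ok\n", size);
            std::fflush(stdout);
        }
    }

    MPI_Comm_free(&sub);
    MPI_Finalize();
    return 0;
}

The failing size can then be compared against the Bcast algorithm switch points in the I_MPI_DEBUG output above (e.g. the 12747-42366 range) to see which algorithm is involved.)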
Best Regards,
Zhuowei
