Dear customer,
Your question is more relevant to MPI than to MKL, so I will transfer your thread to the MPI forum. Thank you.
Best regards,
Fiona
Fiona Z. (Intel) wrote:
Dear customer,
Your question is more relevant to MPI than to MKL, so I will transfer your thread to the MPI forum. Thank you.
Best regards,
Fiona
Thank you!
Hi Dong,
Which OS and Intel MPI version are you using? Could you please send me the output of your MPI environment, and the debug output produced when exporting I_MPI_DEBUG=6? Thanks.
Best Regards,
Zhuowei
Si, Zhuowei wrote:
Hi Dong,
Which OS and Intel MPI version are you using? Could you please send me the output of your MPI environment, and the debug output produced when exporting I_MPI_DEBUG=6? Thanks.
Best Regards,
Zhuowei
Hello Zhuowei,
Thanks for your help!
I tested my code on two workstations: one runs Ubuntu 16.04 LTS and the other runs Debian GNU/Linux 8.
Both workstations use the Intel(R) MPI Library 2017 Update 2 for Linux.
My MPI environment is set up in .bashrc like this:
export PATH=$PATH:/opt/intel/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/mkl/lib/intel64:/opt/intel/lib/intel64
source /opt/intel/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64
export INTEL_LICENSE_FILE=/opt/intel/licenses
This is my C++ code:
#include <mpi.h>
#include <iostream>
#include <cstdio>   // sprintf
#include <unistd.h> // usleep

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    // Rank of this process in MPI_COMM_WORLD.
    int processor_id_temp;
    MPI_Comm_rank(MPI_COMM_WORLD, &processor_id_temp);
    const int processor_id = processor_id_temp;

    // BCAST_SIZE is passed at compile time via -DBCAST_SIZE=<n>.
    char *const buf = new char[BCAST_SIZE];
    sprintf(buf, "Hello! (from processor id %d)", processor_id);

    // Split MPI_COMM_WORLD: rank 0 gets color 0, all other ranks get color 1.
    const int color = (processor_id > 0 ? 1 : 0);
    MPI_Comm MPI_COMM_TEST;
    MPI_Comm_split(MPI_COMM_WORLD, color, processor_id, &MPI_COMM_TEST);

    // Broadcast the buffer from rank 0 of each split communicator.
    MPI_Bcast(buf, BCAST_SIZE, MPI_CHAR, 0, MPI_COMM_TEST);

    // Stagger the output so lines from different ranks do not interleave.
    usleep(processor_id * 10000);
    std::cout << "processor id " << processor_id
              << ", color " << color
              << ": " << buf << std::endl;

    delete[] buf;
    MPI_Finalize();
    return 0;
}
This is the result on the workstation with Ubuntu:
$ export I_MPI_FABRICS=shm
$ export I_MPI_DEBUG=6
$ for size in 32768 131072; do mpiicpc -DBCAST_SIZE=${size} mpi_comm_split.cpp; mpirun -n 3 ./a.out; echo; done
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2 Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation. All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[2] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 3: 0-0 & 0-2147483647
[0] MPI startup(): Allgather: 1: 1-6459 & 0-2147483647
[0] MPI startup(): Allgather: 5: 6460-14628 & 0-2147483647
[0] MPI startup(): Allgather: 1: 14629-25466 & 0-2147483647
[0] MPI startup(): Allgather: 3: 25467-36131 & 0-2147483647
[0] MPI startup(): Allgather: 5: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 1: 0-7199 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-4 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 5-8 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 9-32 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 33-64 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 65-341 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 342-6656 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 6657-8192 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 8193-113595 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 113596-132320 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 132321-1318322 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-25 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 26-37 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 38-1024 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 1025-4096 & 0-2147483647
[0] MPI startup(): Alltoall: 2: 4097-70577 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 1: 0-0 & 0-2147483647
[0] MPI startup(): Bcast: 8: 1-12746 & 0-2147483647
[0] MPI startup(): Bcast: 1: 12747-42366 & 0-2147483647
[0] MPI startup(): Bcast: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 1: 0-0 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 6-128 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 3: 129-89367 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-0 & 0-2147483647
[0] MPI startup(): Reduce: 7: 1-39679 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 1: 0-0 & 0-2147483647
[0] MPI startup(): Scatter: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[1] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[2] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 4440 yd-ws1 {0,4}
[0] MPI startup(): 1 4441 yd-ws1 {1,5}
[0] MPI startup(): 2 4442 yd-ws1 {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
processor id 1, color 1: Hello! (from processor id 1)
processor id 2, color 1: Hello! (from processor id 1)
This is the output of the second iteration (BCAST_SIZE=131072):
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2 Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation. All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[2] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 3: 0-0 & 0-2147483647
[0] MPI startup(): Allgather: 1: 1-6459 & 0-2147483647
[0] MPI startup(): Allgather: 5: 6460-14628 & 0-2147483647
[0] MPI startup(): Allgather: 1: 14629-25466 & 0-2147483647
[0] MPI startup(): Allgather: 3: 25467-36131 & 0-2147483647
[0] MPI startup(): Allgather: 5: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 1: 0-7199 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-4 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 5-8 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 9-32 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 33-64 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 65-341 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 342-6656 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 6657-8192 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 8193-113595 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 113596-132320 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 132321-1318322 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-25 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 26-37 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 38-1024 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 1025-4096 & 0-2147483647
[0] MPI startup(): Alltoall: 2: 4097-70577 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 1: 0-0 & 0-2147483647
[0] MPI startup(): Bcast: 8: 1-12746 & 0-2147483647
[0] MPI startup(): Bcast: 1: 12747-42366 & 0-2147483647
[0] MPI startup(): Bcast: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 1: 0-0 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 6-128 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 3: 129-89367 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-0 & 0-2147483647
[0] MPI startup(): Reduce: 7: 1-39679 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 1: 0-0 & 0-2147483647
[0] MPI startup(): Scatter: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[1] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[2] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 4468 yd-ws1 {0,4}
[0] MPI startup(): 1 4469 yd-ws1 {1,5}
[0] MPI startup(): 2 4470 yd-ws1 {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
When BCAST_SIZE=131072, ranks 1 and 2 never produced any output (the std::cout near the end of the code), and I had to stop them with Ctrl+C.
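One possible workaround, assuming the hang is tied to the broadcast size crossing an internal shared-memory threshold, would be to split the large broadcast into smaller chunks. The sketch below is illustrative only; the helper name bcast_chunked and the 8192-byte chunk size are assumptions, not anything from Intel's documentation:

#include <mpi.h>
#include <algorithm>

// Broadcast 'count' chars in fixed-size chunks instead of one large message.
void bcast_chunked(char *buf, int count, int root, MPI_Comm comm)
{
    const int CHUNK = 8192; // assumed chunk size; tune for your system
    for (int offset = 0; offset < count; offset += CHUNK)
    {
        const int n = std::min(CHUNK, count - offset);
        MPI_Bcast(buf + offset, n, MPI_CHAR, root, comm);
    }
}

Replacing the single MPI_Bcast call with bcast_chunked(buf, BCAST_SIZE, 0, MPI_COMM_TEST) keeps each individual message below the size at which the hang appears.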
Hi Dong,
Could you please try setting I_MPI_SHM_FBOX/I_MPI_SHM_LMT (https://software.intel.com/en-us/node/528902?language=es)? Does this help with the hang?
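For example, something like the following (the values here are illustrative only; please check the linked page for the settings your Intel MPI version supports):

$ export I_MPI_SHM_LMT=direct
$ export I_MPI_SHM_FBOX=off
$ mpirun -n 3 ./a.out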
Best Regards,
Zhuowei