Dear customer,
Your question is more relevant to MPI than to MKL. I will transfer your thread to the MPI forum zone. Thank you.
Best regards,
	Fiona
Fiona Z. (Intel) wrote:
Dear customer,
Your question is more relevant to MPI than to MKL. I will transfer your thread to the MPI forum zone. Thank you.
Best regards,
Fiona
Thank you!
Hi Dong,
What is your OS and Intel MPI version? Could you please send me the output of your MPI environment, and the debug results when exporting I_MPI_DEBUG=6? Thanks.
Best Regards,
Zhuowei
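For example, the requested information can be collected along these lines (./a.out standing in for the actual test binary):
$ env | grep I_MPI       # show any Intel MPI variables already set
$ export I_MPI_DEBUG=6   # enable verbose startup diagnostics
$ mpirun -n 3 ./a.out    # rerun the failing case and capture the output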
Si, Zhuowei wrote:
Hi Dong,
What is your OS and Intel MPI version? Could you please send me the output of your MPI environment, and the debug results when exporting I_MPI_DEBUG=6? Thanks.
Best Regards,
Zhuowei
Hello Zhuowei,
Thanks for your help!
I tested my code on two workstations. One runs Ubuntu 16.04 LTS and the other runs Debian GNU/Linux 8.
The Intel MPI version on both workstations is Intel(R) MPI Library 2017 Update 2 for Linux.
My MPI environment is set in .bashrc like this:
export PATH=$PATH:/opt/intel/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/mkl/lib/intel64:/opt/intel/lib/intel64
source /opt/intel/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64
export INTEL_LICENSE_FILE=/opt/intel/licenses
This is my C++ code:
#include <mpi.h>
#include <iostream>
#include <cstdio>    // std::sprintf
#include <unistd.h>  // usleep

// BCAST_SIZE is supplied on the compile line, e.g. mpiicpc -DBCAST_SIZE=32768
int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);

  int processor_id_temp;
  MPI_Comm_rank(MPI_COMM_WORLD, &processor_id_temp);
  const int processor_id = processor_id_temp;

  char *const buf = new char[BCAST_SIZE];
  std::sprintf(buf, "Hello! (from processor id %d)", processor_id);

  // Split MPI_COMM_WORLD: rank 0 goes alone into color 0,
  // all other ranks go into color 1.
  const int color = (processor_id > 0 ? 1 : 0);
  MPI_Comm MPI_COMM_TEST;
  MPI_Comm_split(MPI_COMM_WORLD, color, processor_id, &MPI_COMM_TEST);

  // Broadcast from rank 0 of each new communicator. In the color-1
  // communicator that root is world rank 1, so rank 2 prints rank 1's message.
  MPI_Bcast(buf, BCAST_SIZE, MPI_CHAR, 0, MPI_COMM_TEST);

  usleep(processor_id * 10000);  // stagger the output slightly

  std::cout << "processor id " << processor_id
            << ", color " << color
            << ": " << buf << std::endl;

  delete[] buf;
  MPI_Comm_free(&MPI_COMM_TEST);
  MPI_Finalize();
  return 0;
}
This is the result on the workstation with Ubuntu:
$ export I_MPI_FABRICS=shm
$ export I_MPI_DEBUG=6
$ for size in 32768 131072; do mpiicpc -DBCAST_SIZE=${size} mpi_comm_split.cpp; mpirun -n 3 ./a.out; echo; done
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2  Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[2] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 3: 0-0 & 0-2147483647
[0] MPI startup(): Allgather: 1: 1-6459 & 0-2147483647
[0] MPI startup(): Allgather: 5: 6460-14628 & 0-2147483647
[0] MPI startup(): Allgather: 1: 14629-25466 & 0-2147483647
[0] MPI startup(): Allgather: 3: 25467-36131 & 0-2147483647
[0] MPI startup(): Allgather: 5: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 1: 0-7199 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-4 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 5-8 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 9-32 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 33-64 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 65-341 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 342-6656 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 6657-8192 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 8193-113595 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 113596-132320 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 132321-1318322 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-25 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 26-37 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 38-1024 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 1025-4096 & 0-2147483647
[0] MPI startup(): Alltoall: 2: 4097-70577 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 1: 0-0 & 0-2147483647
[0] MPI startup(): Bcast: 8: 1-12746 & 0-2147483647
[0] MPI startup(): Bcast: 1: 12747-42366 & 0-2147483647
[0] MPI startup(): Bcast: 7: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 1: 0-0 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 6-128 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 3: 129-89367 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-0 & 0-2147483647
[0] MPI startup(): Reduce: 7: 1-39679 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 1: 0-0 & 0-2147483647
[0] MPI startup(): Scatter: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[1] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[2] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       4440     yd-ws1     {0,4}
[0] MPI startup(): 1       4441     yd-ws1     {1,5}
[0] MPI startup(): 2       4442     yd-ws1     {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
processor id 1, color 1: Hello! (from processor id 1)
processor id 2, color 1: Hello! (from processor id 1)
[MPI startup output for the second run is identical to the first run and omitted]
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       4468     yd-ws1     {0,4}
[0] MPI startup(): 1       4469     yd-ws1     {1,5}
[0] MPI startup(): 2       4470     yd-ws1     {2,6}
[0] MPI startup(): Recognition=2 Platform(code=32 ippn=2 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=6
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=1
[0] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
processor id 0, color 0: Hello! (from processor id 0)
When BCAST_SIZE=131072, processors 1 and 2 never produced their output (the std::cout statement near the end of the code), and the run had to be stopped with Ctrl+C.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Dong,
Could you please try setting I_MPI_SHM_FBOX/I_MPI_SHM_LMT (https://software.intel.com/en-us/node/528902?language=es)? Does this help with the hang?
Best Regards,
Zhuowei
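For reference, a minimal sketch of how those variables could be set before rerunning the failing case, assuming the on/off and shm/direct/no values described in the linked reference:
$ export I_MPI_SHM_FBOX=off   # assumption: disable the shared-memory fastbox path
$ export I_MPI_SHM_LMT=shm    # or "direct" / "no"; selects the large-message transfer mechanism
$ mpirun -n 3 ./a.out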