Hi,
I am struggling with a simple MPI program when a vector derived datatype is used as the origin/target datatype in an MPI_Put RMA operation. The underlying idea is quite simple: a contiguous piece of memory is interpreted as a 2D array, and only a certain slice of it is updated via MPI_Put. To this end, I am using an MPI_Type_vector call. Everything works fine when the stride value is <= 24000, regardless of the count and blocklength. However, if the stride is greater than 24000, the program simply hangs. The source code is attached, and you can verify the issue by compiling and running it with the following parameters (a minimal sketch of the pattern is also given after the run examples):
mpicc mpi_tvec_rma.c
mpirun -np 2 ./a.out 10000 2000
Ok
mpirun -np 2 ./a.out 20000 2000
Ok
mpirun -np 2 ./a.out 30000 2000
Hangs
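For reference, the essence of the pattern is roughly the following. This is a simplified sketch, not the attached file itself; the double element type, the MPI_Win_fence synchronization, and the interpretation of the two arguments as stride and count are assumptions made for illustration:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, stride, count, blocklength;
    size_t i, nelems;
    double *buf;
    MPI_Win win;
    MPI_Datatype vec;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Assumed argument meanings: argv[1] = stride (row length of the 2D view),
       argv[2] = count (number of rows touched by the slice). */
    stride      = atoi(argv[1]);
    count       = atoi(argv[2]);
    blocklength = 1;                     /* one element per row in this sketch */

    nelems = (size_t)stride * (size_t)count;
    buf = malloc(nelems * sizeof(double));
    for (i = 0; i < nelems; i++)
        buf[i] = (double)rank;

    /* Expose the whole buffer as an RMA window. */
    MPI_Win_create(buf, (MPI_Aint)(nelems * sizeof(double)), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Vector type selecting blocklength elements out of every 'stride'
       elements, i.e. one "column" slice of the 2D view. */
    MPI_Type_vector(count, blocklength, stride, MPI_DOUBLE, &vec);
    MPI_Type_commit(&vec);

    MPI_Win_fence(0, win);
    if (rank == 0)
        /* Same derived type used as both origin and target datatype. */
        MPI_Put(buf, 1, vec, 1, 0, 1, vec, win);
    MPI_Win_fence(0, win);

    if (rank == 1)
        printf("rank 1: buf[0] = %g (expected 0)\n", buf[0]);

    MPI_Type_free(&vec);
    MPI_Win_free(&win);
    free(buf);
    MPI_Finalize();
    return 0;
}

In this sketch, rank 0 writes a strided column into rank 1's window, i.e. the same MPI_Put + MPI_Type_vector combination that triggers the hang for stride > 24000 in the attached reproducer.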
So I am really puzzled about what is happening here, and any help will be greatly appreciated!
With best regards,
Victor.
P.S. The attached code runs perfectly with OpenMPI v1.8.2.
P.P.S. Some details about the hardware and software setup used are given below:
icc Version 14.0.3.174
impi Version 4.1.3.048
I_MPI_DEBUG=10
[0] MPI startup(): Intel(R) MPI Library, Version 4.1 Update 3 Build 20140124
[0] MPI startup(): Copyright (C) 2003-2014 Intel Corporation. All rights reserved.
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-259847 & 0-2147483647
[0] MPI startup(): Allgatherv: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 0-1536 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 1536-2194 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 2194-34792 & 0-2147483647
[0] MPI startup(): Allreduce: 4: 34792-121510 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 121510-145618 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 145618-668210 & 0-2147483647
[0] MPI startup(): Allreduce: 7: 668210-1546854 & 0-2147483647
[0] MPI startup(): Allreduce: 4: 1546854-2473237 & 0-2147483647
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-117964 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 117965-3131275 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 3: 1-921 & 0-2147483647
[0] MPI startup(): Gather: 1: 922-3027 & 0-2147483647
[0] MPI startup(): Gather: 3: 3028-5071 & 0-2147483647
[0] MPI startup(): Gather: 2: 5072-11117 & 0-2147483647
[0] MPI startup(): Gather: 1: 11118-86016 & 0-2147483647
[0] MPI startup(): Gather: 3: 86017-283989 & 0-2147483647
[0] MPI startup(): Gather: 1: 283990-664950 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 0-6 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[1] MPI startup(): Recognition=2 Platform(code=8 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 16924 n3 {0,1,2,3,4,5,6,7}
[0] MPI startup(): 1 16925 n3 {8,9,10,11,12,13,14,15}
[0] MPI startup(): Recognition=2 Platform(code=8 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): I_MPI_DEBUG=10
[0] MPI startup(): I_MPI_INFO_BRAND=Intel(R) Xeon(R)
[0] MPI startup(): I_MPI_INFO_CACHE1=0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23
[0] MPI startup(): I_MPI_INFO_CACHE2=0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23
[0] MPI startup(): I_MPI_INFO_CACHE3=0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_CACHES=3
[0] MPI startup(): I_MPI_INFO_CACHE_SHARE=2,2,32
[0] MPI startup(): I_MPI_INFO_CACHE_SIZE=32768,262144,20971520
[0] MPI startup(): I_MPI_INFO_CORE=0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7
[0] MPI startup(): I_MPI_INFO_C_NAME=Unknown
[0] MPI startup(): I_MPI_INFO_DESC=1342177285
[0] MPI startup(): I_MPI_INFO_FLGB=0
[0] MPI startup(): I_MPI_INFO_FLGC=532603903
[0] MPI startup(): I_MPI_INFO_FLGD=-1075053569
[0] MPI startup(): I_MPI_INFO_LCPU=16
[0] MPI startup(): I_MPI_INFO_MODE=263
[0] MPI startup(): I_MPI_INFO_PACK=0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_SERIAL=E5-2660 0
[0] MPI startup(): I_MPI_INFO_SIGN=132823
[0] MPI startup(): I_MPI_INFO_STATE=0
[0] MPI startup(): I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
[0] MPI startup(): I_MPI_INFO_VEND=1
[0] MPI startup(): I_MPI_PIN_INFO=x0,1,2,3,4,5,6,7
[0] MPI startup(): I_MPI_PIN_MAPPING=2:0 0,1 8
Hi,
JFYI, there is a corresponding bug in MPICH, and MPICH2/MPICH3 also suffer from this problem:
http://trac.mpich.org/projects/mpich/ticket/2189
Best,
Victor.
