Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Code hangs when variable values increase

Wee_Beng_T_
Beginner

Hi,

I have the latest Intel MPI 5.1, together with the Intel Fortran compiler, on my own Ubuntu Linux machine. I tried to run my code, but it hangs. It was working fine on different clusters before.

I realised the problem lies with MPI_BCAST, so I wrote a very simple test program:

program mpi_bcast_test

implicit none

include 'mpif.h'

integer :: no_vertices, no_surfaces, size, myid, ierr, status

integer, allocatable :: tmp_mpi_data1(:)

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

! Only rank 0 knows the sizes initially.
if (myid==0) then

    no_vertices = 1554

    no_surfaces = 3104

end if

! Broadcast the sizes to all ranks.
call MPI_BCAST(no_surfaces,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

call MPI_BCAST(no_vertices,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

allocate (tmp_mpi_data1(3*no_surfaces+11*no_vertices+1), STAT=status)

tmp_mpi_data1 = 0

if (myid==0) tmp_mpi_data1 = 100

! Broadcast the array itself; this is the call that hangs.
call MPI_BCAST(tmp_mpi_data1,3*no_surfaces+11*no_vertices+1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

print *, "myid,tmp_mpi_data1(2)",myid,tmp_mpi_data1(2)

call MPI_FINALIZE(ierr)

end program mpi_bcast_test

If I run it as is, it hangs at:

call MPI_BCAST(tmp_mpi_data1,3*no_surfaces+11*no_vertices+1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

But if I change the values of no_vertices and no_surfaces to small values, like 1 or 2, it works without a problem.

I wonder why. Is this a bug in Intel MPI 5.1, or a problem on my end?

Thanks

James_T_Intel
Moderator

Everything here looks fine, and I am able to run it on the latest version.  Please provide output from the following:

which mpirun
which mpiifort
mpirun -V
mpirun -n 2 -genv I_MPI_DEBUG 1000 <program>
ldd <program>

Wee_Beng_T_
Beginner

Hi,

Here are the outputs:

:~/myprojects/mpi_bcast_test$ which mpirun
/opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/bin/mpirun
:~/myprojects/mpi_bcast_test$ which mpiifort
/opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/bin/mpiifort
:~/myprojects/mpi_bcast_test$ mpirun -V
Intel(R) MPI Library for Linux* OS, Version 5.1.2 Build 20151015 (build id: 13147)
Copyright (C) 2003-2015, Intel Corporation. All rights reserved.
:~/myprojects/mpi_bcast_test$ ls
a.out  mpi_bcast_test.f90  mpi_bcast_test.sln  mpi_bcast_test.suo  mpi_bcast_test.vfproj  ReadMe.txt
:~/myprojects/mpi_bcast_test$ rm a.out
:~/myprojects/mpi_bcast_test$ mpiifort mpi_bcast_test.f90

:~/myprojects/mpi_bcast_test$ mpirun -n 2 -genv I_MPI_DEBUG 1000 ./a.out
[0] MPI startup(): Intel(R) MPI Library, Version 5.1.2  Build 20151015 (build id: 13147)
[0] MPI startup(): Copyright (C) 2003-2015 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPID_nem_impi_create_numa_nodes_map(): Fetching extra numa information from /etc/ofed-mic.map
[1] MPID_nem_impi_create_numa_nodes_map(): Fetching extra numa information from /etc/ofed-mic.map
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[0] MPI startup(): Recognition mode: 2, selected platform: 16 own platform: 16
[1] MPI startup(): Recognition mode: 2, selected platform: 16 own platform: 16
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-259847 & 0-2147483647
[0] MPI startup(): Allgatherv: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 0-32768 & 0-2147483647
[0] MPI startup(): Allreduce: 8: 32769-101386 & 0-2147483647
[0] MPI startup(): Allreduce: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-117964 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 117965-3131275 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 3: 1-921 & 0-2147483647
[0] MPI startup(): Gather: 1: 922-3027 & 0-2147483647
[0] MPI startup(): Gather: 3: 3028-5071 & 0-2147483647
[0] MPI startup(): Gather: 2: 5072-11117 & 0-2147483647
[0] MPI startup(): Gather: 1: 11118-86016 & 0-2147483647
[0] MPI startup(): Gather: 3: 86017-283989 & 0-2147483647
[0] MPI startup(): Gather: 1: 283990-664950 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 0-6 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Rank    Pid      Node name                 Pin cpu
[0] MPI startup(): 0       3874     Precision-T7610  {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 1       3875     Precision-T7610  {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): Recognition=2 Platform(code=16 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Topology split mode = 1

| rank | node | space=1
|  0  |  0  |
|  1  |  0  |
[1] MPI startup(): Recognition=2 Platform(code=16 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[1] Bcast(): algo #0 is selected
[0] MPI startup(): I_MPI_DEBUG=1000
[0] MPI startup(): I_MPI_INFO_BRAND=Intel(R) Xeon(R)
[0] MPI startup(): I_MPI_INFO_CACHE1=0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29,0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29
[0] MPI startup(): I_MPI_INFO_CACHE2=0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29,0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29
[0] MPI startup(): I_MPI_INFO_CACHE3=0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_CACHES=3
[0] MPI startup(): I_MPI_INFO_CACHE_SHARE=2,2,32
[0] MPI startup(): I_MPI_INFO_CACHE_SIZE=32768,262144,31457280
[0] MPI startup(): I_MPI_INFO_CORE=0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13
[0] MPI startup(): I_MPI_INFO_C_NAME=Unknown
[0] MPI startup(): I_MPI_INFO_DESC=1342177285
[0] MPI startup(): I_MPI_INFO_FLGB=641
[0] MPI startup(): I_MPI_INFO_FLGC=2143216639
[0] MPI startup(): I_MPI_INFO_FLGCEXT=0
[0] MPI startup(): I_MPI_INFO_FLGD=-1075053569
[0] MPI startup(): I_MPI_INFO_FLGDEXT=0
[0] MPI startup(): I_MPI_INFO_LCPU=48
[0] MPI startup(): I_MPI_INFO_MODE=775
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_DIST=10,20,20,10
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_INFO_PACK=0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_SERIAL=E5-2697 v2
[0] MPI startup(): I_MPI_INFO_SIGN=198372
[0] MPI startup(): I_MPI_INFO_STATE=0
[0] MPI startup(): I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_VEND=1
[0] MPI startup(): I_MPI_PIN_INFO=x0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35
[0] MPI startup(): I_MPI_PIN_MAPPING=2:0 0,1 12
[0] Bcast(): algo #0 is selected
[0] Bcast(): algo #0 is selected
[1] Bcast(): algo #0 is selected
 myid,no_surfaces,no_vertices           0        3104        1554
 myid,no_surfaces,no_vertices           1        3104        1554
[0] Bcast(): algo #0 is selected


[1] Bcast(): algo #0 is selected  -- HANGS!


 Sending Ctrl-C to processes as requested
[mpiexec@ Press Ctrl-C again to force abort
forrtl: error (69): process interrupted (SIGINT)
Image              PC                Routine            Line        Source             
a.out              0000000000477935  Unknown               Unknown  Unknown
a.out              00000000004756F7  Unknown               Unknown  Unknown
a.out              0000000000444FE4  Unknown               Unknown  Unknown
a.out              0000000000444DF6  Unknown               Unknown  Unknown
a.out              0000000000425EF6  Unknown               Unknown  Unknown
a.out              0000000000403D2E  Unknown               Unknown  Unknown
libpthread.so.0    00007F5279AAED10  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1DF34F  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A2EAC29  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A2EAF65  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1C5B0D  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1C8AEA  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1C7DF4  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1CB5DB  Unknown               Unknown  Unknown
libmpi.so.12       00007F527A1CAFEE  Unknown               Unknown  Unknown
libmpifort.so.12   00007F527A8DF573  Unknown               Unknown  Unknown
a.out              00000000004032AC  Unknown               Unknown  Unknown
a.out              0000000000402F2E  Unknown               Unknown  Unknown
libc.so.6          00007F52793ECA40  Unknown               Unknown  Unknown
a.out              0000000000402E29  Unknown               Unknown  Unknown
forrtl: error (69): process interrupted (SIGINT)
Image              PC                Routine            Line        Source             
a.out              0000000000477935  Unknown               Unknown  Unknown
a.out              00000000004756F7  Unknown               Unknown  Unknown
a.out              0000000000444FE4  Unknown               Unknown  Unknown
a.out              0000000000444DF6  Unknown               Unknown  Unknown
a.out              0000000000425EF6  Unknown               Unknown  Unknown
a.out              0000000000403D2E  Unknown               Unknown  Unknown
libpthread.so.0    00007FB87DEB4D10  Unknown               Unknown  Unknown
libc.so.6          00007FB87D8D35A9  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E792512  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E7913CD  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5E596E  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E6F0C29  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E6F126A  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5CBD5D  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5CEAEA  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5CDDF4  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5D15DB  Unknown               Unknown  Unknown
libmpi.so.12       00007FB87E5D0FEE  Unknown               Unknown  Unknown
libmpifort.so.12   00007FB87ECE5573  Unknown               Unknown  Unknown
a.out              00000000004032AC  Unknown               Unknown  Unknown
a.out              0000000000402F2E  Unknown               Unknown  Unknown
libc.so.6          00007FB87D7F2A40  Unknown               Unknown  Unknown
a.out              0000000000402E29  Unknown               Unknown  Unknown

~/myprojects/mpi_bcast_test$ ldd ./a.out
    linux-vdso.so.1 =>  (0x00007ffe27772000)
    libmpifort.so.12 => /opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f50fa8d9000)
    libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f50fa117000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f50f9ef8000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f50f9cef000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f50f9ad1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f50f97c9000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f50f93fe000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f50f91e7000)
    /lib64/ld-linux-x86-64.so.2 (0x000055d897fbe000)


Thanks for the help.

Wee_Beng_T_
Beginner

Hi,

These are my outputs below; it hangs at [1] Bcast(): algo #0 is selected.

Hope you can help. Thanks!

 @ -Precision-T7610:~$ which mpirun
/opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/bin/mpirun
 @ -Precision-T7610:~$ which mpiifort
/opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/bin/mpiifort
 @ -Precision-T7610:~$ mpirun -V
Intel(R) MPI Library for Linux* OS, Version 5.1.2 Build 20151015 (build id: 13147)
Copyright (C) 2003-2015, Intel Corporation. All rights reserved.
 @ -Precision-T7610:~$ cd myprojects/mpi_bcast_test/
 @ -Precision-T7610:~/myprojects/mpi_bcast_test$ mpiifort mpi_bcast_test.f90
 @ -Precision-T7610:~/myprojects/mpi_bcast_test$ mpirun -n 2 -genv I_MPI_DEBUG 1000 ./a.out
[0] MPI startup(): Intel(R) MPI Library, Version 5.1.2  Build 20151015 (build id: 13147)
[0] MPI startup(): Copyright (C) 2003-2015 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPID_nem_impi_create_numa_nodes_map(): Fetching extra numa information from /etc/ofed-mic.map
[1] MPID_nem_impi_create_numa_nodes_map(): Fetching extra numa information from /etc/ofed-mic.map
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[0] MPI startup(): Recognition mode: 2, selected platform: 16 own platform: 16
[1] MPI startup(): Recognition mode: 2, selected platform: 16 own platform: 16
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allgatherv: 3: 0-259847 & 0-2147483647
[0] MPI startup(): Allgatherv: 4: 0-2147483647 & 0-2147483647
[0] MPI startup(): Allreduce: 1: 0-32768 & 0-2147483647
[0] MPI startup(): Allreduce: 8: 32769-101386 & 0-2147483647
[0] MPI startup(): Allreduce: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-117964 & 0-2147483647
[0] MPI startup(): Alltoall: 4: 117965-3131275 & 0-2147483647
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Barrier: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Bcast: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gather: 3: 1-921 & 0-2147483647
[0] MPI startup(): Gather: 1: 922-3027 & 0-2147483647
[0] MPI startup(): Gather: 3: 3028-5071 & 0-2147483647
[0] MPI startup(): Gather: 2: 5072-11117 & 0-2147483647
[0] MPI startup(): Gather: 1: 11118-86016 & 0-2147483647
[0] MPI startup(): Gather: 3: 86017-283989 & 0-2147483647
[0] MPI startup(): Gather: 1: 283990-664950 & 0-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-2147483647
[0] MPI startup(): Gatherv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 1: 0-6 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatter: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 & 0-2147483647
[0] MPI startup(): Rank    Pid      Node name                 Pin cpu
[0] MPI startup(): 0       2825      -Precision-T7610  {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 1       2826      -Precision-T7610  {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): Recognition=2 Platform(code=16 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Topology split mode = 1

| rank | node | space=1
|  0  |  0  |
|  1  |  0  |
[1] MPI startup(): Recognition=2 Platform(code=16 ippn=1 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[1] Bcast(): algo #0 is selected
[0] MPI startup(): I_MPI_DEBUG=1000
[0] MPI startup(): I_MPI_INFO_BRAND=Intel(R) Xeon(R)
[0] MPI startup(): I_MPI_INFO_CACHE1=0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29,0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29
[0] MPI startup(): I_MPI_INFO_CACHE2=0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29,0,1,2,3,4,5,8,9,10,11,12,13,16,17,18,19,20,21,24,25,26,27,28,29
[0] MPI startup(): I_MPI_INFO_CACHE3=0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_CACHES=3
[0] MPI startup(): I_MPI_INFO_CACHE_SHARE=2,2,32
[0] MPI startup(): I_MPI_INFO_CACHE_SIZE=32768,262144,31457280
[0] MPI startup(): I_MPI_INFO_CORE=0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13,0,1,2,3,4,5,8,9,10,11,12,13
[0] MPI startup(): I_MPI_INFO_C_NAME=Unknown
[0] MPI startup(): I_MPI_INFO_DESC=1342177285
[0] MPI startup(): I_MPI_INFO_FLGB=641
[0] MPI startup(): I_MPI_INFO_FLGC=2143216639
[0] MPI startup(): I_MPI_INFO_FLGCEXT=0
[0] MPI startup(): I_MPI_INFO_FLGD=-1075053569
[0] MPI startup(): I_MPI_INFO_FLGDEXT=0
[0] MPI startup(): I_MPI_INFO_LCPU=48
[0] MPI startup(): I_MPI_INFO_MODE=775
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_DIST=10,20,20,10
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_INFO_PACK=0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_SERIAL=E5-2697 v2
[0] MPI startup(): I_MPI_INFO_SIGN=198372
[0] MPI startup(): I_MPI_INFO_STATE=0
[0] MPI startup(): I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_VEND=1
[0] MPI startup(): I_MPI_PIN_INFO=x0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35
[0] MPI startup(): I_MPI_PIN_MAPPING=2:0 0,1 12
[0] Bcast(): algo #0 is selected
[0] Bcast(): algo #0 is selected
[1] Bcast(): algo #0 is selected
 myid,no_surfaces,no_vertices           0        3104        1554
 myid,no_surfaces,no_vertices           1        3104        1554
[0] Bcast(): algo #0 is selected
[1] Bcast(): algo #0 is selected - HANGS!!!
^C[mpiexec@ -Precision-T7610] Sending Ctrl-C to processes as requested
[mpiexec@ -Precision-T7610] Press Ctrl-C again to force abort
forrtl: error (69): process interrupted (SIGINT)
Image              PC                Routine            Line        Source             
a.out              0000000000477935  Unknown               Unknown  Unknown
a.out              00000000004756F7  Unknown               Unknown  Unknown
a.out              0000000000444FE4  Unknown               Unknown  Unknown
a.out              0000000000444DF6  Unknown               Unknown  Unknown
a.out              0000000000425EF6  Unknown               Unknown  Unknown
a.out              0000000000403D2E  Unknown               Unknown  Unknown
libpthread.so.0    00007FD727AA8D10  Unknown               Unknown  Unknown
libc.so.6          00007FD7274B0A77  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281D9598  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7282E4C29  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7282E4F65  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281BFB0D  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281C2AEA  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281C1DF4  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281C55DB  Unknown               Unknown  Unknown
libmpi.so.12       00007FD7281C4FEE  Unknown               Unknown  Unknown
libmpifort.so.12   00007FD7288D9573  Unknown               Unknown  Unknown
a.out              00000000004032AC  Unknown               Unknown  Unknown
a.out              0000000000402F2E  Unknown               Unknown  Unknown
libc.so.6          00007FD7273E6A40  Unknown               Unknown  Unknown
a.out              0000000000402E29  Unknown               Unknown  Unknown
forrtl: error (69): process interrupted (SIGINT)
Image              PC                Routine            Line        Source             
a.out              0000000000477935  Unknown               Unknown  Unknown
a.out              00000000004756F7  Unknown               Unknown  Unknown
a.out              0000000000444FE4  Unknown               Unknown  Unknown
a.out              0000000000444DF6  Unknown               Unknown  Unknown
a.out              0000000000425EF6  Unknown               Unknown  Unknown
a.out              0000000000403D2E  Unknown               Unknown  Unknown
libpthread.so.0    00007FD46B3BED10  Unknown               Unknown  Unknown
libc.so.6          00007FD46ADDD5A9  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BC9C512  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BC9B3CD  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BAEF96E  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BBFAC29  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BBFB26A  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BAD5D5D  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BAD8AEA  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BAD7DF4  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BADB5DB  Unknown               Unknown  Unknown
libmpi.so.12       00007FD46BADAFEE  Unknown               Unknown  Unknown
libmpifort.so.12   00007FD46C1EF573  Unknown               Unknown  Unknown
a.out              00000000004032AC  Unknown               Unknown  Unknown
a.out              0000000000402F2E  Unknown               Unknown  Unknown
libc.so.6          00007FD46ACFCA40  Unknown               Unknown  Unknown
a.out              0000000000402E29  Unknown               Unknown  Unknown

 @ -Precision-T7610:~/myprojects/mpi_bcast_test$ ldd ./a.out
    linux-vdso.so.1 =>  (0x00007ffc421e8000)
    libmpifort.so.12 => /opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f8aece20000)
    libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f8aec65e000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8aec43f000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f8aec236000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8aec018000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8aebd10000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8aeb945000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8aeb72e000)
    /lib64/ld-linux-x86-64.so.2 (0x0000561b3e676000)
 @ -Precision-T7610:~/myprojects/mpi_bcast_test$

James_T_Intel
Moderator

I don't immediately see anything indicating why it is hanging.  How did you compile your program?  Try running this and send the output:

mpirun -n 2 IMB-MPI1 Bcast

Wee_Beng_T_
Beginner

Hi James,

Btw, I tried the same code with MPICH and it worked.

Here's the output:

#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 4.1 Update 1, MPI-1 part    
#------------------------------------------------------------
# Date                  : Wed Dec  2 09:47:15 2015
# Machine               : x86_64
# System                : Linux
# Release               : 4.2.0-18-generic
# Version               : #22-Ubuntu SMP Fri Nov 6 18:25:50 UTC 2015
# MPI Version           : 3.0
# MPI Thread Environment:

# New default behavior from Version 3.2 on:

# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time
 


# Calling sequence was:

# IMB-MPI1 Bcast

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# Bcast

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 2
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.97         0.99         0.98
            1         1000         2.14         2.19         2.17
            2         1000         2.17         2.19         2.18
            4         1000         2.07         2.08         2.08
            8         1000         2.08         2.16         2.12
           16         1000         2.05         2.05         2.05
           32         1000         1.96         2.01         1.99
           64         1000         2.14         2.15         2.14
          128         1000         2.07         2.11         2.09
          256         1000         2.14         2.20         2.17
          512         1000         2.56         2.58         2.57
         1024         1000         2.76         2.80         2.78
         2048         1000         3.44         3.53         3.48
         4096         1000         4.69         4.74         4.71
         8192         1000         6.94         6.99         6.96
        16384         1000        11.27        11.33        11.30
        32768         1000        18.76        18.82        18.79

It hung at this stage.

Thanks for the help.

Wee_Beng_T_
Beginner

Btw, FYI, I tested the following combinations:

1. Compiling with mpiifort, then running with a) Intel's mpirun and b) MPICH's mpirun.

Neither worked.

2. Compiling with MPICH and gfortran, then running with a) Intel's mpirun, b) MPICH's mpirun, c) Intel's mpiexec, and d) MPICH's mpiexec.

a) reported an error; b), c), and d) worked.

Arvid_I_
Beginner

I am facing the same problem as the opening poster on my local machine with Ubuntu 12.04 LTS. The which commands and mpirun -V report the same values as reported by the previous user.

The Bcast test shows exactly the same behaviour for me, but it seems the problem is more general: Allreduce also hangs in the benchmarks, and it freezes the whole system; only a reboot can recover from that state. Bcast, by contrast, only hangs the benchmark process without affecting the rest of the system.

On a Linux cluster running SUSE Linux Enterprise Server 11 SP4, with the same mpirun -V result, both MPI benchmark tests finish without problems. Before updating my Parallel Studio XE Cluster Edition to the latest release, I had no problems using Intel MPI on my local machine.

Please let me know which additional information I should provide.

Wee_Beng_T_
Beginner

Hi Arvid,

Are you still running SUSE Linux Enterprise Server 11 SP4?

Did the problem occur when you upgraded to the latest release of Parallel Studio XE Cluster Edition?

In that case, maybe I should install an older version. Which version worked for you?

Thanks.

John_D_6
New Contributor I

This is probably unrelated, but have you checked the status of the allocate statement? Did it actually succeed or not? In your case, if the allocate fails, the program will still continue (because of STAT=status) and the behaviour is unpredictable.

If you don't plan to check the status of the allocate, I suggest removing the STAT= specifier from the code. Without STAT=status, the program would be terminated by the Fortran runtime with a clear error message.

Wee_Beng_T_
Beginner

Hi John,

I just tried it; the allocation has no problem.

Thanks also for highlighting the STAT issue.

So do you mean I should either

1. include STAT and check its value,

or

2. leave out STAT and let the Fortran runtime handle the error on its own?

Is that so?

John_D_6
New Contributor I

Yes, that's basically correct. The Fortran standard has the following to say about it:

If the STAT= specifier is present, successful execution of the ALLOCATE statement causes the stat-variable to become defined with a value of zero. If an error condition occurs during the execution of the ALLOCATE statement, the stat-variable becomes defined with a processor-dependent positive integer value.

If an error condition occurs during execution of an ALLOCATE statement that does not contain the STAT= specifier, execution of the executable program is terminated.

Basically, the optional STAT= argument lets you program an alternative path in case of a memory allocation failure. Since there is no such alternative path in your code, it is better to leave it out.
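
For illustration, a minimal sketch of what such an alternative path could look like in the test program above; the MPI_ABORT fallback and the message text are just one possible choice, not something from the original code:

allocate (tmp_mpi_data1(3*no_surfaces+11*no_vertices+1), STAT=status)

if (status /= 0) then
    ! Alternative path: report the failure and shut down all ranks.
    print *, "allocation failed on rank", myid, "with stat =", status
    call MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
end if

! Or drop STAT= entirely and let the Fortran runtime terminate the
! program with its own error message if the allocation fails:
! allocate (tmp_mpi_data1(3*no_surfaces+11*no_vertices+1))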

James_T_Intel
Moderator

Try running with I_MPI_ADJUST_BCAST=1, and see if that makes a difference.
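
For example, reusing the earlier command line (setting the variable with -genv here; exporting it in the shell before calling mpirun should work just as well):

mpirun -n 2 -genv I_MPI_ADJUST_BCAST 1 -genv I_MPI_DEBUG 1000 ./a.out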

Wee_Beng_T_
Beginner

Hi,

Just tried ... same problem, it hangs there.

Arvid_I_
Beginner

@Wee Beng T

I previously used the "Initial Release" of the Parallel Studio 2016 version, which worked fine on my local machine with Ubuntu. As I said, SUSE Linux runs on a remote cluster that doesn't show any of these problems.

@James T.

Setting the environment variable produces the same result, but with one perhaps interesting difference: depending on the algorithm choice, the test hangs at a different stage.

For the algorithms selected by the values 2 (see the logfile below) and 3, the last displayed line is the 65536-byte row; all other algorithm choices freeze after 32768 bytes.

#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 4.1 Update 1, MPI-1 part    
#------------------------------------------------------------
# Date                  : Mon Dec  7 12:46:06 2015
# Machine               : x86_64
# System                : Linux
# Release               : 3.2.0-95-generic
# Version               : #135-Ubuntu SMP Tue Nov 10 13:33:29 UTC 2015
# MPI Version           : 3.0
# MPI Thread Environment:

# New default behavior from Version 3.2 on:

# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time
 


# Calling sequence was:

# IMB-MPI1 Bcast

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# Bcast

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 2
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.10         0.11         0.11
            1         1000         0.67         0.67         0.67
            2         1000         1.37         1.39         1.38
            4         1000         1.19         1.20         1.19
            8         1000         1.14         1.15         1.14
           16         1000         1.21         1.22         1.22
           32         1000         1.18         1.19         1.19
           64         1000         1.27         1.27         1.27
          128         1000         1.31         1.31         1.31
          256         1000         1.31         1.31         1.31
          512         1000         1.35         1.35         1.35
         1024         1000         1.49         1.49         1.49
         2048         1000         1.64         1.65         1.64
         4096         1000         2.02         2.02         2.02
         8192         1000         2.81         2.81         2.81
        16384         1000         4.31         4.31         4.31
        32768         1000         6.51         6.51         6.51
        65536          640        11.57        11.59        11.58

 

Wee_beng_T_1
Beginner

I just tested the different Intel MPI versions. Both the 5.1 initial release (l_mpi_p_5.1.0.038) and Update 1 (l_mpi_p_5.1.1.109) work, but Update 2 (l_mpi_p_5.1.2.150) fails, so clearly the fault lies with Update 2. My system is Ubuntu 15.10; I'm not sure if that matters. Maybe you can look into it. Thanks.

Mayada_M_
Beginner

I have the same problem.

James_T_Intel
Moderator

Please rerun with

I_MPI_DEBUG=1000
I_MPI_HYDRA_DEBUG=1

and attach the output as a file.  I'll get this submitted to our engineering team.
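
For example (a two-rank run of the same test; the output file name here is just a placeholder):

mpirun -n 2 -genv I_MPI_DEBUG 1000 -genv I_MPI_HYDRA_DEBUG 1 ./a.out > bcast_debug.txt 2>&1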

Arvid_I_
Beginner

Here's the output of the BCast test with the debug environment variables you suggested. Let me know if you need anything else.

James_T_Intel
Moderator

Thanks Arvid, I'm going to get this to our developers to see if they have any advice.

Wee_Beng_T_
Beginner

Hi,

Just saw the Intel MPI Update 3 for Linux. Has the current problem been addressed? Thanks