Hi all,
I have been puzzled by strange behaviour of the Intel MPI Library for days. When I send small messages, everything is fine. However, when I send a large message, the following code hangs.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(void){
    int length = MSG_LENGTH;      /* message size in bytes, set at compile time */
    char* buf = malloc(length);
    int size, rank;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if(rank == 0){
        /* rank 0 sends the whole buffer to rank 1 */
        MPI_Send(buf, length, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        printf("Sent\n");
    }else{
        /* rank 1 receives the buffer from rank 0 */
        MPI_Recv(buf, length, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Received\n");
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
This is the test makefile:
default: compile test-small test-large

compile:
	mpiicc mpi.c -DMSG_LENGTH=1024 -o mpi-small
	mpiicc mpi.c -DMSG_LENGTH=1048576 -o mpi-large

test-small: compile
	@echo "Testing Recv/Send with small data"
	mpiexec.hydra -n 2 ./mpi-small
	@echo "Test done"

test-large: compile
	@echo "Testing Recv/Send with large data"
	mpiexec.hydra -n 2 ./mpi-large
	@echo "Test done"
Thank you very much!
Hi Haoyan,
Could you please try to run the following test scenario and provide its output:
mpirun -n 2 IMB-MPI1 pingpong
Also please specify OS and Intel MPI Library version (for example, with 'mpirun -V').
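For reference, a minimal sketch of those two commands (this assumes the Intel MPI environment script, e.g. mpivars.sh, has already been sourced in your shell):

mpirun -V                        # print the Intel MPI Library version string
mpirun -n 2 IMB-MPI1 pingpong    # run the IMB PingPong benchmark on 2 ranks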
Artem R. (Intel) wrote:
Hi Haoyan,
Could you please try to run the following test scenario and provide its output:
mpirun -n 2 IMB-MPI1 pingpong
Also please specify OS and Intel MPI Library version (for example, with 'mpirun -V').
The test scenario gives:
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 4.1 Update 1, MPI-1 part
#------------------------------------------------------------
# Date                  : Sat Feb 20 17:04:59 2016
# Machine               : x86_64
# System                : Linux
# Release               : 3.16.0-60-generic
# Version               : #80~14.04.1-Ubuntu SMP Wed Jan 20 13:37:48 UTC 2016
# MPI Version           : 3.0
# MPI Thread Environment:

# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:
# IMB-MPI1 pingpong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.06         0.00
            1         1000         1.23         0.78
            2         1000         1.23         1.55
            4         1000         1.23         3.10
            8         1000         1.21         6.32
           16         1000         1.21        12.63
           32         1000         1.18        25.87
           64         1000         1.32        46.08
          128         1000         1.24        98.73
          256         1000         1.23       198.01
          512         1000         1.53       319.15
         1024         1000         1.64       595.09
         2048         1000         2.04       959.76
         4096         1000         2.97      1315.46
         8192         1000         4.52      1727.86
        16384         1000         8.19      1907.81
        32768         1000        15.59      2004.62
***hangs here
My MPI version is "Intel(R) MPI Library for Linux* OS, Version 5.1.2 Build 20151015 (build id: 13147)".
Hi Haoyan,
As far as I can see, you are using Ubuntu* OS 14.04.1, and the hang is a known issue (see the topic "MPI having bad performance in user mode, runs perfectly in root").
Could you please try the workaround:
export I_MPI_SHM_LMT=shm
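For example, a minimal sketch of rerunning the failing cases with the workaround in place (assuming the binaries built by the Makefile above):

export I_MPI_SHM_LMT=shm                  # select the shm large-message transfer mechanism
mpiexec.hydra -n 2 ./mpi-large            # the previously hanging test
mpiexec.hydra -n 2 IMB-MPI1 pingpong      # the benchmark should now run past 32768 bytes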
Hi Artem,
Yes, setting I_MPI_SHM_LMT solves this problem. Thank you very much!
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 4.1 Update 1, MPI-1 part
#------------------------------------------------------------
# Date                  : Sat Feb 20 17:50:21 2016
# Machine               : x86_64
# System                : Linux
# Release               : 3.16.0-60-generic
# Version               : #80~14.04.1-Ubuntu SMP Wed Jan 20 13:37:48 UTC 2016
# MPI Version           : 3.0
# MPI Thread Environment:

# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:
# IMB-MPI1 pingpong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.24         0.00
            1         1000         1.31         0.73
            2         1000         1.32         1.44
            4         1000         1.31         2.91
            8         1000         1.27         6.02
           16         1000         1.27        12.02
           32         1000         1.24        24.65
           64         1000         1.34        45.55
          128         1000         1.29        94.55
          256         1000         1.30       188.08
          512         1000         1.55       315.54
         1024         1000         1.88       520.42
         2048         1000         2.09       932.92
         4096         1000         2.97      1314.82
         8192         1000         4.47      1749.54
        16384         1000         8.05      1940.14
        32768         1000        15.92      1962.43
        65536          640        18.32      3412.26
       131072          320        33.20      3765.21
       262144          160        58.60      4266.03
       524288           80       116.81      4280.56
      1048576           40       233.65      4279.90
      2097152           20       470.89      4247.24
      4194304           10       915.41      4369.64

# All processes entering MPI_Finalize
Just to update: this problem is still not resolved as of Intel 2017 Update 1 (with Ubuntu 16.04 LTS).
Rather than hanging, however, I now get an error message after the 32768 row, as follows:
Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(224)...................: MPI_Recv(buf=0x1639880, count=65536, MPI_BYTE, src=0, tag=MPI_ANY_TAG, comm=0x84000002, status=0x7fff751dbb50) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error!  cannot read from remote process
The test passes with the workaround given above as before.
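For reference, the same variable can also be passed on the launcher command line instead of being exported beforehand (a sketch, using the mpi-large binary from the Makefile above; -genv sets an environment variable for all ranks):

mpiexec.hydra -genv I_MPI_SHM_LMT shm -n 2 ./mpi-large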