ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
MODULE_VERSION: Undefined variable.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
[14] MPI startup(): shm and ofa data transfer modes
[31] MPI startup(): shm and ofa data transfer modes
[1] MPI startup(): shm and ofa data transfer modes
[15] MPI startup(): shm and ofa data transfer modes
[18] MPI startup(): shm and ofa data transfer modes
[27] MPI startup(): shm and ofa data transfer modes
[29] MPI startup(): shm and ofa data transfer modes
[25] MPI startup(): shm and ofa data transfer modes
[30] MPI startup(): shm and ofa data transfer modes
[28] MPI startup(): shm and ofa data transfer modes
[4] MPI startup(): shm and ofa data transfer modes
[11] MPI startup(): shm and ofa data transfer modes
[6] MPI startup(): shm and ofa data transfer modes
[7] MPI startup(): shm and ofa data transfer modes
[20] MPI startup(): shm and ofa data transfer modes
[19] MPI startup(): shm and ofa data transfer modes
[13] MPI startup(): shm and ofa data transfer modes
[12] MPI startup(): shm and ofa data transfer modes
[16] MPI startup(): shm and ofa data transfer modes
[21] MPI startup(): shm and ofa data transfer modes
[24] MPI startup(): shm and ofa data transfer modes
[5] MPI startup(): shm and ofa data transfer modes
[22] MPI startup(): shm and ofa data transfer modes
[26] MPI startup(): shm and ofa data transfer modes
[0] MPI startup(): shm and ofa data transfer modes
[8] MPI startup(): shm and ofa data transfer modes
[10] MPI startup(): shm and ofa data transfer modes
[2] MPI startup(): shm and ofa data transfer modes
[9] MPI startup(): shm and ofa data transfer modes
[23] MPI startup(): shm and ofa data transfer modes
[17] MPI startup(): shm and ofa data transfer modes
[3] MPI startup(): shm and ofa data transfer modes
[0] MPI startup(): Rank    Pid      Node name       Pin cpu
[0] MPI startup(): 0       4712     pershing-n0202  {0,16}
[0] MPI startup(): 1       4713     pershing-n0202  {1,17}
[0] MPI startup(): 2       4714     pershing-n0202  {2,18}
[0] MPI startup(): 3       4715     pershing-n0202  {3,19}
[0] MPI startup(): 4       4716     pershing-n0202  {4,20}
[0] MPI startup(): 5       4717     pershing-n0202  {5,21}
[0] MPI startup(): 6       4718     pershing-n0202  {6,22}
[0] MPI startup(): 7       4719     pershing-n0202  {7,23}
[0] MPI startup(): 8       4720     pershing-n0202  {8,24}
[0] MPI startup(): 9       4721     pershing-n0202  {9,25}
[0] MPI startup(): 10      4722     pershing-n0202  {10,26}
[0] MPI startup(): 11      4723     pershing-n0202  {11,27}
[0] MPI startup(): 12      4724     pershing-n0202  {12,28}
[0] MPI startup(): 13      4725     pershing-n0202  {13,29}
[0] MPI startup(): 14      4726     pershing-n0202  {14,30}
[0] MPI startup(): 15      4727     pershing-n0202  {15,31}
[0] MPI startup(): 16      4663     pershing-n0203  {0,16}
[0] MPI startup(): 17      4664     pershing-n0203  {1,17}
[0] MPI startup(): 18      4665     pershing-n0203  {2,18}
[0] MPI startup(): 19      4666     pershing-n0203  {3,19}
[0] MPI startup(): 20      4667     pershing-n0203  {4,20}
[0] MPI startup(): 21      4668     pershing-n0203  {5,21}
[0] MPI startup(): 22      4669     pershing-n0203  {6,22}
[0] MPI startup(): 23      4670     pershing-n0203  {7,23}
[0] MPI startup(): 24      4671     pershing-n0203  {8,24}
[0] MPI startup(): 25      4672     pershing-n0203  {9,25}
[0] MPI startup(): 26      4673     pershing-n0203  {10,26}
[0] MPI startup(): 27      4674     pershing-n0203  {11,27}
[0] MPI startup(): 28      4675     pershing-n0203  {12,28}
[0] MPI startup(): 29      4676     pershing-n0203  {13,29}
[0] MPI startup(): 30      4677     pershing-n0203  {14,30}
[0] MPI startup(): 31      4678     pershing-n0203  {15,31}
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_FABRICS=shm:ofa
[0] MPI startup(): I_MPI_PIN_MAPPING=16:0 0,1 1,2 2,3 3,4 4,5 5,6 6,7 7,8 8,9 9,10 10,11 11,12 12,13 13,14 14,15 15
[0] MPI startup(): I_MPI_WAIT_MODE=enable
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2.3, MPI-1 part
#---------------------------------------------------
# Date                  : Wed Jan 2 10:12:00 2013
# Machine               : x86_64
# System                : Linux
# Release               : 2.6.32-220.el6.x86_64
# Version               : #1 SMP Wed Nov 9 08:03:13 EST 2011
# MPI Version           : 2.1
# MPI Thread Environment:

# New default behavior from Version 3.2 on:

# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:

# ./IMB-MPI1.4.0.3

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.34         0.00
            1         1000         0.39         2.45
            2         1000         0.36         5.25
            4         1000         0.36        10.50
            8         1000         0.46        16.73
           16         1000         0.40        38.44
           32         1000         1.00        30.37
           64         1000         0.93        65.63
          128         1000         1.00       122.49
          256         1000         1.07       227.76
          512         1000         1.13       431.34
         1024         1000         1.35       724.44
         2048         1000         1.34      1461.81
         4096         1000         1.90      2057.00
         8192         1000         3.01      2598.06
        16384         1000         4.47      3491.99
        32768         1000         6.74      4638.98
        65536          640        12.30      5079.69
       131072          320        18.87      6622.54
       262144          160        31.95      7824.83
       524288           80        55.49      9010.08
      1048576           40       104.50      9569.21
      2097152           20       205.22      9745.41
      4194304           10       407.10      9825.60

#---------------------------------------------------
# Benchmarking PingPing
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.51         0.00
            1         1000         0.55         1.74
            2         1000         0.54         3.51
            4         1000         0.56         6.86
            8         1000         0.55        13.85
           16         1000         0.55        27.90
           32         1000         0.55        55.68
           64         1000         0.55       110.34
          128         1000         0.60       204.80
          256         1000         0.61       400.78
          512         1000         0.68       717.84
         1024         1000         1.04       941.83
         2048         1000         0.91      2155.79
         4096         1000         1.24      3147.14
         8192         1000         1.89      4122.80
        16384         1000         3.66      4264.45
        32768         1000         5.89      5310.00
        65536          640        22.91      2727.78
       131072          320        37.17      3363.31
       262144          160        63.76      3921.19
       524288           80       113.85      4391.71
      1048576           40       208.63      4793.22
      2097152           20       409.70      4881.64
      4194304           10       805.09      4968.38

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         0.47         0.47         0.47         0.00
            1         1000         0.51         0.51         0.51         3.73
            2         1000         0.52         0.52         0.52         7.35
            4         1000         0.50         0.50         0.50        15.26
            8         1000         0.52         0.52         0.52        29.12
           16         1000         0.50         0.50         0.50        60.92
           32         1000         0.51         0.51         0.51       120.81
           64         1000         0.52         0.52         0.52       235.73
          128         1000         0.54         0.54         0.54       453.10
          256         1000         0.56         0.56         0.56       865.60
          512         1000         0.75         0.76         0.76      1292.11
         1024         1000         0.71         0.71         0.71      2747.15
         2048         1000         0.86         0.86         0.86      4520.97
         4096         1000         1.22         1.22         1.22      6398.75
         8192         1000         1.88         1.88         1.88      8289.40
        16384         1000         3.21         3.21         3.21      9728.49
        32768         1000         5.77         5.77         5.77     10824.79
        65536          640        21.94        21.94        21.94      5697.23
       131072          320        36.44        36.45        36.45      6859.32
       262144          160        63.58        63.61        63.60      7860.02
       524288           80       112.87       112.92       112.90      8855.52
      1048576           40       208.08       208.20       208.14      9606.19
      2097152           20       408.40       408.55       408.48      9790.63
      4194304           10       803.30       803.49       803.40      9956.51

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         3.04         3.04         3.04         0.00
            1         1000         3.18         3.18         3.18         0.60
            2         1000         3.31         3.31         3.31         1.15
            4         1000         3.55         3.55         3.55         2.15
            8         1000         2.90         2.90         2.90         5.26
           16         1000         3.33         3.33         3.33         9.17
           32         1000         3.41         3.41         3.41        17.90
           64         1000         3.22         3.22         3.22        37.90
          128         1000         3.44         3.45         3.45        70.81
          256         1000         3.63         3.63         3.63       134.44
          512         1000         3.34         3.34         3.34       292.03
         1024         1000         3.89         3.89         3.89       501.68
         2048         1000         2.43         2.55         2.52      1532.50
         4096         1000         1.84         1.84         1.84      4234.69
         8192         1000         2.15         2.15         2.15      7277.73
        16384         1000         4.09         4.09         4.09      7642.24
        32768         1000         7.22         7.22         7.22      8660.19
        65536          640        20.92        20.92        20.92      5975.00
       131072          320        35.76        35.78        35.77      6987.60
       262144          160        62.69        62.76        62.72      7966.58
       524288           80       113.99       114.29       114.20      8749.75
      1048576           40       214.07       214.17       214.12      9338.31
      2097152           20       420.75       421.00       420.87      9501.20
      4194304           10      1290.92      1297.40      1294.18      6166.17

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         3.48         3.48         3.48         0.00
            1         1000         3.52         3.52         3.52         0.54
            2         1000         3.78         3.79         3.78         1.01
            4         1000         3.84         3.84         3.84         1.98
            8         1000         3.45         3.46         3.46         4.41
           16         1000         3.61         3.62         3.62         8.43
           32         1000         3.62         3.62         3.62        16.86
           64         1000         3.35         3.35         3.35        36.39
          128         1000         3.33         3.34         3.34        73.16
          256         1000         3.82         3.82         3.82       127.69
          512         1000         3.45         3.57         3.49       273.78
         1024         1000         4.00         4.00         4.00       488.40
         2048         1000         4.05         4.06         4.05       963.08
         4096         1000         4.27         4.27         4.27      1828.78
         8192         1000         4.77         4.77         4.77      3272.87
        16384         1000         6.21         6.22         6.21      5025.77
        32768         1000         9.28         9.29         9.28      6730.62
        65536          640        22.52        22.55        22.54      5543.62
       131072          320        38.05        38.13        38.07      6557.31
       262144          160        65.86        65.97        65.91      7578.64
       524288           80       122.93       125.13       123.52      7992.00
      1048576           40       242.32       245.65       244.28      8141.71
      2097152           20       968.25       975.00       972.23      4102.56
      4194304           10      2495.10      2575.52      2536.15      3106.17

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         6.73         6.88         6.84         0.00
            1         1000         6.83         6.87         6.86         0.28
            2         1000         6.50         6.63         6.53         0.58
            4         1000         6.74         6.87         6.82         1.11
            8         1000         6.70         6.72         6.71         2.27
           16         1000         6.54         6.67         6.62         4.58
           32         1000         6.76         6.79         6.78         8.99
           64         1000         6.40         6.49         6.46        18.80
          128         1000         6.87         7.01         6.96        34.81
          256         1000         6.48         6.50         6.49        75.14
          512         1000         6.18         6.21         6.20       157.23
         1024         1000         6.77         6.79         6.78       287.73
         2048         1000         7.35         7.48         7.38       522.30
         4096         1000         7.84         7.94         7.91       983.94
         8192         1000         9.51         9.52         9.52      1641.15
        16384         1000        12.13        12.16        12.14      2570.74
        32768         1000        17.41        17.46        17.43      3579.39
        65536          640        37.18        37.44        37.30      3338.75
       131072          320        62.17        62.67        62.39      3989.02
       262144          160       112.40       114.29       113.38      4374.93
       524288           80       211.81       218.69       215.32      4572.76
      1048576           40       332.93       349.47       342.04      5722.89
      2097152           20      1169.94      1233.49      1209.45      3242.82
      4194304           10      2417.21      2680.71      2591.55      2984.29

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 32
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         6.16         6.38         6.30         0.00
            1         1000         6.30         6.43         6.40         0.30
            2         1000         6.45         6.48         6.47         0.59
            4         1000         6.09         6.27         6.20         1.22
            8         1000         6.36         6.40         6.39         2.38
           16         1000         6.48         6.52         6.49         4.68
           32         1000         6.45         6.58         6.47         9.27
           64         1000         6.04         6.40         6.23        19.08
          128         1000         6.48         6.74         6.63        36.21
          256         1000         6.13         6.32         6.20        77.29
          512         1000         6.03         6.16         6.12       158.43
         1024         1000         6.57         6.74         6.69       289.91
         2048         1000         6.70         6.86         6.77       569.68
         4096         1000         7.48         7.53         7.51      1038.08
         8192         1000         8.40         8.66         8.58      1803.66
        16384         1000        12.91        13.25        13.05      2359.19
        32768         1000        30.34        30.70        30.54      2036.02
        65536          640        45.26        45.81        45.47      2728.42
       131072          320        79.64        81.28        80.40      3075.62
       262144          160       135.83       141.48       138.49      3534.05
       524288           80       231.40       248.96       239.43      4016.67
      1048576           40       404.75       465.32       433.55      4298.10
      2097152           20      1043.40      1239.80      1163.93      3226.33
      4194304           10      2229.91      2726.98      2507.48      2933.64

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         1.15         1.15         1.15         0.00
            1         1000         1.25         1.25         1.25         3.04
            2         1000         1.25         1.25         1.25         6.11
            4         1000         1.28         1.28         1.28        11.92
            8         1000         1.21         1.22         1.22        25.10
           16         1000         1.22         1.22         1.22        49.90
           32         1000         1.22         1.22         1.22       100.31
           64         1000         1.23         1.23         1.23       199.14
          128         1000         1.27         1.27         1.27       384.46
          256         1000         1.31         1.31         1.31       746.63
          512         1000         1.56         1.56         1.56      1253.75
         1024         1000         1.60         1.60         1.60      2435.20
         2048         1000         1.85         1.85         1.85      4216.16
         4096         1000         2.55         2.55         2.55      6127.15
         8192         1000         3.72         3.72         3.72      8403.13
        16384         1000         6.22         6.22         6.22     10051.15
        32768         1000        11.13        11.13        11.13     11233.94
        65536          640        44.50        44.51        44.50      5617.21
       131072          320        74.08        74.08        74.08      6749.09
       262144          160       127.30       127.33       127.31      7853.58
       524288           80       225.41       225.45       225.43      8871.20
      1048576           40       420.80       421.01       420.90      9501.06
      2097152           20       810.75       811.10       810.93      9863.15
      4194304           10      1895.69      1896.00      1895.84      8438.82

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         4.82         4.82         4.82         0.00
            1         1000         4.89         4.89         4.89         0.78
            2         1000         4.99         4.99         4.99         1.53
            4         1000         5.15         5.16         5.16         2.96
            8         1000         4.82         4.82         4.82         6.33
           16         1000         4.94         4.94         4.94        12.35
           32         1000         5.10         5.10         5.10        23.94
           64         1000         4.50         4.50         4.50        54.24
          128         1000         4.70         4.70         4.70       103.80
          256         1000         4.55         4.55         4.55       214.72
          512         1000         4.50         4.51         4.50       433.53
         1024         1000         4.93         4.93         4.93       792.19
         2048         1000         4.64         4.64         4.64      1684.04
         4096         1000         2.75         2.75         2.75      5680.01
         8192         1000         4.01         4.01         4.01      7785.22
        16384         1000         6.82         6.83         6.83      9154.99
        32768         1000        11.81        11.90        11.86     10499.83
        65536          640        44.98        44.99        44.98      5557.12
       131072          320        74.59        74.61        74.60      6701.30
       262144          160       127.02       127.08       127.05      7869.33
       524288           80       230.21       230.30       230.26      8684.42
      1048576           40       428.83       429.00       428.93      9324.04
      2097152           20      1024.04      1024.70      1024.39      7807.17
      4194304           10      2812.41      2819.40      2815.90      5674.98

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         6.42         6.43         6.42         0.00
            1         1000         6.32         6.34         6.33         0.60
            2         1000         6.45         6.45         6.45         1.18
            4         1000         6.43         6.44         6.44         2.37
            8         1000         6.37         6.39         6.39         4.78
           16         1000         6.53         6.53         6.53         9.34
           32         1000         5.94         5.95         5.94        20.53
           64         1000         6.44         6.55         6.49        37.27
          128         1000         6.06         6.07         6.07        80.46
          256         1000         6.20         6.20         6.20       157.48
          512         1000         6.39         6.40         6.39       305.36
         1024         1000         6.27         6.29         6.29       620.84
         2048         1000         6.55         6.55         6.55      1192.56
         4096         1000         6.38         6.41         6.40      2438.73
         8192         1000         6.58         6.58         6.58      4747.09
        16384         1000         8.04         8.04         8.04      7773.68
        32768         1000        12.48        12.49        12.49     10009.70
        65536          640        49.08        49.12        49.10      5089.67
       131072          320        81.47        81.61        81.57      6126.76
       262144          160       136.44       137.07       136.84      7295.31
       524288           80       259.38       260.24       259.88      7685.22
      1048576           40       672.58       674.38       673.75      5931.38
      2097152           20      2093.95      2170.75      2140.14      3685.36
      4194304           10      4793.10      4978.49      4893.63      3213.83

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        14.80        14.82        14.81         0.00
            1         1000        14.80        14.83        14.82         0.26
            2         1000        14.19        14.27        14.25         0.53
            4         1000        14.49        14.51        14.50         1.05
            8         1000        14.51        14.53        14.52         2.10
           16         1000        14.08        14.10        14.09         4.33
           32         1000        14.60        14.63        14.62         8.35
           64         1000        13.72        13.76        13.74        17.75
          128         1000        13.57        13.61        13.59        35.87
          256         1000        13.81        13.84        13.82        70.57
          512         1000        13.54        13.74        13.71       142.19
         1024         1000        13.86        14.02        13.91       278.64
         2048         1000        13.91        13.96        13.94       559.72
         4096         1000        14.89        14.92        14.91      1047.25
         8192         1000        15.52        15.53        15.53      2011.60
        16384         1000        19.80        19.84        19.82      3150.20
        32768         1000        29.20        29.35        29.29      4258.94
        65536          640        68.46        68.62        68.55      3643.32
       131072          320       109.18       109.66       109.42      4559.58
       262144          160       171.57       173.62       172.50      5759.73
       524288           80       360.47       367.50       364.33      5442.16
      1048576           40       812.60       847.40       826.72      4720.33
      2097152           20      2227.04      2320.55      2274.05      3447.46
      4194304           10      4854.89      5493.19      5174.68      2912.70

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 32
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        12.34        12.50        12.45         0.00
            1         1000        12.53        12.66        12.63         0.30
            2         1000        12.85        12.87        12.86         0.59
            4         1000        12.04        12.19        12.14         1.25
            8         1000        12.11        12.25        12.16         2.49
           16         1000        12.77        12.87        12.84         4.74
           32         1000        12.05        12.20        12.09        10.00
           64         1000        11.87        12.03        11.98        20.30
          128         1000        12.12        12.18        12.15        40.10
          256         1000        12.33        12.44        12.38        78.52
          512         1000        11.96        12.02        11.99       162.53
         1024         1000        12.16        12.24        12.20       319.03
         2048         1000        13.40        13.46        13.43       580.25
         4096         1000        14.89        15.08        15.03      1036.21
         8192         1000        17.79        17.89        17.84      1746.39
        16384         1000        26.01        26.16        26.08      2389.23
        32768         1000        49.96        50.31        50.12      2484.65
        65536          640        79.90        80.34        80.15      3111.75
       131072          320       137.16       139.11       138.26      3594.28
       262144          160       224.13       229.18       226.83      4363.36
       524288           80       402.00       421.01       411.45      4750.46
      1048576           40       822.45       910.68       865.74      4392.34
      2097152           20      2115.05      2328.65      2219.06      3435.47
      4194304           10      4458.00      5419.40      4982.49      2952.36

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.06         0.06         0.06
            4         1000         0.73         0.73         0.73
            8         1000         0.73         0.73         0.73
           16         1000         0.73         0.73         0.73
           32         1000         0.71         0.72         0.71
           64         1000         0.73         0.73         0.73
          128         1000         0.77         0.77         0.77
          256         1000         0.83         0.83         0.83
          512         1000         0.89         0.89         0.89
         1024         1000         0.98         0.98         0.98
         2048         1000         1.22         1.22         1.22
         4096         1000         1.72         1.72         1.72
         8192         1000         3.33         3.33         3.33
        16384         1000         5.31         5.31         5.31
        32768         1000         9.68         9.68         9.68
        65536          640        17.73        17.73        17.73
       131072          320        62.28        62.29        62.29
       262144          160       106.48       106.49       106.48
       524288           80       188.15       188.16       188.16
      1048576           40       348.07       348.13       348.10
      2097152           20       675.95       676.41       676.18
      4194304           10      1501.68      1502.11      1501.89

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.06         0.06         0.06
            4         1000         4.22         4.22         4.22
            8         1000         5.17         5.17         5.17
           16         1000         5.20         5.20         5.20
           32         1000         5.52         5.52         5.52
           64         1000         4.44         4.44         4.44
          128         1000         4.62         4.63         4.62
          256         1000         4.84         4.84         4.84
          512         1000         4.71         4.71         4.71
         1024         1000         5.39         5.39         5.39
         2048         1000         4.87         4.87         4.87
         4096         1000         4.87         4.87         4.87
         8192         1000        10.80        10.80        10.80
        16384         1000        12.84        12.85        12.84
        32768         1000        18.96        18.96        18.96
        65536          640        30.89        30.89        30.89
       131072          320        83.83        83.84        83.84
       262144          160       165.09       165.11       165.10
       524288           80       283.72       283.79       283.75
      1048576           40       515.03       515.22       515.14
      2097152           20      1124.35      1124.60      1124.45
      4194304           10      3554.70      3560.59      3557.54

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.06         0.06         0.06
            4         1000        10.93        10.94        10.94
            8         1000        12.09        12.09        12.09
           16         1000        12.05        12.05        12.05
           32         1000        12.03        12.03        12.03
           64         1000        11.62        11.63        11.62
          128         1000        12.02        12.03        12.03
          256         1000        11.88        11.88        11.88
          512         1000        11.88        11.88        11.88
         1024         1000        12.36        12.37        12.36
         2048         1000         9.84         9.84         9.84
         4096         1000        13.41        13.42        13.41
         8192         1000        22.01        22.01        22.01
        16384         1000        25.54        25.54        25.54
        32768         1000        31.18        31.19        31.19
        65536          640        44.83        44.83        44.83
       131072          320       104.98       105.00       104.99
       262144          160       197.80       197.97       197.89
       524288           80       356.21       356.60       356.40
      1048576           40       867.62       869.15       868.70
      2097152           20      2945.24      2951.49      2948.06
      4194304           10      6865.22      6925.70      6896.83

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.06         0.06         0.06
            4         1000        39.58        39.58        39.58
            8         1000        39.38        39.39        39.39
           16         1000        39.30        39.31        39.31
           32         1000        39.32        39.45        39.39
           64         1000        38.84        38.97        38.94
          128         1000        37.95        37.97        37.96
          256         1000        37.06        37.08        37.07
          512         1000        37.25        37.27        37.26
         1024         1000        37.86        37.87        37.87
         2048         1000        38.40        38.42        38.41
         4096         1000        41.68        41.69        41.68
         8192         1000        68.09        68.11        68.10
        16384         1000        73.50        73.51        73.51
        32768         1000        81.90        81.90        81.90
        65536          640        94.29        94.31        94.30
       131072          320       160.52       160.94       160.64
       262144          160       281.89       281.95       281.92
       524288           80       478.70       478.88       478.79
      1048576           40      1076.38      1077.10      1076.71
      2097152           20      3262.25      3271.56      3266.08
      4194304           10      8877.11      8969.19      8929.75

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 32
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            4         1000        29.42        29.43        29.43
            8         1000        30.20        30.20        30.20
           16         1000        53.03        53.05        53.04
           32         1000        31.14        31.14        31.14
           64         1000        53.37        53.39        53.38
          128         1000        32.29        32.29        32.29
          256         1000        33.46        33.46        33.46
          512         1000        33.42        33.42        33.42
         1024         1000        35.86        35.96        35.94
         2048         1000        94.75        94.75        94.75
         4096         1000        98.96        98.97        98.97
         8192         1000       103.22       103.35       103.29
        16384         1000       108.97       109.07       108.99
        32768         1000       123.93       123.94       123.93
        65536          640       153.35       153.37       153.36
       131072          320       253.89       254.26       253.93
       262144          160       404.70       404.94       404.77
       524288           80       679.73       681.55       680.16
      1048576           40      1971.63      1994.03      1984.77
      2097152           20      4645.99      4707.91      4690.77
      4194304           10     11416.48     11651.61     11587.21

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            4         1000         1.36         1.36         1.36
            8         1000         1.23         1.23         1.23
           16         1000         1.36         1.36         1.36
           32         1000         1.27         1.27         1.27
           64         1000         1.34         1.34         1.34
          128         1000         1.38         1.38         1.38
          256         1000         1.50         1.50         1.50
          512         1000         1.42         1.42         1.42
         1024         1000         1.79         1.79         1.79
         2048         1000         2.19         2.19         2.19
         4096         1000         2.45         2.45         2.45
         8192         1000         5.36         5.36         5.36
        16384         1000         7.77         7.77         7.77
        32768         1000        11.96        11.96        11.96
        65536          640        21.73        21.74        21.74
       131072          320        52.31        52.32        52.32
       262144          160        88.54        88.55        88.55
       524288           80       104.20       104.75       104.48
      1048576           40       194.55       199.37       196.96
      2097152           20       367.61       375.00       371.30
      4194304           10       809.60       844.81       827.21

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            4         1000         3.42         3.53         3.50
            8         1000         3.33         3.34         3.34
           16         1000         3.29         3.29         3.29
           32         1000         3.48         3.48         3.48
           64         1000         3.38         3.38         3.38
          128         1000         3.46         3.46         3.46
          256         1000         3.74         3.74         3.74
          512         1000         3.78         3.79         3.78
         1024         1000         3.42         3.42         3.42
         2048         1000         4.00         4.00         4.00
         4096         1000         5.06         5.07         5.07
         8192         1000         9.89         9.89         9.89
        16384         1000        13.60        13.61        13.61
        32768         1000        20.40        20.40        20.40
        65536          640        33.68        33.69        33.68
       131072          320        71.33        71.38        71.35
       262144          160       130.70       130.83       130.76
       524288           80       170.81       172.14       171.77
      1048576           40       325.60       332.98       331.07
      2097152           20       731.05       750.20       745.26
      4194304           10      1717.62      1788.71      1770.81

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            4         1000         3.97         3.98         3.97
            8         1000         3.46         3.47         3.47
           16         1000         3.55         3.56         3.56
           32         1000         3.95         3.96         3.96
           64         1000         4.32         4.34         4.33
          128         1000         4.29         4.29         4.29
          256         1000         4.51         4.52         4.52
          512         1000         4.82         4.83         4.83
         1024         1000         5.63         5.63         5.63
         2048         1000         6.51         6.51         6.51
         4096         1000         8.11         8.11         8.11
         8192         1000        17.09        17.10        17.09
        16384         1000        21.39        21.40        21.39
        32768         1000        28.88        28.89        28.89
        65536          640        44.80        44.98        44.89
       131072          320        87.56        87.63        87.58
       262144          160       163.76       163.98       163.84
       524288           80       199.50       202.07       201.54
      1048576           40       487.63       495.15       493.83
      2097152           20      1351.55      1394.55      1386.13
      4194304           10      2731.80      2858.69      2835.16

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            4         1000         5.31         5.36         5.33
            8         1000         5.46         5.48         5.47
           16         1000         6.24         6.26         6.25
           32         1000         6.43         6.48         6.45
           64         1000         7.07         7.12         7.11
          128         1000         6.38         6.40         6.39
          256         1000         6.74         6.76         6.75
          512         1000         7.25         7.37         7.35
         1024         1000         8.14         8.15         8.14
         2048         1000        10.34        10.36        10.35
         4096         1000        14.21        14.31        14.26
         8192         1000        44.28        44.31        44.29
        16384         1000        47.27        47.28        47.27
        32768         1000        54.42        54.44        54.42
        65536          640        72.15        72.18        72.16
       131072          320       124.62       124.72       124.65
       262144          160       217.51       218.54       218.02
       524288           80       280.40       285.21       283.32
      1048576           40       628.90       648.47       643.64
      2097152           20      1634.05      1911.06      1854.29
      4194304           10      3274.61      3695.70      3573.98

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 32
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.08         0.10         0.08
            4         1000         7.71         7.88         7.81
            8         1000         8.27         8.45         8.39
           16         1000         7.86         8.04         7.96
           32         1000         7.97         8.16         8.09
           64         1000         8.63         8.81         8.74
          128         1000         8.87         9.02         8.96
          256         1000         8.95         9.22         9.14
          512         1000         9.76         9.93         9.86
         1024         1000        10.40        10.64        10.55
         2048         1000        12.05        12.30        12.20
         4096         1000        15.12        15.50        15.36
         8192         1000        21.19        21.68        21.49
        16384         1000        67.49        67.55        67.51
        32768         1000        35.82        36.24        36.06
        65536          640        60.14        60.63        60.44
       131072          320       174.35       174.59       174.46
       262144          160       196.24       201.99       199.55
       524288           80       370.50       379.85       375.51
      1048576           40       686.87       739.55       726.28
      2097152           20      1995.60      2188.40      2138.67
      4194304           10      3187.01      4041.22      3875.25

#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.12         0.12         0.12
            4         1000         0.76         0.87         0.82
            8         1000         1.00         1.00         1.00
           16         1000         1.01         1.01         1.01
           32         1000         1.14         1.14         1.14
           64         1000         1.00         1.00         1.00
          128         1000         1.06         1.06         1.06
          256         1000         1.04         1.04         1.04
          512         1000         1.14         1.14         1.14
         1024         1000         1.27         1.27         1.27
         2048         1000         1.40         1.40         1.40
         4096         1000         1.74         1.74         1.74
         8192         1000         2.49         2.49         2.49
        16384         1000         4.06         4.06         4.06
        32768         1000         7.08         7.08         7.08
        65536          640        13.34        13.34        13.34
       131072          320        44.23        44.25        44.24
       262144          160        77.79        77.79        77.79
       524288           80       159.67       159.70       159.69
      1048576           40       195.85       195.90       195.88
      2097152           20       561.36       562.20       561.78
      4194304           10      1181.98      1184.70      1183.34

#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.12         0.13         0.12
            4         1000         1.50         2.01         1.75
            8         1000         1.58         2.01         1.80
           16         1000         4.82         4.82         4.82
           32         1000         4.31         4.31         4.31
           64         1000         4.30         4.30         4.30
          128         1000         5.27         5.28         5.27
          256         1000         5.83         5.83         5.83
          512         1000         5.61         5.61         5.61
         1024         1000         5.66         5.66         5.66
         2048         1000         5.54         5.54         5.54
         4096         1000         5.68         5.69         5.69
         8192         1000         6.00         6.00         6.00
        16384         1000         8.42         8.42         8.42
        32768         1000        12.44        12.44        12.44
        65536          640        20.30        20.31        20.31
       131072          320        55.67        55.68        55.67
       262144          160       106.68       106.71       106.70
       524288           80       232.16       232.51       232.37
      1048576           40       278.30       278.45       278.37
      2097152           20       946.65       952.95       950.35
      4194304           10      2300.91      2330.80      2318.48

#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.13         0.13         0.13
            4         1000         2.97         4.52         3.63
            8         1000         2.94         3.85         3.35
           16         1000         0.77         9.94         5.39
           32         1000        10.01        10.01        10.01
           64         1000        12.38        12.38        12.38
          128         1000        12.28        12.28        12.28
          256         1000        12.56        12.56        12.56
          512         1000        11.71        11.73        11.72
         1024         1000        12.58        12.58        12.58
         2048         1000        12.63        12.64        12.64
         4096         1000        12.75        12.76        12.76
         8192         1000        13.55        13.55        13.55
        16384         1000        14.70        14.70        14.70
        32768         1000        18.85        18.85        18.85
        65536          640        27.53        27.53        27.53
       131072          320        65.55        65.56        65.55
       262144          160       121.92       122.66       122.29
       524288           80       263.75       264.33       264.07
      1048576           40       391.45       392.43       391.89
      2097152           20      1835.55      1850.10      1844.04
      4194304           10      3524.90      3572.80      3551.68

#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.14         0.15         0.15
            4         1000         6.08        11.13         7.99
            8         1000         4.62        17.53        11.23
           16         1000         2.54        26.68         8.94
           32         1000         1.12        16.43         8.82
           64         1000        34.48        34.50        34.49
          128         1000        38.42        38.42        38.42
          256         1000        36.73        36.76        36.75
          512         1000        36.61        36.64        36.62
         1024         1000        36.78        36.79        36.79
         2048         1000        35.41        35.41        35.41
         4096         1000        36.03        36.04        36.04
         8192         1000        36.22        36.22        36.22
        16384         1000        38.94        38.94        38.94
        32768         1000        42.58        42.61        42.60
        65536          640        52.68        52.71        52.69
       131072          320        94.40        94.42        94.41
       262144          160       159.87       160.62       160.06
       524288           80       454.35       455.44       454.84
      1048576           40       814.53       815.27       814.86
      2097152           20      2336.45      2356.15      2346.72
      4194304           10      4689.10      4762.82      4728.05

#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 32
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.17         0.20         0.18
            4         1000         1.47         4.88         2.76
            8         1000        15.23        15.51        15.38
           16         1000        15.25        15.54        15.41
           32         1000         1.59        19.72        11.23
           64         1000        16.83        20.99        18.92
          128         1000        45.90        46.03        46.01
          256         1000        55.79        55.93        55.83
          512         1000        56.38        56.42        56.39
         1024         1000        54.58        54.61        54.60
         2048         1000        54.40        54.42        54.41
         4096         1000        56.69        56.80        56.75
         8192         1000        58.44        58.58        58.55
        16384         1000        61.91        61.92        61.92
        32768         1000        70.29        70.32        70.31
        65536          640        86.76        86.80        86.78
       131072          320       138.86       139.00       138.93
       262144          160       216.83       217.93       217.41
       524288           80       724.05       725.67       724.80
      1048576           40      1800.07      1803.92      1801.34
      2097152           20      2734.99      2808.65      2753.51
      4194304           10      4960.99      5137.71      5013.92

#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            1         1000         0.71         0.71         0.71
            2         1000         0.77         0.77         0.77
            4         1000         0.74         0.74         0.74
            8         1000         0.77         0.77         0.77
           16         1000         0.72         0.72         0.72
           32         1000         0.71         0.71         0.71
           64         1000         0.74         0.74         0.74
          128         1000         0.77         0.77         0.77
          256         1000         0.80         0.80         0.80
          512         1000         0.85         0.86         0.86
         1024         1000         0.94         0.94         0.94
         2048         1000         1.11         1.11         1.11
         4096         1000         1.61         1.61         1.61
         8192         1000         2.50         2.50         2.50
        16384         1000         4.17         4.17         4.17
        32768         1000         7.41         7.41         7.41
        65536          640        25.83        25.83        25.83
       131072          320        47.32        47.32        47.32
       262144          160        81.80        81.81        81.81
       524288           80       151.12       151.19       151.16
      1048576           40       289.77       289.85       289.81
      2097152           20       571.36       571.49       571.42
      4194304           10      1907.40      1911.38      1909.39

#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            1         1000         4.13         4.14         4.14
            2         1000         3.98         3.98         3.98
            4         1000         4.81         4.82         4.82
            8         1000         5.16         5.16         5.16
           16         1000         4.26         4.26         4.26
           32         1000         3.52         3.52         3.52
           64         1000         3.29         3.30         3.30
          128         1000         4.65         4.65         4.65
          256         1000         5.03         5.03         5.03
          512         1000         4.96         4.96         4.96
         1024         1000         5.28         5.28         5.28
         2048         1000         4.01         4.01         4.01
         4096         1000         4.40         4.40         4.40
         8192         1000         8.08         8.08         8.08
        16384         1000        12.69        12.80        12.72
        32768         1000        33.57        33.59        33.58
        65536          640        73.44        73.44        73.44
       131072          320       121.18       121.20       121.19
       262144          160       210.82       210.88       210.85
       524288           80       380.72       380.83       380.78
      1048576           40      1011.40      1012.65      1012.06
      2097152           20      3158.30      3161.86      3159.90
      4194304           10      6139.80      6146.88      6143.30

#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            1         1000         9.95         9.95         9.95
            2         1000        10.34        10.35        10.34
            4         1000        11.75        11.75        11.75
            8         1000        11.39        11.39        11.39
           16         1000        11.25        11.36        11.27
           32         1000        11.33        11.34        11.33
           64         1000        10.81        10.81        10.81
          128         1000        11.62        11.62        11.62
          256         1000        11.47        11.47        11.47
          512         1000        11.82        11.93        11.87
         1024         1000        12.24        12.25        12.24
         2048         1000        13.36        13.37        13.37
         4096         1000        14.92        14.93        14.92
         8192         1000        19.25        19.26        19.26
        16384         1000        44.93        44.93        44.93
        32768         1000        82.77        82.78        82.78
        65536          640       174.47       174.48       174.47
       131072          320       277.95       278.09       278.03
       262144          160       613.72       613.99       613.82
       524288           80      1821.88      1823.11      1822.40
      1048576           40      4159.65      4165.77      4162.63
      2097152           20      9511.21      9530.15      9521.38
      4194304           10     19709.49     19730.09     19719.31

#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            1         1000        32.76        32.77        32.77
            2         1000        37.09        37.09        37.09
            4         1000        36.92        36.94        36.92
            8         1000        36.38        36.39        36.38
           16         1000        34.80        34.81        34.80
           32         1000        35.37        35.38        35.37
           64         1000        34.90        34.91        34.90
          128         1000        35.08        35.20        35.11
          256         1000        36.90        36.92        36.91
          512         1000        39.78        39.80        39.79
         1024         1000        45.75        45.77        45.76
         2048         1000        48.56        48.57        48.56
         4096         1000        56.55        56.56        56.55
         8192         1000       126.88       126.93       126.91
        16384         1000       207.07       207.13       207.10
        32768         1000       326.47       326.57       326.53
        65536          640       582.06       582.18       582.12
       131072          320      1053.03      1053.58      1053.28
       262144          160      2133.94      2135.97      2134.70
       524288           80      4073.26      4076.69      4075.14
      1048576           40      9179.70      9208.57      9194.29
      2097152           20     20038.55     20219.70     20142.10
      4194304           10     41651.70     41915.58     41781.73
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 32
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.07         0.07         0.07
            1         1000        44.62        44.67        44.66
            2         1000        36.48        36.49        36.49
            4         1000        52.76        52.78        52.77
            8         1000        37.84        37.85        37.84
           16         1000        51.47        51.49        51.48
           32         1000        52.84        52.86        52.85
           64         1000        57.07        57.09        57.08
          128         1000        67.88        67.90        67.89
          256         1000        86.80        86.85        86.83
          512         1000       110.60       110.66       110.63
         1024         1000       165.81       165.94       165.86
         2048         1000       285.48       285.70       285.62
         4096         1000      2025.32      2025.58      2025.46
         8192         1000      2196.38      2196.53      2196.47
        16384         1000      2473.46      2473.94      2473.65
        32768         1000      2972.51      2973.37      2973.17
        65536          640      1852.62      1853.13      1852.91
       131072          320      3378.70      3381.25      3379.96
       262144          160     11884.24     11895.76     11892.11
       524288           80     25283.34     25321.15     25305.90
      1048576           40     54378.20     54526.70     54481.14
      2097152           20     38145.90     38408.70     38264.85
      4194304           10    116648.79    118880.80    117991.72
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 2
# ( 30 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.08         0.08         0.08
            1         1000         0.87         0.87         0.87
            2         1000         0.86         0.86         0.86
            4         1000         0.87         0.87         0.87
            8         1000         0.86         0.87         0.87
           16         1000         0.88         0.88         0.88
           32         1000         1.01         1.01         1.01
           64         1000         0.89         0.89         0.89
          128         1000         0.92         0.92         0.92
          256         1000         1.05         1.05         1.05
          512         1000         1.14         1.14         1.14
         1024         1000         1.20         1.20         1.20
         2048         1000         1.46         1.46         1.46
         4096         1000         2.04         2.04         2.04
         8192         1000         3.46         3.46         3.46
        16384         1000         6.93         6.93         6.93
        32768         1000        11.19        11.19        11.19
        65536          640        38.40        38.41        38.41
       131072          320        69.46        69.47        69.46
       262144          160        82.36        82.38        82.37
       524288           80       150.71       150.78       150.75
      1048576           40       288.45       288.55       288.50
      2097152           20       564.80       565.10       564.95
      4194304           10      1908.99      1913.00      1911.00
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 4
# ( 28 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.08         0.08         0.08
            1         1000         4.83         4.83         4.83
            2         1000         4.50         4.50         4.50
            4         1000         5.49         5.49         5.49
            8         1000         4.59         4.59         4.59
           16         1000         4.44         4.44         4.44
           32         1000         4.88         4.88         4.88
           64         1000         4.73         4.73         4.73
          128         1000         4.36         4.37         4.37
          256         1000         4.79         4.79         4.79
          512         1000         5.17         5.17         5.17
         1024         1000         4.86         4.86         4.86
         2048         1000         4.73         4.73         4.73
         4096         1000         7.61         7.61         7.61
         8192         1000        10.51        10.51        10.51
        16384         1000        17.72        17.72        17.72
        32768         1000        46.73        46.74        46.73
        65536          640        90.55        90.57        90.56
       131072          320       121.40       121.41       121.41
       262144          160       211.09       211.18       211.13
       524288           80       383.75       383.85       383.79
      1048576           40      1000.15      1001.30      1000.72
      2097152           20      3146.99      3149.80      3148.32
      4194304           10      6146.50      6151.70      6149.00
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 8
# ( 24 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.10         0.10         0.10
            1         1000        12.42        12.42        12.42
            2         1000        11.89        11.89        11.89
            4         1000        11.96        11.97        11.97
            8         1000        11.89        11.89        11.89
           16         1000        11.78        11.79        11.78
           32         1000        11.46        11.47        11.46
           64         1000        11.76        11.77        11.76
          128         1000        12.43        12.43        12.43
          256         1000        12.58        12.58        12.58
          512         1000        13.01        13.01        13.01
         1024         1000        13.62        13.63        13.62
         2048         1000        14.53        14.53        14.53
         4096         1000        18.25        18.25        18.25
         8192         1000        25.61        25.61        25.61
        16384         1000        58.38        58.39        58.38
        32768         1000       106.91       106.92       106.92
        65536          640       175.12       175.13       175.12
       131072          320       279.57       279.61       279.59
       262144          160       637.93       638.93       638.60
       524288           80      1796.61      1800.01      1798.27
      1048576           40      4182.05      4188.85      4186.20
      2097152           20      9557.31      9564.20      9561.16
      4194304           10     19734.50     19768.50     19757.33
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 16
# ( 16 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.12         0.13         0.12
            1         1000        35.22        35.35        35.25
            2         1000        35.87        35.89        35.89
            4         1000        36.52        36.54        36.53
            8         1000        35.40        35.43        35.41
           16         1000        34.25        34.36        34.27
           32         1000        33.82        33.83        33.83
           64         1000        34.06        34.08        34.07
          128         1000        34.80        34.83        34.81
          256         1000        35.00        35.02        35.01
          512         1000        36.69        36.70        36.70
         1024         1000        44.42        44.43        44.42
         2048         1000        47.44        47.45        47.45
         4096         1000        75.25        75.25        75.25
         8192         1000       132.94       133.00       132.97
        16384         1000       218.94       219.08       219.04
        32768         1000       359.19       359.30       359.26
        65536          640       595.67       595.87       595.80
       131072          320      1054.84      1055.52      1055.24
       262144          160      2142.48      2143.81      2143.08
       524288           80      4090.53      4096.32      4093.21