Copyright (C) 2009-2013 Intel Corporation. All rights reserved.
Intel(R) Inspector XE 2013 (build 328075)
Copyright (C) 2009-2013 Intel Corporation. All rights reserved.
Intel(R) VTune(TM) Amplifier XE 2013 (build 328102)
Copyright (C) 2009-2013 Intel Corporation. All rights reserved.
Intel(R) Advisor XE 2013 (build 316162)
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V3.2.4, MPI-1 part
#---------------------------------------------------
# Date                  : Sun Sep 21 08:54:15 2014
# Machine               : x86_64
# System                : Linux
# Release               : 2.6.32-279.el6.x86_64
# Version               : #1 SMP Thu Jun 21 07:08:44 CDT 2012
# MPI Version           : 2.2
# MPI Thread Environment:

# New default behavior from Version 3.2 on:

# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:

# IMB-MPI1

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
 #bytes #repetitions     t[usec]  Mbytes/sec
      0         1000        0.35        0.00
      1         1000        0.35        2.70
      2         1000        0.36        5.28
      4         1000        0.36       10.63
      8         1000        0.37       20.71
     16         1000        0.38       40.20
     32         1000        0.43       71.63
     64         1000        0.42      144.27
    128         1000        0.47      262.50
    256         1000        0.52      471.78
    512         1000        0.59      830.49
   1024         1000        2.21      441.69
   2048         1000        5.87      332.76
   4096         1000       10.68      365.67
   8192         1000       14.64      533.73
  16384         1000       18.43      848.01
  32768         1000       20.74     1506.93
  65536          640      248.06      251.95
 131072          320      296.70      421.30
 262144          160      318.62      784.64
 524288           80      229.01     2183.34
1048576           40      600.59     1665.04
2097152           20     1090.57     1833.90
4194304           10     2063.16     1938.78

#---------------------------------------------------
# Benchmarking PingPing
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
 #bytes #repetitions     t[usec]  Mbytes/sec
      0         1000        0.56        0.00
      1         1000        1.42        0.67
      2         1000        0.62        3.09
      4         1000        0.60        6.33
      8         1000        0.61       12.51
     16         1000        0.61       25.01
     32         1000        0.60       50.77
     64         1000        0.64       95.95
    128         1000        0.65      187.55
    256         1000        0.69      352.25
    512         1000        0.76      646.67
   1024         1000        0.87     1125.27
   2048         1000        1.11     1753.42
   4096         1000        1.57     2484.68
   8192         1000        3.29     2378.11
  16384         1000       10.98     1423.30
  32768         1000       18.91     1652.84
  65536          640      204.82      305.14
 131072          320      261.65      477.74
 262144          160      434.18      575.80
 524288           80      454.11     1101.06
1048576           40     1182.03      846.01
2097152           20     2179.65      917.58
4194304           10     4221.11      947.62

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        0.51        0.51        0.51        0.00
      1         1000        0.52        0.52        0.52        3.69
      2         1000        0.53        0.53        0.53        7.25
      4         1000        0.55        0.55        0.55       13.97
      8         1000        0.53        0.53        0.53       28.79
     16         1000        0.51        0.51        0.51       59.40
     32         1000        0.56        0.56        0.56      109.40
     64         1000        0.52        0.52        0.52      236.16
    128         1000        0.57        0.57        0.57      432.07
    256         1000        0.65        0.65        0.65      749.91
    512         1000        0.67        0.67        0.67     1448.89
   1024         1000        0.79        0.79        0.79     2485.44
   2048         1000        1.04        1.04        1.04     3748.34
   4096         1000        3.20        3.20        3.20     2441.36
   8192         1000        2.35        2.35        2.35     6649.35
  16384         1000        4.07        4.07        4.07     7674.45
  32768         1000       13.15       13.15       13.15     4754.24
  65536          640      260.25      260.26      260.25      480.29
 131072          320      324.70      325.06      324.88      769.09
 262144          160      426.17      426.43      426.30     1172.54
 524288           80      454.95      455.95      455.45     2193.21
1048576           40     1182.90     1188.98     1185.94     1682.12
2097152           20     2181.60     2185.45     2183.53     1830.29
4194304           10     4270.91     4279.52     4275.21     1869.37

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        0.68        0.68        0.68        0.00
      1         1000        0.57        0.58        0.58        3.31
      2         1000        0.55        0.55        0.55        6.95
      4         1000        0.59        0.59        0.59       12.99
      8         1000        0.57        0.57        0.57       26.96
     16         1000        0.57        0.57        0.57       53.65
     32         1000        0.59        0.59        0.59      102.77
     64         1000        0.54        0.54        0.54      226.45
    128         1000        0.64        0.64        0.64      378.56
    256         1000        0.64        0.64        0.64      759.36
    512         1000        0.70        0.70        0.70     1395.10
   1024         1000        0.83        0.83        0.83     2358.77
   2048         1000        4.22        4.22        4.22      925.86
   4096         1000        1.49        1.49        1.49     5232.83
   8192         1000        2.47        2.47        2.47     6328.92
  16384         1000       10.90       10.90       10.90     2866.21
  32768         1000       14.31       14.31       14.31     4367.54
  65536          640      158.18      158.42      158.30      789.03
 131072          320      155.45      155.88      155.66     1603.82
 262144          160      156.89      157.41      157.08     3176.51
 524288           80      446.29      447.56      446.92     2234.33
1048576           40     1455.87     1485.75     1474.70     1346.12
2097152           20     3162.40     3277.85     3219.21     1220.31
4194304           10     5734.30     5914.09     5819.88     1352.70

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        0.90        0.90        0.90        0.00
      1         1000        0.63        0.63        0.63        3.04
      2         1000        0.66        0.66        0.66        5.81
      4         1000        4.52        4.52        4.52        1.69
      8         1000        0.63        0.63        0.63       24.07
     16         1000        0.61        0.61        0.61       50.20
     32         1000        0.63        0.63        0.63       96.71
     64         1000        0.67        0.67        0.67      180.85
    128         1000        0.70        0.70        0.70      346.77
    256         1000        0.71        0.72        0.71      681.98
    512         1000        0.85        0.85        0.85     1142.22
   1024         1000        0.91        0.91        0.91     2143.94
   2048         1000        1.87        1.87        1.87     2086.60
   4096         1000        1.57        1.57        1.57     4969.37
   8192         1000        2.79        2.79        2.79     5592.29
  16384         1000       10.06       10.06       10.06     3107.00
  32768         1000       20.10       20.12       20.11     3106.34
  65536          640      160.66      160.91      160.77      776.85
 131072          320      161.43      161.93      161.66     1543.84
 262144          160      362.66      364.99      363.93     1369.91
 524288           80      449.23      452.59      450.59     2209.52
1048576           40     1483.53     1499.30     1489.58     1333.95
2097152           20     3081.20     3239.60     3164.50     1234.72
4194304           10     6435.49     6892.80     6655.19     1160.63

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        6.74        6.76        6.75        0.00
      1         1000        1.32        1.33        1.33        1.43
      2         1000        1.34        1.36        1.35        2.81
      4         1000        1.32        1.33        1.32        5.74
      8         1000        1.32        1.32        1.32       11.54
     16         1000        1.31        1.33        1.32       23.03
     32         1000        1.33        1.34        1.33       45.51
     64         1000        6.48        6.49        6.49       18.79
    128         1000        1.34        1.35        1.34      181.50
    256         1000        1.33        1.34        1.34      364.09
    512         1000        6.22        6.23        6.22      156.78
   1024         1000        9.97        9.99        9.98      195.53
   2048         1000        2.38        2.39        2.39     1631.71
   4096         1000       10.41       10.43       10.42      748.97
   8192         1000       12.87       12.91       12.89     1210.67
  16384         1000       19.11       19.17       19.14     1630.49
  32768         1000       34.92       35.01       34.97     1785.10
  65536          640      108.97      109.74      109.35     1139.06
 131072          320      154.66      156.61      155.82     1596.33
 262144          160      357.04      369.27      363.43     1354.01
 524288           80      500.70      528.30      513.05     1892.86
1048576           40     1630.40     1804.88     1726.61     1108.11
2097152           20     2792.75     3413.15     3143.53     1171.94
4194304           10     4704.28     7007.88     6198.28     1141.57

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        6.48        6.97        6.66        0.00
      1         1000        7.21        7.39        7.27        0.26
      2         1000        7.18        7.44        7.28        0.51
      4         1000        6.94        7.28        7.05        1.05
      8         1000        7.18        7.46        7.29        2.05
     16         1000        6.97        7.16        7.06        4.26
     32         1000        7.08        7.33        7.18        8.32
     64         1000        7.32        7.60        7.45       16.06
    128         1000        9.31        9.86        9.56       24.77
    256         1000        9.20        9.69        9.34       50.39
    512         1000        8.06        8.38        8.21      116.49
   1024         1000        8.74        9.04        8.84      216.10
   2048         1000        9.82       10.14        9.92      385.38
   4096         1000       11.16       11.64       11.38      670.95
   8192         1000       14.67       15.01       14.81     1041.18
  16384         1000       22.19       22.70       22.30     1376.89
  32768         1000       36.29       37.32       36.95     1674.80
  65536          640      257.96      262.75      260.95      475.73
 131072          320      315.91      324.57      321.49      770.25
 262144          160      432.41      457.84      448.07     1092.07
 524288           80      783.41      864.40      840.55     1156.87
1048576           40     1917.87     2511.60     2295.17      796.30
2097152           20     3003.86     5071.39     4419.01      788.74
4194304           10     3619.60     9709.29     7443.26      823.95

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        8.37        8.94        8.63        0.00
      1         1000        7.26        7.60        7.44        0.25
      2         1000        7.31        7.61        7.44        0.50
      4         1000        8.00        8.64        8.41        0.88
      8         1000        7.20        7.81        7.46        1.95
     16         1000        7.03        7.54        7.34        4.05
     32         1000        7.50        8.15        7.81        7.49
     64         1000        7.71        8.05        7.85       15.16
    128         1000        7.43        7.98        7.64       30.61
    256         1000        8.24        8.78        8.46       55.59
    512         1000        8.32        8.92        8.70      109.48
   1024         1000        8.16        8.68        8.35      225.01
   2048         1000        9.98       10.71       10.26      364.63
   4096         1000       12.09       12.98       12.41      601.80
   8192         1000       16.85       18.11       17.54      862.87
  16384         1000       22.11       22.71       22.28     1376.04
  32768         1000       40.17       41.36       40.94     1511.15
  65536          640      393.37      400.56      397.70      312.06
 131072          320      412.30      429.11      423.21      582.61
 262144          160      464.46      506.11      486.89      987.94
 524288           80      786.76      877.45      837.65     1139.66
1048576           40     1946.05     2427.05     2262.59      824.04
2097152           20     3175.75     4867.85     4195.83      821.72
4194304           10     3385.59    10091.40     7475.44      792.75

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        9.40       10.28        9.83        0.00
      1         1000        7.34        7.98        7.60        0.24
      2         1000        8.91        9.69        9.20        0.39
      4         1000        8.39        8.98        8.72        0.85
      8         1000        8.51        9.63        9.18        1.58
     16         1000        8.86        9.87        9.36        3.09
     32         1000        9.19       10.45        9.86        5.84
     64         1000        8.74        9.51        9.18       12.83
    128         1000        8.81        9.95        9.23       24.54
    256         1000        8.58        9.59        8.97       50.93
    512         1000        9.50       10.50        9.97       92.98
   1024         1000       10.45       11.74       11.10      166.42
   2048         1000       10.24       10.82       10.52      360.99
   4096         1000       13.39       14.25       13.84      548.12
   8192         1000       15.51       16.78       16.17      931.07
  16384         1000       34.38       35.78       35.12      873.47
  32768         1000       52.15       55.69       53.83     1122.24
  65536          640      347.26      366.03      357.97      341.51
 131072          320      376.07      404.49      394.05      618.07
 262144          160      410.63      468.97      440.44     1066.17
 524288           80      776.14     1141.46      998.43      876.07
1048576           40     1862.70     3110.45     2589.90      642.99
2097152           20     2948.95     6901.25     5140.83      579.60
4194304           10     3470.71    13675.69     8980.84      584.98

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        9.10       10.62        9.86        0.00
      1         1000        8.78       10.37        9.68        0.18
      2         1000        7.89        9.88        8.75        0.39
      4         1000        8.03        8.99        8.50        0.85
      8         1000        9.05       10.53        9.93        1.45
     16         1000        8.09        9.43        8.76        3.24
     32         1000        9.12       10.47        9.83        5.83
     64         1000        8.86       10.41        9.65       11.73
    128         1000        9.03       10.89       10.06       22.42
    256         1000        9.38       11.48       10.25       42.53
    512         1000        8.29        9.35        8.75      104.41
   1024         1000       10.33       11.68       10.75      167.18
   2048         1000       11.60       14.34       13.14      272.46
   4096         1000       11.83       13.43       12.66      581.73
   8192         1000       16.12       17.41       16.73      897.26
  16384         1000       37.05       39.67       38.42      787.65
  32768         1000       58.90       65.85       63.56      949.08
  65536          640      392.11      416.60      405.46      300.05
 131072          320      401.65      449.47      425.44      556.20
 262144          160      393.78      471.28      437.55     1060.94
 524288           80      760.78     1129.50     1023.57      885.35
1048576           40     1923.30     3214.90     2770.99      622.10
2097152           20     2919.05     7031.55     5600.81      568.86
4194304           10     3566.19    14006.40     9871.18      571.17

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 384
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        7.22        9.75        8.35        0.00
      1         1000        7.76        9.75        8.86        0.20
      2         1000        8.27       10.55        9.47        0.36
      4         1000        7.82       10.03        8.76        0.76
      8         1000        8.00       10.41        9.08        1.47
     16         1000        7.82       10.03        8.84        3.04
     32         1000        7.92        9.23        8.56        6.62
     64         1000        8.04       10.15        9.04       12.02
    128         1000        8.74       10.91        9.56       22.38
    256         1000        7.57        9.77        8.38       49.98
    512         1000        8.77       10.50        9.66       92.98
   1024         1000        8.24        9.84        9.01      198.57
   2048         1000        9.23       10.58        9.95      369.11
   4096         1000       11.62       14.56       12.97      536.54
   8192         1000       18.84       22.23       20.32      702.87
  16384         1000       42.22       47.90       44.87      652.44
  32768         1000      864.96     1387.11     1126.26       45.06
  65536          640      359.23      421.69      395.85      296.42
 131072          320      336.59      471.83      417.61      529.85
 262144          160      333.99      474.16      412.23     1054.49
 524288           80      565.26     1128.04      900.57      886.49
1048576           40     1643.35     3215.68     2505.19      621.95
2097152           20     2702.94     6964.70     5102.68      574.33
4194304           10     2816.80    13872.10     9134.34      576.70

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        1.15        1.15        1.15        0.00
      1         1000        1.26        1.26        1.26        3.03
      2         1000        1.30        1.30        1.30        5.87
      4         1000        1.25        1.25        1.25       12.25
      8         1000        1.25        1.25        1.25       24.49
     16         1000        3.51        3.51        3.51       17.37
     32         1000        1.26        1.26        1.26       97.04
     64         1000        1.25        1.25        1.25      194.86
    128         1000        1.37        1.37        1.37      355.37
    256         1000        1.41        1.41        1.41      691.66
    512         1000        1.53        1.53        1.53     1272.44
   1024         1000        1.81        1.81        1.81     2153.52
   2048         1000        2.22        2.22        2.22     3517.77
   4096         1000        3.02        3.02        3.02     5180.71
   8192         1000       11.41       11.41       11.41     2738.08
  16384         1000       12.38       12.38       12.38     5048.51
  32768         1000       29.57       29.57       29.57     4226.83
  65536          640      280.17      280.22      280.20      892.15
 131072          320      335.35      335.37      335.36     1490.88
 262144          160      508.24      508.39      508.32     1967.00
 524288           80      754.46      755.42      754.94     2647.52
1048576           40     2281.43     2283.63     2282.53     1751.60
2097152           20     4283.81     4284.91     4284.36     1867.02
4194304           10     8401.89     8415.08     8408.49     1901.35
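
The Mbytes/sec column in the PingPong, PingPing, Sendrecv and Exchange tables above appears to follow directly from the message size and the reported time. A minimal Python sketch of that relationship, assuming 1 Mbyte = 2^20 bytes, a per-benchmark message multiplier (1 for PingPong and PingPing, 2 for Sendrecv, 4 for Exchange), and t[usec] or t_max[usec] as the time base. These factors are inferred from the rows in this log, not stated by it, and the names below are illustrative only:

```python
# Assumed constants: inferred from the rows of this log, not printed by the benchmark.
MIB = 1 << 20  # bytes per "Mbyte" as used in the Mbytes/sec column (assumption)

# Messages counted per sample for each benchmark (assumption inferred from the data).
MESSAGES_PER_SAMPLE = {
    "PingPong": 1,
    "PingPing": 1,
    "Sendrecv": 2,
    "Exchange": 4,
}


def mbytes_per_sec(benchmark, nbytes, t_usec):
    """Estimate the Mbytes/sec value from a message size and a time in microseconds.

    For Sendrecv and Exchange, pass the t_max[usec] column.
    """
    if nbytes == 0 or t_usec <= 0:
        return 0.0
    seconds = t_usec * 1e-6
    return MESSAGES_PER_SAMPLE[benchmark] * nbytes / MIB / seconds


if __name__ == "__main__":
    # Rows taken from the 4194304-byte lines of the 2-process tables in this log.
    samples = [
        ("PingPong", 4194304, 2063.16, 1938.78),
        ("Sendrecv", 4194304, 4279.52, 1869.37),
        ("Exchange", 4194304, 8415.08, 1901.35),
    ]
    for name, nbytes, t_usec, reported in samples:
        estimate = mbytes_per_sec(name, nbytes, t_usec)
        print(f"{name}: estimated {estimate:.2f}, reported {reported:.2f}")
```

The estimates agree with the reported values only to within rounding, because the times in the log are printed with two decimals.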
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        1.18        1.18        1.18        0.00
      1         1000        1.17        1.17        1.17        3.25
      2         1000        1.17        1.17        1.17        6.50
      4         1000        3.69        3.69        3.69        4.13
      8         1000        1.17        1.17        1.17       26.06
     16         1000        2.92        2.92        2.92       20.88
     32         1000        1.18        1.18        1.18      103.10
     64         1000        1.19        1.19        1.19      205.87
    128         1000        1.25        1.25        1.25      390.62
    256         1000        1.39        1.39        1.39      701.97
    512         1000        1.49        1.50        1.49     1306.33
   1024         1000        1.77        1.77        1.77     2209.28
   2048         1000        2.44        2.44        2.44     3200.63
   4096         1000        9.47        9.47        9.47     1650.28
   8192         1000       10.63       10.63       10.63     2938.97
  16384         1000       18.12       18.13       18.12     3448.08
  32768         1000       30.27       30.28       30.27     4128.25
  65536          640      385.44      385.68      385.56      648.21
 131072          320      546.54      546.82      546.68      914.38
 262144          160      713.79      716.07      715.07     1396.51
 524288           80      877.70      882.61      880.16     2266.00
1048576           40     2799.98     2813.35     2806.56     1421.79
2097152           20     4893.74     5012.39     4965.51     1596.04
4194304           10    10370.68    10391.90    10382.32     1539.66

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        1.40        1.41        1.40        0.00
      1         1000        1.24        1.25        1.24        3.06
      2         1000        1.25        1.26        1.26        6.07
      4         1000        4.67        4.67        4.67        3.27
      8         1000        1.24        1.25        1.25       24.49
     16         1000        1.26        1.26        1.26       48.44
     32         1000        1.24        1.24        1.24       98.44
     64         1000        1.23        1.23        1.23      198.49
    128         1000        1.35        1.35        1.35      362.22
    256         1000        1.44        1.44        1.44      676.24
    512         1000        1.59        1.59        1.59     1230.03
   1024         1000        1.83        1.84        1.83     2128.62
   2048         1000        2.48        2.49        2.49     3142.61
   4096         1000       10.00       10.00       10.00     1562.02
   8192         1000       11.22       11.22       11.22     2784.74
  16384         1000       19.00       19.00       19.00     3289.84
  32768         1000       32.30       32.31       32.30     3869.37
  65536          640      309.42      309.84      309.58      806.86
 131072          320      542.41      546.15      544.62      915.50
 262144          160      711.64      715.79      714.30     1397.06
 524288           80      890.66      894.04      892.45     2237.04
1048576           40     3424.63     3503.38     3469.33     1141.76
2097152           20     6618.75     6839.30     6757.97     1169.71
4194304           10    14331.51    15768.10    15137.94     1014.71

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000        2.00        2.01        2.00        0.00
      1         1000        2.20        2.21        2.21        1.73
      2         1000        2.19        2.20        2.19        3.47
      4         1000        2.17        2.18        2.17        7.00
      8         1000        2.16        2.17        2.17       14.04
     16         1000        2.17        2.18        2.17       28.01
     32         1000        2.17        2.18        2.18       55.97
     64         1000        2.17        2.18        2.17      112.05
    128         1000        2.44        2.45        2.45      199.22
    256         1000        2.47        2.47        2.47      394.61
    512         1000        2.96        2.97        2.97      657.36
   1024         1000       10.21       10.22       10.22      382.14
   2048         1000       10.70       10.71       10.71      729.18
   4096         1000       12.03       12.06       12.04     1296.02
   8192         1000       20.79       20.82       20.81     1500.90
  16384         1000       31.74       31.79       31.76     1966.08
  32768         1000       57.13       57.24       57.19     2183.71
  65536          640      292.11      293.79      293.08      850.95
 131072          320      485.03      491.25      488.24     1017.82
 262144          160      671.76      686.85      680.40     1455.93
 524288           80     1003.61     1044.16     1031.13     1915.41
1048576           40     3325.80     3562.36     3476.91     1122.85
2097152           20     6534.24     7669.20     7154.42     1043.13
4194304           10    11156.01    14889.91    13225.52     1074.55

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000       71.62       72.39       72.13        0.00
      1         1000       70.60       71.65       71.27        0.05
      2         1000       69.04       70.28       69.77        0.11
      4         1000       70.66       71.23       70.95        0.21
      8         1000       70.10       70.94       70.55        0.43
     16         1000       70.62       71.46       71.16        0.85
     32         1000       70.14       70.66       70.43        1.73
     64         1000       69.57       70.25       69.95        3.48
    128         1000       70.63       71.77       71.28        6.80
    256         1000       72.04       72.97       72.53       13.38
    512         1000       69.96       70.98       70.41       27.52
   1024         1000       70.17       70.91       70.55       55.09
   2048         1000       73.78       74.39       74.10      105.02
   4096         1000       71.46       71.83       71.60      217.53
   8192         1000       76.44       77.07       76.86      405.46
  16384         1000       82.38       83.42       82.98      749.19
  32768         1000      108.93      109.90      109.43     1137.44
  65536          640      745.12      750.74      748.54      333.00
 131072          320      874.99      888.19      884.04      562.94
 262144          160      983.44     1017.31     1006.28      982.99
 524288           80     2186.48     2329.63     2273.67      858.51
1048576           40     5032.48     5681.45     5430.41      704.05
2097152           20     8351.95    10965.20    10003.74      729.58
4194304           10     9889.20    20002.51    16727.77      799.90

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000       18.88       19.27       19.06        0.00
      1         1000       21.81       22.56       22.19        0.17
      2         1000       17.05       17.27       17.20        0.44
      4         1000       15.22       15.95       15.56        0.96
      8         1000       20.33       20.82       20.55        1.47
     16         1000       18.36       18.89       18.53        3.23
     32         1000       16.70       17.30       16.97        7.06
     64         1000       20.77       21.13       21.02       11.55
    128         1000       24.20       24.77       24.40       19.71
    256         1000       18.74       19.36       19.19       50.44
    512         1000       19.43       19.88       19.67       98.27
   1024         1000       23.61       23.97       23.78      162.95
   2048         1000       26.89       27.41       27.12      285.07
   4096         1000       24.15       24.61       24.46      634.95
   8192         1000       32.79       33.37       33.16      936.36
  16384         1000       46.78       47.88       47.35     1305.37
  32768         1000       83.41       85.10       84.51     1468.84
  65536          640      822.91      836.22      832.33      298.96
 131072          320      906.88      926.34      919.28      539.76
 262144          160     1168.67     1222.22     1204.08      818.18
 524288           80     2234.79     2414.54     2345.34      828.32
1048576           40     4922.60     5682.25     5399.35      703.95
2097152           20     7729.76    11192.60     9866.35      714.76
4194304           10     9560.49    19393.71    16024.36      825.01

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000       19.56       20.17       19.95        0.00
      1         1000       21.37       21.88       21.69        0.17
      2         1000       24.38       24.99       24.67        0.31
      4         1000       23.32       24.73       23.85        0.62
      8         1000       24.06       24.81       24.40        1.23
     16         1000       22.47       23.03       22.75        2.65
     32         1000       20.80       22.77       21.50        5.36
     64         1000       27.71       29.86       28.37        8.18
    128         1000       21.58       22.41       22.09       21.79
    256         1000       26.48       27.04       26.78       36.11
    512         1000       22.05       22.74       22.43       85.90
   1024         1000       25.40       26.34       25.83      148.28
   2048         1000       27.53       28.09       27.84      278.15
   4096         1000       28.82       29.93       29.54      522.05
   8192         1000       38.25       38.99       38.63      801.55
  16384         1000       58.58       59.53       59.09     1049.96
  32768         1000       86.35       88.75       88.24     1408.37
  65536          640      840.86      869.84      859.53      287.41
 131072          320      956.22      985.54      974.16      507.33
 262144          160     1111.76     1259.22     1198.71      794.14
 524288           80     2307.44     2938.07     2644.05      680.72
1048576           40     5256.47     7405.67     6496.68      540.13
2097152           20     7923.31    14958.89    12099.34      534.80
4194304           10    10237.81    28291.99    19980.88      565.53

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000       21.92       23.24       22.62        0.00
      1         1000       23.73       24.95       24.44        0.15
      2         1000       21.10       21.98       21.49        0.35
      4         1000       21.79       23.07       22.29        0.66
      8         1000       23.55       24.45       24.05        1.25
     16         1000       22.56       23.76       23.25        2.57
     32         1000       24.56       26.20       25.34        4.66
     64         1000       25.17       26.54       25.68        9.20
    128         1000       23.86       25.71       24.65       19.00
    256         1000       24.83       25.77       25.28       37.89
    512         1000       26.32       27.57       27.12       70.84
   1024         1000       26.57       28.21       27.43      138.46
   2048         1000       27.32       28.35       27.83      275.54
   4096         1000       30.14       31.44       30.86      496.91
   8192         1000       37.95       39.70       38.82      787.11
  16384         1000       57.39       59.03       58.23     1058.86
  32768         1000       84.64       87.66       86.22     1425.90
  65536          640      823.47      850.70      837.70      293.88
 131072          320      940.26     1033.12      989.14      483.97
 262144          160     1029.48     1268.03     1140.63      788.62
 524288           80     2495.48     3147.28     2883.28      635.47
1048576           40     4498.30     7585.90     6602.02      527.29
2097152           20     8045.04    14542.35    12183.68      550.12
4194304           10    10099.51    27830.31    21496.86      574.91

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 384
#-----------------------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
      0         1000       20.11       22.45       21.30        0.00
      1         1000       22.51       23.64       22.92        0.16
      2         1000       21.48       23.43       22.62        0.33
      4         1000       22.17       23.22       22.73        0.66
      8         1000       22.44       24.10       23.27        1.27
     16         1000       21.64       24.01       22.72        2.54
     32         1000       22.72       24.04       23.45        5.08
     64         1000       22.60       24.45       23.35        9.98
    128         1000       22.11       24.82       23.57       19.68
    256         1000       23.03       24.69       23.97       39.56
    512         1000       24.43       26.31       25.09       74.23
   1024         1000       24.80       26.90       25.72      145.21
   2048         1000       25.44       27.67       26.48      282.39
   4096         1000       28.29       31.33       29.87      498.76
   8192         1000       36.43       39.26       37.97      795.90
  16384         1000       52.12       56.56       54.40     1105.10
  32768         1000      793.17     1054.41      923.49      118.55
  65536          640      762.19      817.90      799.56      305.66
 131072          320      823.21      942.55      895.13      530.48
 262144          160      907.42     1150.81     1061.53      868.95
 524288           80     1914.80     2959.44     2551.03      675.80
1048576           40     3588.50     7427.00     5961.83      538.58
2097152           20     7064.35    14906.75    11123.29      536.67
4194304           10     9388.52    27808.00    19983.64      575.37

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.06        0.06
      4         1000        0.88        0.88        0.88
      8         1000        0.85        0.85        0.85
     16         1000        0.86        0.86        0.86
     32         1000        0.86        0.86        0.86
     64         1000        0.89        0.89        0.89
    128         1000        2.28        2.28        2.28
    256         1000        1.03        1.03        1.03
    512         1000        1.09        1.09        1.09
   1024         1000        1.27        1.27        1.27
   2048         1000        3.72        3.72        3.72
   4096         1000        8.32        8.32        8.32
   8192         1000        4.18        4.18        4.18
  16384         1000       11.31       11.31       11.31
  32768         1000       22.01       22.01       22.01
  65536          640       43.00       43.00       43.00
 131072          320      610.49      610.77      610.63
 262144          160      773.91      774.91      774.41
 524288           80      540.10      540.16      540.13
1048576           40     1595.45     1598.30     1596.87
2097152           20     2774.66     2778.10     2776.38
4194304           10     5785.89     5802.49     5794.19

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.06        0.06
      4         1000        1.40        1.40        1.40
      8         1000        1.40        1.40        1.40
     16         1000        3.13        3.13        3.13
     32         1000        1.39        1.39        1.39
     64         1000        1.41        1.41        1.41
    128         1000        1.52        1.53        1.53
    256         1000        1.66        1.66        1.66
    512         1000        1.83        1.83        1.83
   1024         1000        2.19        2.19        2.19
   2048         1000        2.86        2.86        2.86
   4096         1000       11.98       11.98       11.98
   8192         1000       19.83       19.83       19.83
  16384         1000       20.85       20.85       20.85
  32768         1000       39.34       39.34       39.34
  65536          640       77.14       77.15       77.14
 131072          320      884.90      885.60      885.28
 262144          160     1495.62     1496.54     1496.08
 524288           80     1045.57     1047.49     1046.49
1048576           40     2274.55     2279.90     2276.81
2097152           20     4715.75     4753.80     4743.38
4194304           10    10488.20    10573.89    10548.14

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.06        0.06
      4         1000        2.03        2.03        2.03
      8         1000        1.95        1.95        1.95
     16         1000        1.98        1.98        1.98
     32         1000        1.96        1.96        1.96
     64         1000        2.02        2.02        2.02
    128         1000        2.17        2.17        2.17
    256         1000        2.35        2.35        2.35
    512         1000        2.64        2.64        2.64
   1024         1000        3.13        3.13        3.13
   2048         1000       10.27       10.27       10.27
   4096         1000       12.47       12.47       12.47
   8192         1000       19.23       19.23       19.23
  16384         1000       23.61       23.61       23.61
  32768         1000       47.30       47.30       47.30
  65536          640       91.00       91.00       91.00
 131072          320     1074.37     1075.09     1074.66
 262144          160     1850.68     1856.10     1853.21
 524288           80     1080.64     1089.84     1086.98
1048576           40     2571.70     2616.07     2599.19
2097152           20     7426.64     7449.65     7446.22
4194304           10    15254.59    15464.19    15409.92
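
In every table of this log the #repetitions column stays at 1000 up to 32768 bytes and then halves each time the message size doubles (640, 320, ..., 10). A small Python sketch that reproduces this pattern, assuming a cap of 1000 repetitions and a total volume of 40 MiB per message size. Both constants and all names are inferred from the rows above and are hypothetical, and the run-time limit ("SECS_PER_SAMPLE") mentioned in the log header is not modelled:

```python
# Assumed constants: chosen so that the result matches the #repetitions column of this log.
MAX_REPETITIONS = 1000              # upper bound seen for small messages
VOLUME_CAP_BYTES = 40 * (1 << 20)   # assumed total data volume per message size


def repetitions(nbytes):
    """Return the repetition count expected for a given message size in bytes."""
    if nbytes <= 0:
        # Zero-byte messages move no data, so only the fixed cap applies.
        return MAX_REPETITIONS
    return max(1, min(MAX_REPETITIONS, VOLUME_CAP_BYTES // nbytes))


if __name__ == "__main__":
    for size in (0, 4, 32768, 65536, 131072, 1048576, 4194304):
        print(size, repetitions(size))
```

A time-based cut-down of the kind the header describes would show up as repetition counts below these values; no row in this part of the log shows one.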
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.06        0.06
      4         1000       11.63       11.63       11.63
      8         1000       11.51       11.51       11.51
     16         1000       11.23       11.24       11.24
     32         1000       11.17       11.18       11.18
     64         1000       12.43       12.43       12.43
    128         1000       10.86       10.86       10.86
    256         1000       17.58       17.58       17.58
    512         1000       10.87       10.87       10.87
   1024         1000       18.36       18.36       18.36
   2048         1000       21.49       21.50       21.50
   4096         1000       30.20       30.21       30.20
   8192         1000       42.60       42.60       42.60
  16384         1000       59.91       59.91       59.91
  32768         1000       90.60       90.60       90.60
  65536          640      162.98      163.00      162.99
 131072          320     1160.86     1162.66     1161.74
 262144          160     2003.57     2010.33     2005.95
 524288           80     1594.28     1607.20     1602.05
1048576           40     3276.55     3309.38     3297.67
2097152           20     8565.04     8634.85     8619.15
4194304           10    18075.70    18296.60    18235.47

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.07        0.06
      4         1000      156.21      156.39      156.31
      8         1000      148.32      148.57      148.45
     16         1000      156.84      156.93      156.88
     32         1000      152.55      152.80      152.68
     64         1000      156.47      156.56      156.50
    128         1000      155.92      156.00      155.95
    256         1000      161.79      162.13      162.03
    512         1000      146.63      146.98      146.85
   1024         1000      167.97      168.13      167.99
   2048         1000      289.62      289.70      289.65
   4096         1000      276.43      276.66      276.47
   8192         1000      271.57      271.58      271.58
  16384         1000      250.96      250.97      250.97
  32768         1000      282.29      282.45      282.32
  65536          640      365.13      365.40      365.19
 131072          320     5178.68     5180.77     5179.35
 262144          160     6815.43     6820.47     6817.11
 524288           80     6923.04     6938.62     6927.53
1048576           40     8518.42     8564.80     8534.49
2097152           20    15860.90    16144.90    15952.58
4194304           10    27630.69    28035.71    27777.48

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.09        0.06
      4         1000      175.32      175.36      175.33
      8         1000      180.78      180.89      180.86
     16         1000      174.90      175.29      175.14
     32         1000      179.93      180.12      179.98
     64         1000      172.21      172.47      172.32
    128         1000      182.84      183.04      182.94
    256         1000      206.25      206.55      206.38
    512         1000      550.00      550.33      550.13
   1024         1000      563.28      563.66      563.44
   2048         1000      499.21      499.30      499.27
   4096         1000      510.94      511.21      511.06
   8192         1000      501.18      501.40      501.29
  16384         1000      508.69      508.91      508.76
  32768         1000      543.99      544.30      544.18
  65536          640      586.08      586.50      586.31
 131072          320      617.78      618.56      618.24
 262144          160     1012.64     1016.09     1014.78
 524288           80     2298.24     2307.35     2304.46
1048576           40     4927.15     4957.00     4945.85
2097152           20     9730.96     9874.51     9805.80
4194304           10    18243.91    18726.49    18473.82

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.07        0.06
      4         1000      392.90      393.13      393.02
      8         1000      410.39      410.86      410.68
     16         1000      414.46      414.78      414.63
     32         1000      414.26      414.41      414.34
     64         1000      416.71      417.02      416.84
    128         1000      413.17      413.36      413.25
    256         1000      417.40      417.63      417.48
    512         1000      617.65      618.13      617.85
   1024         1000      435.87      436.30      436.09
   2048         1000      437.40      437.61      437.52
   4096         1000      682.67      682.94      682.90
   8192         1000      682.07      682.53      682.32
  16384         1000      699.78      700.23      700.01
  32768         1000      727.96      728.33      728.09
  65536          640      758.33      758.88      758.69
 131072          320     7290.98     7305.13     7292.88
 262144          160     1378.81     1382.18     1380.15
 524288           80     3692.06     3709.25     3701.16
1048576           40     7430.33     7494.67     7460.21
2097152           20    16025.30    16283.25    16155.84
4194304           10    29110.10    30600.09    29690.12

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.06        0.07        0.06
      4         1000      476.22      476.48      476.38
      8         1000      489.58      489.98      489.79
     16         1000      499.87      500.06      499.96
     32         1000      501.73      502.11      501.94
     64         1000      496.15      496.46      496.28
    128         1000      503.64      503.91      503.78
    256         1000      507.04      507.42      507.29
    512         1000      819.80      820.31      820.05
   1024         1000      509.19      509.44      509.30
   2048         1000      523.80      524.13      523.96
   4096         1000      831.45      832.03      831.75
   8192         1000      843.15      843.74      843.45
  16384         1000      844.86      845.41      845.14
  32768         1000      872.96      873.47      873.19
  65536          640      914.91      915.75      915.38
 131072          320     8270.86     8279.73     8272.11
 262144          160     1686.54     1690.37     1688.32
 524288           80     4207.02     4221.04     4216.10
1048576           40     8263.70     8333.45     8301.71
2097152           20    16987.00    17278.24    17154.18
4194304           10    30187.11    31616.71    30977.57

#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 384
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.05        0.07        0.06
      4         1000      487.44      487.73      487.57
      8         1000      495.05      495.34      495.16
     16         1000      504.99      505.35      505.13
     32         1000      502.54      503.00      502.78
     64         1000      504.56      504.82      504.65
    128         1000      503.37      503.74      503.53
    256         1000      515.69      515.96      515.78
    512         1000      865.50      866.24      865.82
   1024         1000      526.24      526.55      526.37
   2048         1000      525.68      526.00      525.81
   4096         1000      739.53      740.19      739.93
   8192         1000      751.43      751.86      751.65
  16384         1000      791.71      792.17      792.00
  32768         1000      869.89      870.51      870.20
  65536          640     2473.20     2474.29     2473.69
 131072          320     6432.59     6449.15     6434.83
 262144          160     1893.22     1899.74     1896.82
 524288           80     4524.59     4545.10     4535.93
1048576           40     8775.48     8845.60     8807.61
2097152           20    17710.45    17985.45    17840.84
4194304           10    31447.60    33096.89    32073.82

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.07        0.07        0.07
      4         1000        0.53        0.53        0.53
      8         1000        0.55        0.55        0.55
     16         1000        0.53        0.53        0.53
     32         1000        0.58        0.58        0.58
     64         1000        0.60        0.60        0.60
    128         1000        0.66        0.66        0.66
    256         1000        0.73        0.73        0.73
    512         1000        0.80        0.80        0.80
   1024         1000        0.95        0.95        0.95
   2048         1000        1.23        1.23        1.23
   4096         1000        1.77        1.78        1.77
   8192         1000        2.77        2.77        2.77
  16384         1000       10.26       10.26       10.26
  32768         1000       19.70       19.70       19.70
  65536          640       30.31       30.32       30.31
 131072          320       61.22       61.26       61.24
 262144          160      122.25      122.40      122.32
 524288           80      243.11      245.65      244.38
1048576           40      425.45      433.12      429.28
2097152           20      897.40      917.15      907.27
4194304           10     1973.51     2020.81     1997.16

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.07        0.08        0.07
      4         1000        1.06        1.06        1.06
      8         1000        1.08        1.08        1.08
     16         1000        1.07        1.07        1.07
     32         1000        1.18        1.18        1.18
     64         1000        1.17        1.17        1.17
    128         1000        1.33        1.34        1.33
    256         1000        3.29        3.29        3.29
    512         1000        1.63        1.63        1.63
   1024         1000        1.98        1.98        1.98
   2048         1000        2.54        2.55        2.55
   4096         1000       10.22       10.23       10.23
   8192         1000       15.43       15.44       15.44
  16384         1000       20.22       20.24       20.23
  32768         1000       28.80       28.81       28.81
  65536          640       60.70       60.73       60.72
 131072          320       95.71       96.44       96.00
 262144          160      171.78      174.13      173.51
 524288           80      301.49      308.42      304.09
1048576           40      601.45      624.40      615.84
2097152           20     1376.65     1464.75     1442.54
4194304           10     2748.70     3286.50     3151.99

#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
 #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
      0         1000        0.07        0.07        0.07
      4         1000        1.11        1.11        1.11
      8         1000        1.14        1.14        1.14
     16         1000        1.14        1.14        1.14
     32         1000        1.18        1.18        1.18
     64         1000        1.20        1.21        1.21
    128         1000        1.38        1.38        1.38
    256         1000        1.48        1.49        1.48
    512         1000        1.69        1.70        1.69
   1024         1000        2.02        2.03
2.03 2048 1000 17.04 17.05 17.05 4096 1000 21.89 22.05 21.99 8192 1000 15.20 15.27 15.24 16384 1000 29.59 29.98 29.77 32768 1000 41.43 41.67 41.51 65536 640 58.48 58.88 58.69 131072 320 108.88 110.83 109.81 262144 160 204.75 209.18 207.72 524288 80 388.11 406.94 397.45 1048576 40 851.35 940.50 891.51 2097152 20 2071.64 2312.90 2202.52 4194304 10 4222.39 4314.59 4302.01 #---------------------------------------------------------------- # Benchmarking Reduce # #processes = 16 # ( 368 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.07 0.08 0.07 4 1000 1.77 1.78 1.78 8 1000 1.79 1.80 1.79 16 1000 1.78 1.79 1.79 32 1000 2.08 2.09 2.09 64 1000 3.98 4.15 4.11 128 1000 2.29 2.30 2.30 256 1000 2.32 2.34 2.33 512 1000 2.76 2.77 2.76 1024 1000 10.38 10.40 10.39 2048 1000 13.59 13.61 13.60 4096 1000 5.96 5.99 5.97 8192 1000 22.11 22.51 22.22 16384 1000 32.38 32.78 32.49 32768 1000 49.94 50.50 50.10 65536 640 85.72 87.17 86.53 131072 320 161.12 162.75 162.05 262144 160 305.96 315.78 311.94 524288 80 582.64 644.29 622.43 1048576 40 1287.32 1380.65 1305.28 2097152 20 2688.85 3088.00 3036.69 4194304 10 4794.81 6238.20 5837.06 #---------------------------------------------------------------- # Benchmarking Reduce # #processes = 32 # ( 352 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.07 0.07 0.07 4 1000 7.70 8.27 7.84 8 1000 7.32 7.90 7.47 16 1000 7.33 7.63 7.41 32 1000 7.60 7.93 7.66 64 1000 7.60 8.01 7.73 128 1000 9.44 9.86 9.58 256 1000 8.13 8.35 8.19 512 1000 8.70 9.11 8.83 1024 1000 9.25 9.54 9.35 2048 1000 10.63 11.15 10.77 4096 1000 13.44 14.05 13.63 8192 1000 17.13 17.73 17.35 16384 1000 27.96 28.45 28.09 32768 1000 47.79 48.64 48.03 65536 640 89.88 91.53 91.07 131072 320 171.12 180.62 178.45 262144 160 352.97 367.22 
363.72 524288 80 666.67 753.06 710.17 1048576 40 1472.97 1640.33 1574.55 2097152 20 2410.20 3647.90 3388.96 4194304 10 4879.59 6829.29 5873.92 #---------------------------------------------------------------- # Benchmarking Reduce # #processes = 64 # ( 320 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.07 0.08 0.07 4 1000 9.20 9.56 9.38 8 1000 8.75 9.27 8.96 16 1000 8.45 9.00 8.64 32 1000 8.45 9.09 8.72 64 1000 9.82 10.40 10.12 128 1000 10.29 10.89 10.47 256 1000 9.03 9.65 9.31 512 1000 9.33 9.88 9.59 1024 1000 11.33 12.00 11.62 2048 1000 11.92 12.70 12.23 4096 1000 14.20 14.87 14.47 8192 1000 18.86 19.83 19.28 16384 1000 34.44 35.65 34.95 32768 1000 60.28 61.90 61.09 65536 640 113.97 118.94 117.12 131072 320 192.60 209.37 203.55 262144 160 409.82 469.68 451.80 524288 80 776.09 941.50 885.92 1048576 40 1594.38 1963.13 1797.93 2097152 20 2303.15 4466.81 3918.17 4194304 10 4908.99 8165.50 6923.54 #---------------------------------------------------------------- # Benchmarking Reduce # #processes = 128 # ( 256 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.07 0.08 0.07 4 1000 11.44 12.74 12.12 8 1000 10.27 11.30 10.77 16 1000 11.10 12.79 11.88 32 1000 11.01 12.24 11.60 64 1000 10.75 12.26 11.38 128 1000 13.54 15.22 14.38 256 1000 11.97 13.34 12.78 512 1000 13.48 15.38 14.41 1024 1000 14.62 16.03 15.36 2048 1000 16.95 18.68 17.80 4096 1000 21.61 23.63 22.61 8192 1000 31.90 34.26 33.13 16384 1000 63.83 67.55 65.90 32768 1000 116.10 123.10 120.76 65536 640 196.71 209.87 204.15 131072 320 316.54 385.96 369.53 262144 160 496.14 678.49 620.95 524288 80 920.17 1286.91 1159.35 1048576 40 1680.93 3111.37 2625.03 2097152 20 2337.40 6759.80 5319.95 4194304 10 5039.12 13686.80 10305.31 
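Note: the result tables in this log share one layout: a `# Benchmarking <name>` header, a `#processes = N` line, then rows of message size, repetition count, and minimum, maximum, and average time in microseconds. A minimal sketch of a parser for that layout follows, for comparing timings across process counts. It assumes the log is available with one row per line, as the benchmark normally prints it; the function name `parse_imb` and the `SAMPLE` excerpt are illustrative and not part of the benchmark suite.

```python
import re
from collections import defaultdict

# Patterns for the two header lines that identify a table.
HEADER_RE = re.compile(r"#\s*Benchmarking\s+(\S+)")
PROCS_RE = re.compile(r"#processes\s*=\s*(\d+)")


def parse_imb(text):
    """Return {(benchmark, nprocs): [(bytes, reps, t_min, t_max, t_avg), ...]}.

    Hypothetical helper: assumes one table row per line.
    """
    results = defaultdict(list)
    bench = nprocs = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER_RE.search(line)
        if match:
            bench, nprocs = match.group(1), None
            continue
        match = PROCS_RE.search(line)
        if match:
            nprocs = int(match.group(1))
            continue
        # Skip separators, column headers, and anything before a full header.
        if not line or line.startswith("#") or bench is None or nprocs is None:
            continue
        fields = line.split()
        if len(fields) != 5:
            continue  # e.g. "out-of-mem." rows carry no timings
        try:
            size, reps = int(fields[0]), int(fields[1])
            t_min, t_max, t_avg = (float(f) for f in fields[2:])
        except ValueError:
            continue
        results[(bench, nprocs)].append((size, reps, t_min, t_max, t_avg))
    return dict(results)


# Three rows copied from the Reduce table for 128 processes.
SAMPLE = """\
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.08 0.07
4 1000 11.44 12.74 12.12
4194304 10 5039.12 13686.80 10305.31
"""

tables = parse_imb(SAMPLE)
rows = tables[("Reduce", 128)]
print(len(rows), rows[-1])
```

Keying the result by `(benchmark, nprocs)` keeps each table separate, so a given message size can be looked up per process count without re-scanning the log.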
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.07 0.07
4 1000 15.19 18.89 17.20
8 1000 16.11 19.83 17.98
16 1000 15.47 18.98 17.47
32 1000 15.95 19.18 17.62
64 1000 16.00 20.11 17.89
128 1000 18.13 21.59 20.01
256 1000 18.32 21.87 19.95
512 1000 19.73 23.27 21.84
1024 1000 22.69 26.70 24.89
2048 1000 26.03 30.71 28.66
4096 1000 34.39 40.72 37.81
8192 1000 52.42 60.93 57.42
16384 1000 105.52 113.83 109.96
32768 1000 154.62 175.79 167.77
65536 640 265.60 325.01 303.90
131072 320 355.60 500.55 448.29
262144 160 515.47 823.37 723.08
524288 80 929.47 1698.03 1425.70
1048576 40 1653.72 4018.35 3226.80
2097152 20 2323.54 8604.40 6907.01
4194304 10 4991.29 17003.49 13277.12
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.07 0.06
4 1000 15.74 21.01 18.37
8 1000 17.01 22.34 19.88
16 1000 16.69 21.60 19.44
32 1000 18.04 23.92 20.99
64 1000 18.15 22.78 20.50
128 1000 19.24 25.07 22.33
256 1000 20.03 25.70 23.14
512 1000 23.36 29.53 26.74
1024 1000 27.84 34.85 31.72
2048 1000 31.80 40.55 36.70
4096 1000 39.76 49.92 45.92
8192 1000 55.35 70.09 64.32
16384 1000 107.37 130.92 120.14
32768 1000 156.13 187.99 171.90
65536 640 266.30 319.63 294.75
131072 320 313.48 481.23 415.49
262144 160 544.97 935.01 769.61
524288 80 924.30 1781.62 1542.83
1048576 40 1705.60 4112.83 3455.91
2097152 20 2358.39 9258.34 7678.96
4194304 10 5087.71 18700.29 14646.52
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.15 0.17 0.16
4 1000 0.84 0.96 0.90
8 1000 3.29 3.29 3.29
16 1000 1.33 1.33 1.33
32 1000 1.32 1.32 1.32
64 1000 1.33 1.33 1.33
128 1000 1.38 1.38 1.38
256 1000 1.40 1.40 1.40
512 1000 1.57 1.57 1.57
1024 1000 1.84 1.84 1.84
2048 1000 5.51 5.51 5.51
4096 1000 2.32 2.32 2.32
8192 1000 8.35 8.35 8.35
16384 1000 11.92 11.92 11.92
32768 1000 18.98 18.98 18.98
65536 640 29.75 29.75 29.75
131072 320 365.75 365.99 365.87
262144 160 456.43 457.05 456.74
524288 80 612.63 615.70 614.16
1048576 40 660.00 661.42 660.71
2097152 20 1502.65 1507.35 1505.00
4194304 10 3132.30 3143.10 3137.70
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.16 0.18 0.17
4 1000 0.76 0.96 0.86
8 1000 1.42 1.67 1.55
16 1000 1.90 1.90 1.90
32 1000 1.88 1.88 1.88
64 1000 5.13 5.13 5.13
128 1000 2.00 2.00 2.00
256 1000 1.93 1.93 1.93
512 1000 2.12 2.12 2.12
1024 1000 2.49 2.49 2.49
2048 1000 2.76 2.76 2.76
4096 1000 9.57 9.57 9.57
8192 1000 4.20 4.20 4.20
16384 1000 10.58 10.58 10.58
32768 1000 22.02 22.02 22.02
65536 640 46.76 46.76 46.76
131072 320 515.32 515.82 515.53
262144 160 956.97 958.48 957.67
524288 80 1186.20 1191.68 1189.04
1048576 40 1652.52 1664.35 1657.06
2097152 20 2076.54 2083.95 2080.10
4194304 10 5285.22 5296.21 5290.76
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.18 0.17
4 1000 0.77 1.13 0.93
8 1000 2.48 7.03 5.11
16 1000 8.13 14.98 11.61
32 1000 2.51 2.51 2.51
64 1000 7.99 7.99 7.99
128 1000 6.20 6.20 6.20
256 1000 2.65 2.65 2.65
512 1000 2.72 2.72 2.72
1024 1000 3.12 3.12 3.12
2048 1000 7.70 7.70 7.70
4096 1000 10.92 10.92 10.92
8192 1000 11.73 11.73 11.73
16384 1000 18.80 18.80 18.80
32768 1000 23.96 23.96 23.96
65536 640 59.55 59.55 59.55
131072 320 641.75 642.03 641.87
262144 160 171.19 171.21 171.21
524288 80 2046.50 2053.54 2050.31
1048576 40 2626.17 2642.43 2635.15
2097152 20 3943.05 3974.40 3956.93
4194304 10 6664.42 6754.71 6719.76
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.20 0.21 0.20
4 1000 0.79 1.32 1.11
8 1000 11.07 21.59 15.15
16 1000 8.93 11.58 10.27
32 1000 5.87 29.02 17.64
64 1000 11.58 11.58 11.58
128 1000 16.41 16.41 16.41
256 1000 17.09 17.10 17.09
512 1000 12.23 12.23 12.23
1024 1000 11.93 11.93 11.93
2048 1000 14.29 14.29 14.29
4096 1000 19.70 19.70 19.70
8192 1000 21.77 21.77 21.77
16384 1000 31.52 31.52 31.52
32768 1000 49.14 49.15 49.15
65536 640 90.49 90.50 90.50
131072 320 684.41 684.94 684.69
262144 160 615.59 615.66 615.62
524288 80 1111.21 1116.58 1114.05
1048576 40 2443.33 2453.90 2449.29
2097152 20 4996.35 5055.75 5024.16
4194304 10 8005.29 8165.60 8097.89
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.28 0.33 0.30
4 1000 1.48 3.02 2.08
8 1000 1.30 86.00 45.01
16 1000 1.31 94.31 34.39
32 1000 1.37 115.73 37.89
64 1000 1.37 65.10 33.28
128 1000 120.73 120.91 120.82
256 1000 111.74 112.01 111.94
512 1000 117.11 117.19 117.16
1024 1000 102.79 102.88 102.86
2048 1000 134.80 135.06 134.95
4096 1000 139.04 139.06 139.05
8192 1000 144.04 144.24 144.15
16384 1000 129.56 129.74 129.65
32768 1000 178.73 178.83 178.78
65536 640 174.70 175.06 174.87
131072 320 1735.18 1736.59 1736.10
262144 160 1807.46 1808.84 1808.30
524288 80 2256.06 2260.48 2258.39
1048576 40 3276.33 3284.00 3278.79
2097152 20 22577.75 22623.50 22597.52
4194304 10 23751.59 23927.50 23850.51
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.40 0.46 0.43
4 1000 1.18 18.68 3.91
8 1000 2.35 170.62 62.42
16 1000 2.35 213.42 56.78
32 1000 2.47 195.62 64.66
64 1000 2.23 107.23 55.78
128 1000 119.72 121.00 120.40
256 1000 140.13 140.41 140.31
512 1000 187.12 187.34 187.23
1024 1000 191.46 191.65 191.57
2048 1000 209.09 209.26 209.16
4096 1000 223.62 223.98 223.84
8192 1000 231.68 231.85 231.78
16384 1000 203.19 203.32 203.23
32768 1000 229.88 230.10 229.92
65536 640 291.32 291.53 291.43
131072 320 2661.53 2663.27 2662.63
262144 160 3164.96 3184.53 3174.19
524288 80 1436.12 1444.38 1440.73
1048576 40 3000.55 3023.10 3011.78
2097152 20 6622.85 6886.05 6726.89
4194304 10 14757.70 15412.62 15121.46
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.66 0.77 0.70
4 1000 1.46 36.23 6.36
8 1000 3.56 252.28 73.41
16 1000 3.56 289.09 63.78
32 1000 4.08 249.61 75.51
64 1000 7.77 134.83 69.33
128 1000 80.41 104.22 100.92
256 1000 143.94 145.17 144.54
512 1000 315.35 315.88 315.72
1024 1000 322.63 323.01 322.89
2048 1000 319.85 320.26 320.14
4096 1000 325.17 325.57 325.37
8192 1000 332.30 332.76 332.59
16384 1000 352.91 353.59 353.22
32768 1000 395.13 395.60 395.36
65536 640 437.65 438.35 438.06
131072 320 3056.88 3058.84 3057.91
262144 160 3956.87 3961.23 3959.08
524288 80 3016.41 3030.40 3024.67
1048576 40 7098.12 7155.53 7128.67
2097152 20 13189.95 13368.20 13271.58
4194304 10 23832.70 25026.61 24381.07
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 1.16 1.77 1.23
4 1000 18.44 159.53 79.43
8 1000 32.30 272.67 54.87
16 1000 24.02 293.35 68.05
32 1000 21.51 269.92 85.99
64 1000 16.47 161.88 82.41
128 1000 80.93 108.25 104.13
256 1000 80.33 151.92 142.31
512 1000 376.06 385.09 382.83
1024 1000 456.20 456.83 456.44
2048 1000 411.00 411.59 411.31
4096 1000 414.33 414.80 414.59
8192 1000 421.44 422.16 421.80
16384 1000 439.43 439.95 439.80
32768 1000 467.55 468.24 467.96
65536 640 518.77 519.66 519.16
131072 320 3770.44 3772.63 3771.75
262144 160 4648.10 4654.05 4651.76
524288 80 6897.85 6923.16 6914.32
1048576 40 5106.15 5170.67 5148.19
2097152 20 23573.05 23885.36 23759.63
4194304 10 39901.19 41022.40 40537.02
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 1.42 4.18 1.68
4 1000 3.23 226.04 94.77
8 1000 2.82 219.19 96.02
16 1000 2.26 274.39 90.04
32 1000 6.44 329.50 83.03
64 1000 10.43 215.18 90.96
128 1000 84.28 115.37 105.44
256 1000 80.82 198.54 164.01
512 1000 79.90 323.63 264.15
1024 1000 382.14 393.44 391.44
2048 1000 634.00 634.55 634.36
4096 1000 1055.55 1056.44 1056.16
8192 1000 1969.33 1970.69 1970.13
16384 1000 4812.72 4817.21 4815.89
32768 1000 8443.64 8449.24 8447.09
65536 640 10123.44 10131.40 10127.73
131072 320 17933.01 17954.11 17943.12
262144 160 25902.63 25968.83 25943.00
524288 80 47050.19 47233.45 47155.24
1048576 40 6818.15 6897.05 6857.39
2097152 20 31799.29 31929.79 31878.18
4194304 10 50974.70 51982.31 51730.26
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 0.67 0.67 0.67
2 1000 0.67 0.67 0.67
4 1000 0.66 0.66 0.66
8 1000 0.65 0.65 0.65
16 1000 0.69 0.69 0.69
32 1000 0.71 0.71 0.71
64 1000 0.72 0.72 0.72
128 1000 1.91 1.91 1.91
256 1000 0.74 0.74 0.74
512 1000 0.84 0.84 0.84
1024 1000 0.97 0.97 0.97
2048 1000 1.21 1.21 1.21
4096 1000 1.77 1.77 1.77
8192 1000 2.87 2.87 2.87
16384 1000 11.40 11.40 11.40
32768 1000 19.67 19.67 19.67
65536 640 285.14 285.27 285.20
131072 320 379.62 380.13 379.88
262144 160 502.66 503.17 502.92
524288 80 522.29 523.43 522.86
1048576 40 1364.95 1367.15 1366.05
2097152 20 2552.01 2552.95 2552.48
4194304 10 5344.49 5350.49 5347.49
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 1.21 1.21 1.21
2 1000 1.19 1.19 1.19
4 1000 1.19 1.19 1.19
8 1000 3.36 3.36 3.36
16 1000 1.22 1.22 1.22
32 1000 1.24 1.24 1.24
64 1000 1.33 1.33 1.33
128 1000 7.31 7.31 7.31
256 1000 5.53 5.53 5.53
512 1000 6.32 6.32 6.32
1024 1000 8.04 8.04 8.04
2048 1000 3.41 3.41 3.41
4096 1000 11.45 11.45 11.45
8192 1000 12.26 12.26 12.26
16384 1000 33.74 33.75 33.74
32768 1000 56.00 56.01 56.00
65536 640 475.47 475.67 475.57
131072 320 489.26 489.77 489.50
262144 160 1212.05 1214.04 1213.11
524288 80 1454.69 1458.81 1456.62
1048576 40 4874.15 4886.08 4879.00
2097152 20 10238.35 10313.65 10278.05
4194304 10 20536.59 20731.81 20633.60
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 1.82 1.82 1.82
2 1000 1.80 1.80 1.80
4 1000 1.80 1.80 1.80
8 1000 3.92 3.92 3.92
16 1000 1.78 1.78 1.78
32 1000 1.87 1.87 1.87
64 1000 1.98 1.98 1.98
128 1000 16.93 16.94 16.94
256 1000 11.66 11.66 11.66
512 1000 23.60 23.60 23.60
1024 1000 10.19 10.19 10.19
2048 1000 20.29 20.29 20.29
4096 1000 28.98 28.98 28.98
8192 1000 61.24 61.24 61.24
16384 1000 139.19 139.19 139.19
32768 1000 280.99 281.16 281.03
65536 640 1145.55 1145.97 1145.71
131072 320 1282.07 1282.71 1282.37
262144 160 3522.22 3526.59 3524.30
524288 80 5143.78 5145.59 5144.62
1048576 40 13113.52 13177.22 13153.77
2097152 20 25923.35 26176.50 26035.83
4194304 10 49506.69 50080.99 49769.44
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 9.88 9.88 9.88
2 1000 11.45 11.46 11.46
4 1000 10.10 10.10 10.10
8 1000 11.16 11.16 11.16
16 1000 11.16 11.16 11.16
32 1000 11.32 11.32 11.32
64 1000 16.93 16.93 16.93
128 1000 50.11 50.12 50.11
256 1000 61.58 61.59 61.58
512 1000 53.43 53.45 53.44
1024 1000 73.33 73.34 73.34
2048 1000 84.47 84.49 84.48
4096 1000 128.91 128.93 128.92
8192 1000 197.26 197.30 197.28
16384 1000 318.48 318.82 318.73
32768 1000 539.24 539.33 539.29
65536 640 1688.59 1689.59 1689.10
131072 320 4277.46 4284.11 4280.74
262144 160 7455.44 7473.35 7463.41
524288 80 11573.17 11617.91 11593.67
1048576 40 28774.00 28989.37 28894.78
2097152 20 55436.60 56211.20 55935.09
4194304 10 108286.81 111161.40 109837.59
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 148.85 149.10 148.94
2 1000 158.35 158.61 158.48
4 1000 148.64 148.66 148.65
8 1000 142.99 143.24 143.10
16 1000 151.17 151.51 151.30
32 1000 167.16 167.33 167.26
64 1000 147.50 147.75 147.63
128 1000 289.36 289.74 289.50
256 1000 279.05 279.52 279.28
512 1000 285.43 285.65 285.61
1024 1000 294.75 294.97 294.92
2048 1000 322.13 322.35 322.25
4096 1000 378.55 378.95 378.81
8192 1000 503.81 504.47 504.34
16384 1000 818.25 818.85 818.66
32768 1000 1884.44 1885.52 1885.27
65536 640 9786.58 9790.93 9789.05
131072 320 12729.65 12739.42 12735.35
262144 160 17705.42 17734.88 17723.13
524288 80 34501.55 34609.41 34571.49
1048576 40 79061.53 79586.15 79377.83
2097152 20 158125.90 160085.50 159449.79
4194304 10 302083.09 310477.28 307521.85
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 174.32 174.53 174.42
2 1000 182.69 182.78 182.71
4 1000 190.08 190.22 190.19
8 1000 214.38 214.67 214.54
16 1000 210.87 211.18 211.05
32 1000 398.78 399.04 398.84
64 1000 534.99 535.30 535.10
128 1000 575.87 576.37 576.13
256 1000 566.78 567.55 567.28
512 1000 594.42 594.81 594.61
1024 1000 624.24 624.83 624.49
2048 1000 682.41 682.94 682.65
4096 1000 848.25 848.96 848.54
8192 1000 1324.31 1325.03 1324.78
16384 1000 2611.77 2613.27 2612.68
32768 1000 5078.73 5081.01 5080.16
65536 380 26137.57 26150.08 26144.84
131072 320 29802.08 29822.72 29814.94
262144 160 35557.37 35605.05 35587.58
524288 80 67886.84 68027.96 67957.87
1048576 40 156810.67 157527.30 157152.53
2097152 20 312123.94 313950.50 313139.12
4194304 10 605073.29 613876.20 610642.34
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 637.43 638.08 637.63
2 1000 272.20 272.62 272.44
4 1000 663.77 664.37 663.98
8 1000 657.17 657.41 657.31
16 1000 462.14 462.33 462.28
32 1000 712.68 713.06 712.86
64 1000 757.66 757.97 757.79
128 1000 1402.12 1403.31 1402.70
256 1000 1411.71 1413.07 1412.33
512 1000 1454.33 1455.22 1454.81
1024 1000 1472.27 1473.58 1472.98
2048 1000 1616.75 1618.10 1617.33
4096 1000 2013.08 2014.13 2013.65
8192 1000 4200.52 4202.59 4201.41
16384 1000 7956.55 7959.42 7958.47
32768 679 14332.27 14343.00 14338.27
65536 178 56204.19 56295.11 56260.35
131072 155 65741.06 65856.45 65805.18
262144 106 94087.98 94346.68 94240.83
524288 50 198753.06 199976.96 199418.46
1048576 23 439125.48 444458.99 442068.08
2097152 12 866350.25 886471.49 878638.98
4194304 6 1655120.65 1727599.66 1701203.48
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.07 0.06
1 1000 787.46 787.88 787.67
2 1000 783.97 784.47 784.19
4 1000 801.42 802.02 801.65
8 1000 522.94 523.14 523.01
16 1000 845.54 846.13 845.83
32 1000 837.98 838.42 838.20
64 1000 820.18 820.64 820.39
128 1000 838.08 838.65 838.34
256 1000 3185.71 3188.48 3187.13
512 1000 3446.03 3448.18 3447.10
1024 1000 3682.98 3685.87 3684.48
2048 1000 4298.40 4300.66 4299.52
4096 1000 6901.02 6903.28 6902.24
8192 1000 9588.30 9589.57 9588.99
16384 558 17622.70 17631.95 17628.76
32768 336 30124.48 30151.10 30136.36
65536 87 119477.29 119947.66 119742.31
131072 74 137872.01 138457.76 138204.84
262144 54 187022.94 187895.39 187476.65
524288 24 408775.42 412484.59 410628.78
1048576 12 882607.42 892341.42 889007.69
2097152 6 1736131.51 1783101.00 1766503.89
4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.07 0.06
1 1000 419.35 419.84 419.58
2 1000 423.08 423.66 423.34
4 1000 760.43 760.88 760.62
8 1000 450.87 451.50 451.14
16 1000 893.68 894.25 893.93
32 1000 960.22 960.68 960.49
64 1000 863.98 864.53 864.28
128 1000 894.16 894.68 894.46
256 1000 1004.55 1005.02 1004.84
512 1000 5026.61 5029.65 5027.99
1024 1000 5537.81 5541.71 5539.75
2048 1000 6890.84 6893.73 6892.24
4096 891 11315.84 11321.90 11319.71
8192 624 15669.08 15679.03 15674.94
16384 332 29037.85 29058.79 29048.76
32768 19 504640.64 532192.42 518457.36
65536 19 172859.37 176351.42 174790.58
131072 19 204123.95 207890.21 206144.81
262144 19 277018.57 281563.63 279588.47
524288 16 591550.44 604590.18 598497.98
1048576 8 1298363.00 1339189.74 1324782.82
2097152 4 2512975.99 2669534.98 2620468.64
4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 0.90 0.90 0.90
2 1000 0.91 0.91 0.91
4 1000 0.87 0.87 0.87
8 1000 0.90 0.90 0.90
16 1000 0.88 0.88 0.88
32 1000 0.87 0.87 0.87
64 1000 0.92 0.92 0.92
128 1000 2.95 2.95 2.95
256 1000 1.04 1.04 1.04
512 1000 1.17 1.17 1.17
1024 1000 1.05 1.05 1.05
2048 1000 4.70 4.70 4.70
4096 1000 1.84 1.84 1.84
8192 1000 16.95 16.95 16.95
16384 1000 11.73 11.73 11.73
32768 1000 32.95 32.96 32.95
65536 640 367.10 367.23 367.16
131072 320 471.93 472.29 472.11
262144 160 500.01 502.18 501.09
524288 80 516.99 518.00 517.49
1048576 40 1360.17 1362.45 1361.31
2097152 20 2560.25 2564.19 2562.22
4194304 10 5474.90 5482.01 5478.45
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.08 0.08
1 1000 2.70 2.70 2.70
2 1000 4.74 4.74 4.74
4 1000 1.56 1.56 1.56
8 1000 1.59 1.59 1.59
16 1000 3.81 3.81 3.81
32 1000 1.60 1.60 1.60
64 1000 1.64 1.64 1.64
128 1000 1.71 1.71 1.71
256 1000 6.58 6.58 6.58
512 1000 2.18 2.18 2.18
1024 1000 8.37 8.37 8.37
2048 1000 3.50 3.50 3.50
4096 1000 11.36 11.36 11.36
8192 1000 22.33 22.33 22.33
16384 1000 31.31 31.31 31.31
32768 1000 482.83 483.17 483.01
65536 640 833.35 833.54 833.44
131072 320 480.14 480.41 480.26
262144 160 1216.88 1219.90 1218.13
524288 80 1435.03 1437.04 1436.04
1048576 40 4834.45 4847.82 4841.46
2097152 20 10459.26 10536.91 10496.13
4194304 10 21227.41 21401.69 21311.80
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
1 1000 6.23 6.23 6.23
2 1000 2.44 2.44 2.44
4 1000 2.35 2.35 2.35
8 1000 2.36 2.36 2.36
16 1000 2.43 2.43 2.43
32 1000 2.48 2.48 2.48
64 1000 2.56 2.56 2.56
128 1000 2.71 2.71 2.71
256 1000 3.09 3.09 3.09
512 1000 3.55 3.55 3.55
1024 1000 22.23 22.23 22.23
2048 1000 19.46 19.46 19.46
4096 1000 32.51 32.51 32.51
8192 1000 48.85 48.85 48.85
16384 1000 147.94 148.25 148.08
32768 1000 984.14 984.95 984.52
65536 640 1152.02 1152.17 1152.10
131072 320 1345.27 1347.51 1346.35
262144 160 3556.29 3559.45 3558.13
524288 80 5221.71 5228.22 5224.75
1048576 40 13138.75 13209.35 13183.29
2097152 20 26133.50 26325.61 26208.51
4194304 10 51323.41 51611.40 51449.67
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.12 0.12 0.12
1 1000 9.89 9.89 9.89
2 1000 12.27 12.28 12.27
4 1000 11.07 11.07 11.07
8 1000 12.20 12.21 12.21
16 1000 12.43 12.43 12.43
32 1000 10.57 10.57 10.57
64 1000 18.33 18.33 18.33
128 1000 17.94 17.94 17.94
256 1000 13.44 13.44 13.44
512 1000 27.08 27.09 27.08
1024 1000 69.86 69.87 69.87
2048 1000 87.39 87.40 87.39
4096 1000 128.76 128.79 128.78
8192 1000 365.67 366.07 365.83
16384 1000 310.99 311.06 311.02
32768 1000 540.80 540.90 540.85
65536 640 1698.73 1699.75 1699.28
131072 320 4194.92 4199.08 4196.57
262144 160 7495.65 7513.58 7505.51
524288 80 11587.66 11622.95 11605.52
1048576 40 28696.20 28886.15 28811.27
2097152 20 56052.85 56773.40 56431.89
4194304 10 108632.71 111559.92 110306.70
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.19 0.18
1 1000 127.50 127.59 127.55
2 1000 122.06 122.07 122.06
4 1000 128.08 128.33 128.18
8 1000 118.16 118.33 118.24
16 1000 102.18 102.34 102.31
32 1000 141.35 141.61 141.47
64 1000 127.36 127.41 127.40
128 1000 134.81 135.07 134.94
256 1000 135.80 136.15 135.91
512 1000 163.24 163.26 163.25
1024 1000 291.94 292.25 292.14
2048 1000 315.34 315.77 315.58
4096 1000 377.40 377.84 377.73
8192 1000 923.95 924.49 924.20
16384 1000 808.91 809.58 809.34
32768 1000 1874.73 1876.08 1875.33
65536 640 9771.38 9776.04 9774.07
131072 320 12692.35 12702.51 12698.57
262144 160 18047.45 18078.19 18064.88
524288 80 34647.00 34773.79 34729.72
1048576 40 79243.20 79733.07 79559.36
2097152 20 155895.85 157857.20 157239.76
4194304 10 304299.40 311634.80 309279.36
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.29 0.33 0.30
1 1000 184.15 184.52 184.33
2 1000 182.54 182.72 182.68
4 1000 187.42 187.60 187.50
8 1000 182.76 182.90 182.79
16 1000 437.84 438.56 438.19
32 1000 437.05 437.46 437.26
64 1000 443.03 443.74 443.42
128 1000 454.87 455.30 455.06
256 1000 572.12 572.71 572.47
512 1000 592.56 593.14 592.82
1024 1000 628.01 628.80 628.33
2048 1000 693.31 694.07 693.75
4096 1000 857.51 858.39 857.98
8192 1000 1270.15 1270.98 1270.61
16384 1000 2640.39 2641.15 2640.95
32768 1000 4656.74 4658.20 4657.63
65536 377 26505.64 26515.76 26511.04
131072 320 30311.04 30329.31 30323.31
262144 160 36082.18 36121.33 36106.35
524288 80 68054.11 68190.94 68127.48
1048576 40 156358.55 157015.00 156700.95
2097152 20 311019.80 313681.50 312265.36
4194304 10 602639.91 611378.29 607575.84
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.49 0.52 0.50
1 1000 477.71 478.06 477.86
2 1000 481.52 481.98 481.78
4 1000 319.51 319.91 319.71
8 1000 482.81 483.23 482.91
16 1000 489.35 489.65 489.55
32 1000 383.64 384.37 384.04
64 1000 510.21 510.61 510.44
128 1000 534.54 534.88 534.74
256 1000 737.79 738.08 737.94
512 1000 1311.06 1311.48 1311.25
1024 1000 1521.34 1521.97 1521.60
2048 1000 2199.03 2199.79 2199.39
4096 1000 2823.38 2824.41 2823.85
8192 1000 4789.98 4791.83 4790.89
16384 1000 9208.94 9213.06 9210.76
32768 518 19291.19 19308.54 19297.31
65536 257 38889.55 38968.11 38919.71
131072 130 76699.91 76983.04 76814.47
262144 65 154232.68 155377.64 154817.21
524288 31 318506.71 322833.91 320506.33
1048576 23 438543.17 443889.17 441526.39
2097152 12 866909.50 886624.59 878793.19
4194304 6 1651707.69 1725107.51 1698454.55
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.89 2.14 0.91
1 1000 536.49 536.96 536.76
2 1000 534.45 534.87 534.65
4 1000 418.38 418.88 418.62
8 1000 542.68 543.12 542.88
16 1000 551.52 551.82 551.73
32 1000 631.08 631.72 631.40
64 1000 1118.36 1119.24 1118.80
128 1000 826.45 826.77 826.63
256 1000 1486.39 1487.13 1486.68
512 1000 1699.99 1700.89 1700.33
1024 1000 2546.51 2547.29 2546.83
2048 1000 3837.47 3838.50 3837.97
4096 1000 5998.32 6000.73 5999.49
8192 913 10984.76 10989.72 10986.94
16384 483 20638.49 20656.94 20645.15
32768 246 40422.46 40497.71 40456.40
65536 123 80421.28 80724.80 80565.08
131072 62 161682.00 162816.53 162269.33
262144 30 331900.43 336502.00 334314.72
524288 14 668166.77 687060.07 677950.12
1048576 12 881004.10 892233.41 888299.34
2097152 6 1737635.02 1781526.84 1765801.95
4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 1.10 1.34 1.24
1 1000 545.56 546.01 545.76
2 1000 548.18 548.57 548.37
4 1000 457.02 457.61 457.32
8 1000 559.54 559.93 559.75
16 1000 568.80 569.32 569.05
32 1000 821.30 822.01 821.71
64 1000 1624.76 1626.29 1625.63
128 1000 973.54 974.04 973.79
256 1000 1687.43 1688.24 1687.73
512 1000 2461.79 2462.85 2462.32
1024 1000 3711.61 3712.79 3712.12
2048 1000 5364.87 5366.64 5365.79
4096 1000 8407.63 8412.34 8410.07
8192 576 17288.92 17307.27 17298.27
16384 308 32368.71 32425.59 32393.64
32768 157 63475.18 63697.78 63578.25
65536 80 123989.34 124820.31 124400.12
131072 39 256332.36 259970.10 258260.04
262144 19 515755.00 530583.73 523453.14
524288 9 1012452.34 1073454.01 1043948.34
1048576 8 1297860.12 1337790.37 1324639.77
2097152 4 2519704.28 2669137.48 2621454.33
4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 0.45 0.45 0.45
2 1000 0.44 0.44 0.44
4 1000 0.46 0.46 0.46
8 1000 0.44 0.44 0.44
16 1000 0.46 0.46 0.46
32 1000 0.54 0.54 0.54
64 1000 0.57 0.57 0.57
128 1000 0.57 0.57 0.57
256 1000 1.64 1.64 1.64
512 1000 0.68 0.68 0.68
1024 1000 0.80 0.80 0.80
2048 1000 1.06 1.06 1.06
4096 1000 1.49 1.49 1.49
8192 1000 2.33 2.33 2.33
16384 1000 10.76 10.76 10.76
32768 1000 18.24 18.24 18.24
65536 640 134.36 134.49 134.43
131072 320 128.87 129.13 129.00
262144 160 189.79 191.31 190.55
524288 80 228.79 229.81 229.30
1048576 40 608.35 610.50 609.42
2097152 20 1320.45 1321.54 1320.99
4194304 10 2812.79 2819.99 2816.39
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec]
t_max[usec] t_avg[usec] 0 1000 0.06 0.06 0.06 1 1000 0.47 0.47 0.47 2 1000 0.47 0.47 0.47 4 1000 0.46 0.46 0.46 8 1000 0.46 0.46 0.46 16 1000 0.48 0.48 0.48 32 1000 0.52 0.53 0.52 64 1000 0.53 0.53 0.53 128 1000 1.87 1.87 1.87 256 1000 0.61 0.61 0.61 512 1000 0.68 0.68 0.68 1024 1000 1.71 1.71 1.71 2048 1000 1.06 1.06 1.06 4096 1000 1.68 1.68 1.68 8192 1000 8.48 8.48 8.48 16384 1000 11.32 11.33 11.32 32768 1000 12.88 12.89 12.89 65536 640 104.10 104.53 104.32 131072 320 178.37 179.01 178.64 262144 160 344.58 346.15 345.53 524288 80 386.57 393.76 390.62 1048576 40 1215.62 1245.37 1233.47 2097152 20 2514.76 2631.00 2585.63 4194304 10 4990.10 5629.80 5387.97 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 8 # ( 376 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 0.50 0.50 0.50 2 1000 0.49 0.49 0.49 4 1000 0.49 0.49 0.49 8 1000 0.47 0.47 0.47 16 1000 0.50 0.51 0.51 32 1000 0.51 0.51 0.51 64 1000 0.52 0.53 0.53 128 1000 0.62 0.62 0.62 256 1000 0.65 0.66 0.65 512 1000 0.72 0.72 0.72 1024 1000 0.83 0.83 0.83 2048 1000 1.11 1.11 1.11 4096 1000 1.70 1.70 1.70 8192 1000 2.82 2.84 2.83 16384 1000 11.19 11.21 11.20 32768 1000 19.29 19.33 19.31 65536 640 301.70 302.97 302.45 131072 320 333.69 335.19 334.48 262144 160 467.01 476.17 470.87 524288 80 832.31 860.80 848.15 1048576 40 1805.78 1981.85 1911.61 2097152 20 3242.55 3844.11 3593.05 4194304 10 8821.01 10949.49 10043.17 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 16 # ( 368 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.06 0.06 1 1000 1.03 1.04 1.03 2 1000 1.04 1.04 1.04 4 1000 1.04 1.04 1.04 8 1000 1.05 1.06 1.06 16 1000 1.04 1.05 1.05 
32 1000 1.20 1.20 1.20 64 1000 1.23 1.23 1.23 128 1000 1.45 1.47 1.46 256 1000 1.51 1.52 1.51 512 1000 1.75 1.76 1.75 1024 1000 2.15 2.16 2.16 2048 1000 2.87 2.89 2.88 4096 1000 12.23 12.26 12.25 8192 1000 19.27 19.32 19.30 16384 1000 40.12 40.22 40.17 32768 1000 72.21 72.97 72.50 65536 640 121.26 122.39 122.02 131072 320 368.58 371.99 370.95 262144 160 622.34 633.64 629.41 524288 80 1703.51 1773.05 1754.31 1048576 40 4475.38 4774.73 4634.87 2097152 20 7856.70 9176.40 8578.50 4194304 10 30432.32 33675.31 32068.76 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 32 # ( 352 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.06 0.06 1 1000 126.10 126.69 126.48 2 1000 128.30 128.97 128.65 4 1000 129.17 129.90 129.53 8 1000 127.49 128.18 127.84 16 1000 127.46 128.19 127.83 32 1000 128.31 129.06 128.70 64 1000 128.22 128.89 128.62 128 1000 129.94 130.52 130.28 256 1000 129.24 129.94 129.60 512 1000 134.95 135.70 135.33 1024 1000 145.55 146.27 145.91 2048 1000 50.35 51.53 50.74 4096 1000 48.96 50.15 49.28 8192 1000 50.81 52.45 51.25 16384 1000 60.10 62.15 60.94 32768 1000 99.44 101.01 99.97 65536 640 422.61 429.06 426.22 131072 320 565.71 577.43 571.62 262144 160 882.31 925.37 907.55 524288 80 2946.36 3036.40 2995.81 1048576 40 6077.45 6570.33 6330.63 2097152 20 14548.35 16501.70 15624.73 4194304 10 52183.20 58202.98 55029.23 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 64 # ( 320 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 146.69 147.55 147.11 2 1000 147.56 148.75 148.11 4 1000 146.94 148.00 147.50 8 1000 146.69 147.91 147.46 16 1000 147.27 148.15 147.78 32 1000 149.38 150.45 149.99 
64 1000 133.40 135.24 134.32 128 1000 137.34 139.07 138.12 256 1000 139.21 141.23 140.17 512 1000 143.96 145.79 144.82 1024 1000 278.70 280.83 279.69 2048 1000 74.36 79.10 77.05 4096 1000 77.95 82.22 79.91 8192 1000 103.81 109.76 106.68 16384 1000 212.75 222.01 217.58 32768 1000 344.77 353.29 350.13 65536 640 824.65 832.82 830.33 131072 320 1436.06 1468.58 1455.98 262144 160 2203.33 2494.12 2401.90 524288 80 4747.97 5228.42 5029.87 1048576 40 10530.32 11179.55 10971.16 2097152 20 14661.15 26736.20 22162.37 4194304 10 61987.02 74918.41 69752.32 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 128 # ( 256 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 173.42 175.03 174.42 2 1000 171.46 172.64 172.13 4 1000 171.53 172.80 172.12 8 1000 173.31 175.05 174.29 16 1000 176.25 177.78 177.09 32 1000 175.91 177.38 176.63 64 1000 142.41 153.25 148.61 128 1000 142.98 154.26 149.59 256 1000 149.25 160.93 156.04 512 1000 164.31 176.47 171.39 1024 1000 614.15 619.19 616.18 2048 1000 147.93 160.84 155.38 4096 1000 181.62 198.22 190.58 8192 1000 288.74 314.35 303.68 16384 1000 878.73 953.00 919.58 32768 1000 1031.98 1051.92 1048.80 65536 640 2188.21 2328.94 2316.04 131072 320 3028.67 3981.53 3873.66 262144 160 2839.13 6374.85 6073.69 524288 80 10034.07 16983.22 14639.68 1048576 40 22011.90 23220.32 22884.12 2097152 20 14401.45 51512.50 44261.28 4194304 10 28953.48 81778.48 69812.39 #---------------------------------------------------------------- # Benchmarking Gather # #processes = 256 # ( 128 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 191.82 193.46 192.65 2 1000 189.62 191.31 190.63 4 1000 191.52 192.97 192.35 8 1000 190.28 191.91 
191.19 16 1000 193.89 195.61 194.85 32 1000 166.06 200.08 185.53 64 1000 176.22 211.03 196.17 128 1000 178.59 214.02 198.85 256 1000 192.54 230.02 213.93 512 1000 215.51 257.97 240.06 1024 1000 442.75 561.18 527.56 2048 1000 420.81 524.06 487.24 4096 1000 453.68 556.11 518.82 8192 1000 499.91 621.07 579.16 16384 1000 1164.63 1417.91 1315.36 32768 942 6463.85 8952.42 7770.67 65536 640 5228.73 6066.89 5965.25 131072 320 4806.87 9242.14 8930.11 262144 160 2989.95 16106.09 12976.82 524288 80 21148.54 33669.91 29213.79 1048576 40 44529.05 50100.15 48768.23 2097152 20 13886.95 115128.80 105137.47 4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Gather # #processes = 384 #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.05 0.07 0.06 1 1000 182.80 184.60 183.78 2 1000 184.80 186.61 185.75 4 1000 183.87 185.89 184.98 8 1000 185.57 187.44 186.58 16 1000 188.18 190.04 189.13 32 1000 181.27 244.93 217.71 64 1000 196.97 261.78 234.49 128 1000 205.65 272.26 243.95 256 1000 215.06 287.09 256.10 512 1000 261.94 355.86 314.86 1024 1000 549.93 897.82 790.81 2048 1000 501.37 807.80 702.96 4096 1000 640.80 1004.91 885.10 8192 1000 677.98 972.31 859.69 16384 1000 1220.99 1718.80 1533.11 32768 633 6471.97 15191.88 11012.76 65536 633 4145.48 8817.45 7438.85 131072 320 66.75 12827.89 10225.50 262144 160 6119.77 25951.80 22408.86 524288 80 33485.80 51713.26 47580.25 1048576 40 69793.47 78507.02 77569.18 2097152 20 15131.35 180560.40 168919.49 4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 2 # ( 382 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions 
t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.15 0.17 0.16 1 1000 0.47 0.48 0.48 2 1000 0.47 0.47 0.47 4 1000 0.48 0.48 0.48 8 1000 0.48 0.48 0.48 16 1000 0.51 0.51 0.51 32 1000 0.53 0.53 0.53 64 1000 0.55 0.55 0.55 128 1000 0.58 0.58 0.58 256 1000 0.65 0.65 0.65 512 1000 0.72 0.72 0.72 1024 1000 0.83 0.83 0.83 2048 1000 1.06 1.06 1.06 4096 1000 1.51 1.51 1.51 8192 1000 4.78 4.78 4.78 16384 1000 3.97 3.97 3.97 32768 1000 11.02 11.02 11.02 65536 640 125.52 125.52 125.52 131072 320 128.35 128.61 128.48 262144 160 164.33 164.82 164.58 524288 80 230.03 230.06 230.04 1048576 40 614.98 616.93 615.95 2097152 20 1299.31 1303.35 1301.33 4194304 10 2830.70 2835.20 2832.95 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 4 # ( 380 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.12 0.13 0.12 1 1000 0.50 0.50 0.50 2 1000 0.52 0.52 0.52 4 1000 0.50 0.50 0.50 8 1000 1.55 1.55 1.55 16 1000 0.51 0.52 0.52 32 1000 0.57 0.57 0.57 64 1000 0.56 0.56 0.56 128 1000 0.62 0.62 0.62 256 1000 0.66 0.66 0.66 512 1000 0.73 0.73 0.73 1024 1000 0.84 0.84 0.84 2048 1000 1.09 1.10 1.10 4096 1000 3.63 3.63 3.63 8192 1000 7.28 7.29 7.29 16384 1000 10.89 10.90 10.90 32768 1000 13.10 13.11 13.11 65536 640 104.46 104.88 104.68 131072 320 136.59 137.17 136.92 262144 160 341.78 343.85 343.11 524288 80 388.52 395.24 392.33 1048576 40 1249.00 1292.22 1273.83 2097152 20 2381.75 2520.55 2465.76 4194304 10 5220.58 5825.90 5603.87 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 8 # ( 376 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.10 0.11 0.10 1 1000 0.61 0.61 0.61 2 1000 0.62 0.62 0.62 4 1000 0.61 0.62 0.62 8 1000 0.60 0.60 0.60 16 1000 0.60 0.60 
0.60 32 1000 1.91 1.92 1.92 64 1000 0.67 0.67 0.67 128 1000 0.70 0.71 0.70 256 1000 2.19 2.19 2.19 512 1000 2.35 2.36 2.36 1024 1000 0.95 0.95 0.95 2048 1000 1.20 1.21 1.21 4096 1000 5.21 5.22 5.22 8192 1000 2.92 2.94 2.93 16384 1000 10.85 10.87 10.86 32768 1000 20.00 20.04 20.02 65536 640 304.86 305.96 305.52 131072 320 331.22 332.98 332.13 262144 160 465.99 473.44 470.22 524288 80 800.33 830.49 816.38 1048576 40 1716.70 1860.67 1794.42 2097152 20 3956.20 4532.80 4278.56 4194304 10 6242.18 8704.50 7661.07 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 16 # ( 368 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.09 0.09 0.09 1 1000 1.04 1.04 1.04 2 1000 1.04 1.04 1.04 4 1000 1.04 1.05 1.04 8 1000 1.05 1.06 1.05 16 1000 1.05 1.06 1.05 32 1000 1.21 1.22 1.22 64 1000 1.21 1.22 1.21 128 1000 1.45 1.46 1.46 256 1000 3.81 3.82 3.82 512 1000 1.74 1.76 1.75 1024 1000 2.13 2.15 2.14 2048 1000 2.85 2.87 2.86 4096 1000 14.63 14.66 14.65 8192 1000 18.36 18.41 18.38 16384 1000 33.05 33.17 33.11 32768 1000 67.66 67.88 67.78 65536 640 120.16 121.33 120.94 131072 320 365.44 368.60 367.40 262144 160 575.50 590.42 585.46 524288 80 1736.26 1775.97 1759.70 1048576 40 4747.53 5044.52 4907.99 2097152 20 7794.40 8994.85 8442.20 4194304 10 31037.78 34202.29 32619.26 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 32 # ( 352 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.08 0.09 0.08 1 1000 134.49 135.23 134.68 2 1000 130.17 130.82 130.38 4 1000 128.51 129.93 129.07 8 1000 132.02 132.31 132.20 16 1000 130.90 131.42 131.08 32 1000 131.77 132.43 132.05 64 1000 127.68 128.69 127.94 128 1000 130.43 131.86 130.89 256 1000 129.74 
130.54 129.98 512 1000 132.89 133.27 133.09 1024 1000 131.57 133.30 132.19 2048 1000 134.64 135.38 134.87 4096 1000 139.06 139.71 139.27 8192 1000 144.13 145.05 144.41 16384 1000 173.99 174.60 174.31 32768 1000 211.99 212.95 212.38 65536 640 426.45 433.87 431.15 131072 320 596.95 612.54 606.15 262144 160 846.59 883.57 866.13 524288 80 2919.59 3064.22 3005.96 1048576 40 8501.32 9013.70 8762.65 2097152 20 15683.81 17557.70 16767.04 4194304 10 42024.99 50077.80 45143.64 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 64 # ( 320 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.08 0.09 0.08 1 1000 118.34 120.63 119.10 2 1000 123.37 125.35 124.20 4 1000 113.16 115.25 113.85 8 1000 114.36 116.92 115.47 16 1000 119.40 122.18 120.93 32 1000 110.92 113.25 111.60 64 1000 119.20 120.37 119.74 128 1000 120.40 122.47 121.09 256 1000 113.50 115.34 114.20 512 1000 123.08 125.21 123.88 1024 1000 125.81 128.09 126.57 2048 1000 134.96 137.09 135.93 4096 1000 142.34 146.28 144.67 8192 1000 167.28 169.87 168.72 16384 1000 277.35 286.51 281.97 32768 1000 486.94 499.19 495.72 65536 640 1188.42 1208.14 1201.65 131072 320 1579.08 1670.07 1642.53 262144 160 2324.39 2652.00 2555.98 524288 80 4995.85 5600.19 5361.65 1048576 40 10759.30 11466.85 11247.10 2097152 20 14727.64 26626.90 22131.85 4194304 10 49094.39 64323.90 58390.76 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 128 # ( 256 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.08 0.09 0.08 1 1000 135.83 138.91 137.30 2 1000 135.18 138.99 136.87 4 1000 134.44 138.20 136.24 8 1000 138.22 143.96 141.45 16 1000 135.14 139.61 137.09 32 1000 147.94 150.38 149.14 64 1000 135.13 
139.99 137.77 128 1000 141.22 143.13 142.30 256 1000 143.08 148.36 145.40 512 1000 151.80 156.97 154.05 1024 1000 150.08 156.61 153.59 2048 1000 163.38 166.67 165.02 4096 1000 207.67 211.60 209.56 8192 1000 382.90 388.35 386.42 16384 1000 834.25 851.66 846.23 32768 1000 1700.43 1710.45 1706.95 65536 640 4380.46 4684.27 4643.65 131072 320 6337.76 6597.41 6557.56 262144 160 10619.18 11647.12 11520.00 524288 80 10593.97 17508.52 15715.45 1048576 40 21021.02 22755.03 22328.14 2097152 20 13554.70 52119.80 44594.44 4194304 10 35392.31 81941.91 71464.66 #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 256 # ( 128 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.08 0.09 0.08 1 1000 331.01 336.18 334.60 2 1000 308.08 314.89 312.25 4 1000 317.69 326.18 322.96 8 1000 306.63 317.35 314.26 16 1000 350.70 356.11 354.46 32 1000 290.80 298.12 294.92 64 1000 342.64 350.53 347.15 128 1000 318.37 323.17 322.12 256 1000 315.93 323.00 320.32 512 1000 352.71 360.69 359.57 1024 1000 355.51 363.02 359.72 2048 1000 380.55 392.66 387.94 4096 1000 447.83 454.77 453.19 8192 1000 718.50 728.95 727.34 16384 1000 1930.92 1946.86 1943.45 32768 1000 4383.30 4399.59 4395.27 65536 640 9344.16 9983.37 9943.08 131072 320 12684.89 13242.75 13193.23 262144 160 16666.38 22629.03 22186.09 524288 80 22160.92 34247.70 30621.91 1048576 40 45068.45 51395.72 50459.17 2097152 20 14939.09 115440.55 105538.20 4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Gatherv # #processes = 384 #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.07 0.11 0.09 1 1000 1114.51 1406.59 1324.52 2 1000 1230.27 1522.77 1440.52 4 1000 1186.52 1497.64 
1410.87 8 1000 1232.67 1535.76 1453.04 16 1000 1237.39 1536.51 1450.99 32 1000 1273.37 1576.46 1492.46 64 1000 1136.89 1423.08 1338.57 128 1000 1129.91 1366.63 1296.76 256 1000 1170.00 1413.13 1336.09 512 1000 1166.13 1427.83 1346.40 1024 1000 1077.24 1309.70 1224.26 2048 1000 750.21 978.78 879.36 4096 1000 764.37 1061.52 926.62 8192 1000 997.67 1425.07 1224.23 16384 1000 1421.49 2129.66 1786.55 32768 968 8442.96 13540.03 12167.45 65536 618 7090.64 18477.08 14706.74 131072 320 67.58 27558.67 22572.51 262144 129 4666.89 76518.33 65627.58 524288 80 48586.00 60944.86 59770.27 1048576 40 83463.65 102748.72 100398.19 2097152 20 13095.86 202932.16 187084.79 4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 2 # ( 382 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 0.49 0.49 0.49 2 1000 0.48 0.48 0.48 4 1000 0.64 0.64 0.64 8 1000 0.63 0.63 0.63 16 1000 0.63 0.63 0.63 32 1000 0.62 0.62 0.62 64 1000 0.64 0.64 0.64 128 1000 0.66 0.66 0.66 256 1000 0.72 0.72 0.72 512 1000 0.78 0.78 0.78 1024 1000 0.94 0.94 0.94 2048 1000 2.78 2.78 2.78 4096 1000 1.72 1.72 1.72 8192 1000 2.86 2.86 2.86 16384 1000 15.25 15.25 15.25 32768 1000 19.26 19.26 19.26 65536 640 254.31 254.45 254.38 131072 320 294.90 295.16 295.03 262144 160 336.61 337.11 336.86 524288 80 310.50 311.49 310.99 1048576 40 819.60 822.60 821.10 2097152 20 1485.74 1489.79 1487.77 4194304 10 3371.81 3381.90 3376.85 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 4 # ( 380 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 0.73 0.73 0.73 
2 1000 0.74 0.74 0.74 4 1000 0.64 0.64 0.64 8 1000 0.64 0.64 0.64 16 1000 0.64 0.64 0.64 32 1000 0.65 0.65 0.65 64 1000 0.66 0.66 0.66 128 1000 0.68 0.68 0.68 256 1000 0.74 0.74 0.74 512 1000 0.82 0.82 0.82 1024 1000 0.94 0.94 0.94 2048 1000 3.26 3.26 3.26 4096 1000 1.86 1.86 1.86 8192 1000 3.03 3.04 3.04 16384 1000 15.43 15.43 15.43 32768 1000 20.59 20.60 20.59 65536 640 91.32 91.57 91.45 131072 320 123.63 123.88 123.73 262144 160 172.07 172.59 172.32 524288 80 525.72 529.47 527.81 1048576 40 1934.02 1988.77 1969.17 2097152 20 3918.54 4143.00 4054.95 4194304 10 7815.79 8548.09 8262.77 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 8 # ( 376 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.07 1 1000 2.19 2.19 2.19 2 1000 0.90 0.90 0.90 4 1000 0.64 0.64 0.64 8 1000 0.64 0.64 0.64 16 1000 0.66 0.66 0.66 32 1000 0.66 0.66 0.66 64 1000 0.64 0.64 0.64 128 1000 0.72 0.72 0.72 256 1000 0.78 0.78 0.78 512 1000 0.83 0.83 0.83 1024 1000 0.96 0.96 0.96 2048 1000 1.28 1.29 1.29 4096 1000 1.97 1.97 1.97 8192 1000 8.97 8.97 8.97 16384 1000 12.19 12.21 12.20 32768 1000 20.64 20.66 20.65 65536 640 89.49 89.60 89.52 131072 320 164.39 164.74 164.57 262144 160 450.64 465.60 460.18 524288 80 1430.16 1448.95 1440.16 1048576 40 3974.70 4214.80 4110.90 2097152 20 7342.49 7946.10 7652.84 4194304 10 12273.00 14353.42 13453.33 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 16 # ( 368 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 2.11 2.12 2.11 2 1000 2.14 2.14 2.14 4 1000 3.59 3.60 3.59 8 1000 1.35 1.35 1.35 16 1000 1.31 1.32 1.32 32 1000 1.33 1.33 1.33 64 1000 1.34 1.34 1.34 128 1000 3.80 3.81 
3.81 256 1000 1.64 1.65 1.65 512 1000 1.91 1.92 1.92 1024 1000 2.30 2.31 2.30 2048 1000 3.01 3.02 3.02 4096 1000 15.04 15.06 15.06 8192 1000 18.79 18.81 18.80 16384 1000 37.22 37.91 37.52 32768 1000 60.82 61.42 61.02 65536 640 86.56 87.28 86.92 131072 320 251.17 258.57 256.41 262144 160 549.70 565.69 561.21 524288 80 1198.48 1246.88 1230.14 1048576 40 3512.56 3822.26 3678.76 2097152 20 8595.05 9548.95 9100.59 4194304 10 7832.91 13864.71 11675.60 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 32 # ( 352 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 10.71 11.20 10.86 2 1000 10.63 11.02 10.72 4 1000 10.76 10.81 10.79 8 1000 8.56 8.72 8.63 16 1000 7.73 7.91 7.81 32 1000 8.71 8.92 8.82 64 1000 7.72 7.97 7.82 128 1000 7.98 8.19 8.10 256 1000 8.10 8.32 8.22 512 1000 8.89 9.06 8.97 1024 1000 10.98 11.22 11.12 2048 1000 13.64 13.91 13.79 4096 1000 17.45 17.81 17.67 8192 1000 25.67 25.85 25.74 16384 1000 43.25 43.97 43.70 32768 1000 85.35 86.81 85.82 65536 640 584.29 589.51 585.73 131072 320 832.55 845.91 836.91 262144 160 1211.31 1239.63 1228.45 524288 80 4072.15 4239.09 4158.41 1048576 40 11856.30 12544.49 12240.76 2097152 20 11158.41 13685.55 12634.02 4194304 10 36980.70 46748.59 41277.05 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 64 # ( 320 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 8.61 9.10 8.86 2 1000 8.91 9.37 9.12 4 1000 8.92 9.38 9.13 8 1000 8.84 9.71 9.32 16 1000 9.03 9.49 9.21 32 1000 8.90 9.31 9.07 64 1000 10.23 10.75 10.45 128 1000 8.94 9.55 9.18 256 1000 10.78 11.21 10.97 512 1000 15.79 16.38 16.09 1024 1000 54.89 56.51 56.12 2048 1000 86.14 88.56 
87.95 4096 1000 125.09 128.17 127.40 8192 1000 183.11 187.02 185.89 16384 1000 252.40 258.74 256.89 32768 1000 1090.53 1094.04 1092.54 65536 640 1375.41 1416.95 1403.86 131072 320 1773.11 1842.05 1819.48 262144 160 2824.75 2916.14 2879.17 524288 80 3684.91 4258.14 4087.43 1048576 40 8850.73 10021.02 9591.79 2097152 20 9522.40 19579.85 16224.51 4194304 10 29219.91 43017.89 37860.66 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 128 # ( 256 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.06 0.06 1 1000 11.20 12.30 11.80 2 1000 9.72 11.15 10.35 4 1000 11.47 12.76 12.10 8 1000 10.94 11.58 11.29 16 1000 11.35 12.01 11.65 32 1000 10.60 11.28 10.87 64 1000 11.64 12.62 12.19 128 1000 14.22 15.23 14.75 256 1000 17.34 18.40 17.87 512 1000 31.75 32.13 32.00 1024 1000 150.53 152.05 151.47 2048 1000 170.13 172.45 171.62 4096 1000 248.46 252.34 250.64 8192 1000 414.77 420.32 417.37 16384 1000 1042.41 1052.95 1049.08 32768 1000 1245.58 1253.41 1250.12 65536 640 2080.21 2106.84 2100.04 131072 320 3296.35 3422.68 3399.12 262144 160 6379.29 6512.48 6453.55 524288 80 9734.30 11514.66 11171.09 1048576 40 18647.43 22810.80 21795.33 2097152 20 11784.95 44761.35 38671.13 4194304 10 19902.09 90315.49 75482.39 #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 256 # ( 128 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.06 0.07 0.06 1 1000 11.22 12.73 11.90 2 1000 11.11 12.94 12.02 4 1000 10.93 12.46 11.57 8 1000 11.53 12.80 12.17 16 1000 12.02 13.24 12.57 32 1000 12.77 14.85 13.69 64 1000 11.88 13.91 13.23 128 1000 19.31 20.45 19.93 256 1000 36.80 37.31 37.06 512 1000 55.78 56.30 56.08 1024 1000 2897.40 2900.30 2898.97 2048 
1000 3848.01 3849.77 3848.85 4096 1000 1029.00 1074.84 1058.46 8192 1000 1004.40 1022.49 1014.98 16384 1000 1544.47 1574.95 1559.61 32768 1000 2073.39 2111.22 2100.03 65536 640 6771.05 6869.35 6858.52 131072 320 14736.01 15152.93 15114.32 262144 160 23851.96 24014.99 23942.50 524288 80 23471.14 27607.05 27095.14 1048576 40 42334.85 51661.78 50086.46 2097152 20 7545.34 98196.70 90117.88 4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Scatter # #processes = 384 #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.05 0.07 0.06 1 1000 10.50 12.22 11.46 2 1000 10.49 11.89 11.18 4 1000 11.15 12.16 11.69 8 1000 12.05 13.32 12.68 16 1000 12.68 13.79 13.08 32 1000 11.76 12.99 12.38 64 1000 18.16 19.89 18.99 128 1000 22.78 23.93 23.27 256 1000 51.88 52.50 52.27 512 1000 107.86 108.26 108.09 1024 1000 3431.83 3435.35 3432.98 2048 1000 5052.81 5056.50 5054.41 4096 1000 2007.68 2017.07 2012.28 8192 1000 1851.30 1941.49 1920.91 16384 1000 2105.32 2133.63 2120.03 32768 891 2894.83 2940.28 2909.31 65536 291 16044.64 16134.81 16112.13 131072 291 22151.66 22321.89 22293.31 262144 160 22565.88 22831.87 22708.79 524288 80 38383.66 46893.04 46157.92 1048576 40 80702.60 95324.45 94082.09 2097152 20 8899.75 148156.34 139618.04 4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h) #---------------------------------------------------------------- # Benchmarking Scatterv # #processes = 2 # ( 382 additional processes waiting in MPI_Barrier) #---------------------------------------------------------------- #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] 0 1000 0.15 0.17 0.16 1 1000 0.63 0.63 0.63 2 1000 0.64 0.64 0.64 4 1000 0.63 0.63 0.63 8 1000 0.63 0.63 0.63 16 1000 0.64 0.64 0.64 32 1000 0.88 0.88 0.88 64 1000 0.64 0.64 0.64 128 1000 
0.66 0.66 0.66
256 1000 0.72 0.72 0.72
512 1000 0.78 0.78 0.78
1024 1000 0.90 0.90 0.90
2048 1000 2.43 2.43 2.43
4096 1000 1.64 1.64 1.64
8192 1000 2.63 2.63 2.63
16384 1000 11.09 11.09 11.09
32768 1000 18.40 18.40 18.40
65536 640 246.43 246.55 246.49
131072 320 285.75 286.01 285.88
262144 160 326.15 326.66 326.41
524288 80 297.41 297.78 297.59
1048576 40 855.73 858.75 857.24
2097152 20 1597.50 1601.60 1599.55
4194304 10 3741.38 3760.19 3750.79

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.12 0.13 0.13
1 1000 0.84 0.85 0.85
2 1000 0.84 0.84 0.84
4 1000 0.83 0.83 0.83
8 1000 0.83 0.83 0.83
16 1000 0.83 0.83 0.83
32 1000 0.83 0.83 0.83
64 1000 0.84 0.84 0.84
128 1000 0.91 0.91 0.91
256 1000 0.97 0.98 0.97
512 1000 1.07 1.07 1.07
1024 1000 1.23 1.24 1.23
2048 1000 1.60 1.60 1.60
4096 1000 2.40 2.40 2.40
8192 1000 3.95 3.96 3.95
16384 1000 11.83 11.84 11.83
32768 1000 22.98 22.99 22.99
65536 640 91.81 91.93 91.90
131072 320 121.47 121.73 121.57
262144 160 205.02 205.81 205.37
524288 80 541.11 549.71 545.88
1048576 40 2159.95 2220.90 2196.87
2097152 20 4291.75 4457.09 4397.79
4194304 10 7140.09 7750.39 7520.15

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.11
1 1000 1.35 1.35 1.35
2 1000 1.34 1.34 1.34
4 1000 1.33 1.33 1.33
8 1000 1.34 1.34 1.34
16 1000 1.33 1.33 1.33
32 1000 1.34 1.34 1.34
64 1000 5.02 5.02 5.02
128 1000 1.45 1.45 1.45
256 1000 1.55 1.55 1.55
512 1000 4.06 4.06 4.06
1024 1000 1.99 1.99 1.99
2048 1000 2.64 2.64 2.64
4096 1000 8.78 8.79 8.79
8192 1000 11.35 11.35 11.35
16384 1000 21.97 21.98 21.98
32768 1000 46.73 46.76 46.74
65536 640 91.61 91.77 91.70
131072 320 164.53 165.15 164.75
262144 160 431.07 439.93 434.88
524288 80 1660.17 1728.79 1696.04
1048576 40 3254.72 3470.80 3383.70
2097152 20 5340.55 5739.90 5557.05
4194304 10 7533.10 9180.90 8521.25

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.09 0.09
1 1000 11.28 11.29 11.29
2 1000 12.16 12.17 12.16
4 1000 11.44 11.45 11.45
8 1000 16.70 16.71 16.71
16 1000 11.33 11.34 11.34
32 1000 11.21 11.22 11.22
64 1000 11.45 11.46 11.46
128 1000 12.77 12.78 12.78
256 1000 16.12 16.13 16.13
512 1000 16.93 16.94 16.93
1024 1000 18.48 18.49 18.48
2048 1000 20.02 20.03 20.03
4096 1000 23.43 23.46 23.44
8192 1000 42.23 42.27 42.25
16384 1000 89.60 89.87 89.69
32768 1000 86.07 86.50 86.31
65536 640 87.57 88.29 87.93
131072 320 263.38 268.01 265.99
262144 160 518.77 530.21 525.35
524288 80 2044.55 2107.79 2087.26
1048576 40 6202.10 6454.32 6341.54
2097152 20 7913.40 8824.30 8429.27
4194304 10 17989.71 20997.50 19708.25

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.08
1 1000 21.96 22.37 22.22
2 1000 27.07 27.64 27.45
4 1000 30.58 30.98 30.83
8 1000 31.12 31.39 31.22
16 1000 31.83 32.32 32.14
32 1000 27.40 27.75 27.57
64 1000 24.69 25.16 24.94
128 1000 24.85 25.01 24.92
256 1000 30.58 30.99 30.84
512 1000 32.94 33.46 33.24
1024 1000 34.24 34.82 34.58
2048 1000 41.85 42.48 42.24
4096 1000 49.61 50.10 49.73
8192 1000 77.35 78.40 78.15
16384 1000 99.67 100.34 100.12
32768 1000 194.27 194.91 194.64
65536 640 600.72 606.29 602.05
131072 320 881.68 899.93 889.41
262144 160 1170.80 1199.24 1181.02
524288 80 4539.37 4825.35 4686.56
1048576 40 9938.28 10752.85 10350.06
2097152 20 17735.10 19231.20 18477.06
4194304 10 35891.20 45322.90 40031.39

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.08
1 1000 34.67 35.49 35.32
2 1000 34.89 35.80 35.58
4 1000 35.07 35.96 35.74
8 1000 38.29 38.63 38.48
16 1000 37.41 38.34 38.11
32 1000 37.07 38.00 37.86
64 1000 37.33 38.28 38.05
128 1000 39.46 40.45 40.22
256 1000 43.19 43.54 43.38
512 1000 49.33 49.82 49.61
1024 1000 58.20 59.89 59.53
2048 1000 99.33 100.14 99.79
4096 1000 130.76 134.46 133.79
8192 1000 198.25 200.11 199.68
16384 1000 285.37 288.78 288.01
32768 1000 1031.98 1044.59 1039.02
65536 640 1398.03 1415.76 1406.46
131072 320 1801.01 1887.46 1859.81
262144 160 2722.11 2817.89 2763.55
524288 80 3852.70 4436.20 4250.73
1048576 40 8878.67 9696.60 9453.00
2097152 20 9673.64 20018.95 16729.91
4194304 10 24071.50 42391.40 36165.08

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.08
1 1000 85.21 85.82 85.59
2 1000 80.42 81.14 81.01
4 1000 78.72 79.58 79.30
8 1000 91.24 91.82 91.60
16 1000 91.57 92.09 91.91
32 1000 91.56 92.26 92.07
64 1000 93.30 93.66 93.55
128 1000 97.08 97.91 97.68
256 1000 123.49 124.16 124.02
512 1000 125.97 126.77 126.61
1024 1000 137.43 137.95 137.82
2048 1000 176.02 176.73 176.56
4096 1000 258.71 259.99 259.49
8192 1000 592.35 595.54 594.39
16384 1000 1586.66 1597.60 1593.80
32768 1000 1239.95 1253.10 1246.40
65536 640 1947.97 1984.07 1971.38
131072 320 3383.55 3508.08 3482.95
262144 160 6378.91 6466.21 6415.77
524288 80 8954.10 10691.72 10346.22
1048576 40 17938.22 21349.53 20446.70
2097152 20 8587.05 44048.70 37788.64
4194304 10 15900.40 85612.70 71510.24

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 523.53 524.66 524.39
2 1000 524.45 525.75 525.51
4 1000 576.92 577.94 577.69
8 1000 559.31 560.37 560.06
16 1000 516.45 517.21 517.07
32 1000 497.17 498.10 497.91
64 1000 541.63 542.78 542.56
128 1000 565.76 566.94 566.73
256 1000 698.09 699.13 698.90
512 1000 647.17 648.12 647.88
1024 1000 650.51 651.76 651.48
2048 1000 667.95 669.84 669.34
4096 1000 745.44 747.98 747.30
8192 1000 1457.75 1469.56 1466.82
16384 1000 3508.70 3540.86 3534.17
32768 1000 1990.17 2004.85 1997.97
65536 640 4426.26 4526.95 4517.02
131072 320 8636.34 9082.91 9043.59
262144 160 13166.16 13376.88 13308.77
524288 80 28294.85 32585.14 32014.41
1048576 40 68459.98 76605.93 75190.12
2097152 20 14414.30 120303.30 110249.94
4194304 out-of-mem.; needed X= 1.005 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.10 0.09
1 1000 2343.07 2345.29 2344.74
2 1000 2948.39 2950.58 2950.05
4 1000 2443.08 2456.84 2444.94
8 1000 2783.59 2788.41 2785.87
16 1000 2527.74 2530.02 2529.52
32 1000 2861.83 2864.89 2863.56
64 1000 2537.52 2540.19 2539.16
128 1000 2457.83 2460.28 2459.47
256 1000 2821.27 2829.14 2825.66
512 1000 2822.22 2826.59 2825.51
1024 1000 2712.69 2732.07 2715.75
2048 1000 2862.48 2866.43 2865.02
4096 1000 2667.79 2677.76 2674.13
8192 1000 2651.97 2666.42 2661.09
16384 1000 2384.09 2412.75 2402.73
32768 1000 3921.41 4096.69 3987.85
65536 504 16836.31 17729.72 17675.30
131072 320 25789.87 27360.52 27268.94
262144 160 35012.79 45325.57 44645.73
524288 80 61143.11 78550.07 77153.91
1048576 40 98291.48 133150.50 128756.84
2097152 20 9040.69 206391.35 191998.96
4194304 out-of-mem.; needed X= 1.505 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 3.34 3.34 3.34
2 1000 3.32 3.32 3.32
4 1000 9.77 9.77 9.77
8 1000 8.85 8.85 8.85
16 1000 3.34 3.34 3.34
32 1000 3.34 3.34 3.34
64 1000 3.37 3.37 3.37
128 1000 9.73 9.73 9.73
256 1000 8.73 8.73 8.73
512 1000 10.40 10.40 10.40
1024 1000 8.17 8.17 8.17
2048 1000 3.70 3.70 3.70
4096 1000 11.23 11.23 11.23
8192 1000 9.83 9.83 9.83
16384 1000 12.23 12.23 12.23
32768 1000 20.38 20.38 20.38
65536 640 287.42 287.55 287.48
131072 320 384.21 384.57 384.39
262144 160 500.82 501.32 501.07
524288 80 526.50 528.55 527.52
1048576 40 1370.35 1372.35 1371.35
2097152 20 2557.75 2561.35 2559.55
4194304 10 6391.41 6400.61 6396.01

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 16.97 16.97 16.97
2 1000 12.73 12.73 12.73
4 1000 12.70 12.70 12.70
8 1000 16.66 16.66 16.66
16 1000 10.36 10.36 10.36
32 1000 12.30 12.30 12.30
64 1000 11.66 11.66 11.66
128 1000 10.46 10.46 10.46
256 1000 12.22 12.22 12.22
512 1000 11.40 11.40 11.40
1024 1000 14.38 14.38 14.38
2048 1000 18.70 18.70 18.70
4096 1000 21.84 21.84 21.84
8192 1000 18.93 18.93 18.93
16384 1000 29.35 29.35 29.35
32768 1000 68.16 68.17 68.16
65536 640 1106.04 1106.64 1106.37
131072 320 1235.34 1236.16 1235.84
262144 160 1516.99 1518.68 1517.87
524288 80 1863.38 1865.56 1864.36
1048576 40 4728.15 4743.50 4735.84
2097152 20 9147.50 9189.90 9167.81
4194304 10 18003.89 18165.99 18078.77

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 19.72 19.72 19.72
2 1000 19.63 19.63 19.63
4 1000 19.82 19.82 19.82
8 1000 19.72 19.72 19.72
16 1000 19.58 19.58 19.58
32 1000 19.65 19.65 19.65
64 1000 19.95 19.95 19.95
128 1000 16.41 16.41 16.41
256 1000 21.02 21.02 21.02
512 1000 28.07 28.07 28.07
1024 1000 30.97 30.97 30.97
2048 1000 40.31 40.31 40.31
4096 1000 50.40 50.40 50.40
8192 1000 39.67 39.67 39.67
16384 1000 79.40 79.41 79.41
32768 1000 166.94 166.97 166.95
65536 640 2693.72 2694.87 2694.29
131072 320 3302.86 3304.86 3303.72
262144 160 4442.58 4445.36 4444.42
524288 80 6150.85 6156.80 6153.13
1048576 40 12180.23 12195.77 12190.96
2097152 20 23637.31 23713.35 23674.35
4194304 10 50652.60 51940.20 51385.12

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 56.02 56.02 56.02
2 1000 59.93 59.94 59.93
4 1000 60.10 60.11 60.10
8 1000 60.06 60.07 60.07
16 1000 59.86 59.86 59.86
32 1000 62.84 62.84 62.84
64 1000 59.95 59.95 59.95
128 1000 70.54 70.55 70.55
256 1000 70.10 70.11 70.10
512 1000 80.03 80.04 80.03
1024 1000 108.97 108.97 108.97
2048 1000 140.42 140.42 140.42
4096 1000 210.84 210.85 210.84
8192 1000 263.95 268.81 265.48
16384 1000 518.38 518.40 518.39
32768 1000 1006.70 1007.55 1006.93
65536 640 4095.73 4096.34 4096.05
131072 320 5847.90 5849.13 5848.49
262144 160 8684.54 8688.27 8686.41
524288 80 15665.08 15675.76 15670.54
1048576 40 33489.77 33514.17 33503.21
2097152 20 65189.19 65270.50 65248.24
4194304 10 129484.70 132306.91 130956.14

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 622.33 622.68 622.43
2 1000 756.62 757.13 756.81
4 1000 621.16 621.50 621.27
8 1000 826.17 826.70 826.41
16 1000 656.24 656.60 656.31
32 1000 451.81 452.00 451.85
64 1000 567.35 567.52 567.38
128 1000 797.30 798.32 797.69
256 1000 660.82 661.02 660.86
512 1000 698.68 699.10 698.85
1024 1000 1274.91 1275.54 1275.27
2048 1000 1336.77 1337.18 1337.01
4096 1000 1497.85 1498.44 1498.23
8192 1000 1287.60 1288.07 1287.80
16384 1000 2063.79 2064.32 2064.04
32768 1000 3507.38 3508.25 3507.76
65536 415 24255.68 24285.48 24270.41
131072 320 29958.98 29992.79 29975.81
262144 160 36966.62 37069.61 37013.00
524288 80 66601.37 66909.09 66745.76
1048576 40 124013.78 124869.13 124439.08
2097152 20 239791.44 243572.00 241706.36
4194304 10 433913.09 442829.59 438806.47

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 918.28 918.55 918.38
2 1000 921.60 921.79 921.69
4 1000 925.69 925.94 925.79
8 1000 945.37 945.70 945.46
16 1000 968.96 969.20 969.06
32 1000 362.82 363.20 362.97
64 1000 362.27 362.65 362.40
128 1000 1414.17 1414.31 1414.26
256 1000 2093.55 2093.71 2093.62
512 1000 622.26 622.87 622.56
1024 1000 4143.37 4145.41 4144.03
2048 1000 5287.22 5290.21 5288.24
4096 1000 6298.57 6302.86 6300.65
8192 1000 9690.06 9690.72 9690.47
16384 705 14245.47 14247.12 14246.49
32768 347 30542.74 30558.23 30548.93
65536 186 53145.25 53192.29 53164.66
131072 138 72167.33 72228.96 72196.90
262144 93 107259.39 107296.51 107273.18
524288 56 182413.23 182534.73 182474.44
1048576 29 351034.39 351368.83 351229.90
2097152 15 684714.14 685524.61 685171.89
4194304 8 1366053.88 1368453.62 1367528.85

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 1035.73 1036.37 1036.12
2 1000 1044.33 1044.73 1044.52
4 1000 1058.63 1058.97 1058.81
8 1000 1079.11 1079.54 1079.37
16 1000 1124.74 1124.99 1124.81
32 1000 476.66 477.18 476.92
64 1000 638.50 639.11 638.83
128 1000 2345.71 2346.05 2345.89
256 1000 2137.13 2138.02 2137.52
512 1000 5380.01 5383.92 5381.32
1024 1000 6842.58 6846.55 6844.32
2048 1000 9387.56 9393.15 9389.96
4096 596 16650.29 16663.67 16657.94
8192 195 51252.44 51257.39 51255.40
16384 90 110745.37 110764.58 110755.76
32768 90 99701.07 99806.63 99746.53
65536 74 135688.74 135834.05 135769.28
131072 49 205435.53 205660.06 205553.60
262144 32 320545.12 320811.34 320651.20
524288 13 747997.14 749446.06 748808.61
1048576 8 1344908.12 1353443.26 1349236.20
2097152 5 2334558.77 2346051.36 2340416.96
4194304 3 4435987.39 4451789.70 4444605.75

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.07 0.06
1 1000 1274.10 1274.65 1274.37
2 1000 1294.23 1294.90 1294.56
4 1000 1323.88 1324.35 1324.08
8 1000 1356.73 1357.20 1356.94
16 1000 1467.07 1467.52 1467.27
32 1000 996.99 997.99 997.61
64 1000 1870.68 1872.05 1871.50
128 1000 3095.79 3096.55 3096.05
256 1000 4697.42 4698.29 4697.73
512 810 10290.77 10300.58 10295.51
1024 666 14380.04 14391.34 14386.39
2048 404 24308.15 24338.73 24326.20
4096 204 48982.37 49066.98 49037.86
8192 76 132822.09 132933.34 132881.20
16384 34 295656.56 296496.97 296080.49
32768 27 370575.33 372406.15 371442.52
65536 26 398786.27 399657.69 399209.08
131072 19 521471.53 522685.36 522132.93
262144 12 782151.00 785018.25 783604.42
524288 7 1628015.14 1638334.72 1632950.80
1048576 4 3000877.50 3043890.00 3022018.13
2097152 2 5343951.58 5421655.54 5387245.62
4194304 out-of-mem.; needed X= 2.001 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.06 0.06
1 1000 597.71 598.32 597.96
2 1000 592.42 593.08 592.72
4 1000 605.02 605.61 605.32
8 1000 657.57 658.19 657.85
16 1000 903.16 903.85 903.53
32 1000 1549.88 1550.90 1550.42
64 1000 3089.51 3091.44 3090.59
128 607 15656.36 15673.00 15664.60
256 554 16597.86 16620.00 16608.20
512 507 19634.11 19661.97 19647.10
1024 380 26112.87 26155.29 26138.13
2048 248 39227.10 39300.21 39278.42
4096 128 78276.73 78535.94 78438.58
8192 64 157288.45 157318.06 157304.30
16384 31 323940.55 324136.64 324048.06
32768 14 749205.93 750365.36 749867.38
65536 12 865973.49 867344.42 866914.46
131072 7 1525395.12 1526962.55 1526216.58
262144 7 1270051.99 1272704.84 1271447.25
524288 5 2408539.82 2414948.03 2412222.16
1048576 3 4279239.02 4301399.71 4291310.43
2097152 out-of-mem.; needed X= 1.501 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
4194304 out-of-mem.; needed X= 3.001 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.29 0.29 0.29
1 1000 1.17 1.17 1.17
2 1000 1.16 1.16 1.16
4 1000 2.97 2.97 2.97
8 1000 1.15 1.15 1.15
16 1000 3.15 3.15 3.15
32 1000 1.15 1.15 1.15
64 1000 3.26 3.26 3.26
128 1000 1.20 1.20 1.20
256 1000 2.68 2.68 2.68
512 1000 1.31 1.31 1.31
1024 1000 1.45 1.45 1.45
2048 1000 1.71 1.71 1.71
4096 1000 2.33 2.33 2.33
8192 1000 10.34 10.34 10.34
16384 1000 12.70 12.70 12.70
32768 1000 20.07 20.07 20.07
65536 640 297.57 297.76 297.67
131072 320 383.16 383.42 383.29
262144 160 489.97 490.48 490.22
524288 80 527.09 528.06 527.58
1048576 40 1347.97 1350.12 1349.05
2097152 20 2568.95 2572.55 2570.75
4194304 10 6210.11 6215.50 6212.81

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.33 1.29 0.57
1 1000 2.30 2.30 2.30
2 1000 2.33 2.33 2.33
4 1000 2.30 2.30 2.30
8 1000 2.31 2.31 2.31
16 1000 2.31 2.31 2.31
32 1000 2.30 2.30 2.30
64 1000 2.26 2.26 2.26
128 1000 2.42 2.42 2.42
256 1000 2.57 2.57 2.57
512 1000 2.82 2.82 2.82
1024 1000 9.73 9.73 9.73
2048 1000 4.13 4.13 4.13
4096 1000 16.08 16.08 16.08
8192 1000 18.95 18.95 18.95
16384 1000 32.19 32.20 32.20
32768 1000 68.36 68.37 68.36
65536 640 284.18 284.33 284.25
131072 320 470.88 472.44 471.77
262144 160 750.85 751.85 751.21
524288 80 1606.49 1612.25 1610.26
1048576 40 4919.60 4922.25 4920.94
2097152 20 10154.65 10158.16 10156.54
4194304 10 18670.89 18694.59 18683.19

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.40 0.42 0.41
1 1000 16.17 16.17 16.17
2 1000 11.03 11.04 11.04
4 1000 10.87 10.87 10.87
8 1000 11.62 11.62 11.62
16 1000 11.00 11.00 11.00
32 1000 11.71 11.71 11.71
64 1000 10.88 10.88 10.88
128 1000 11.85 11.85 11.85
256 1000 17.57 17.57 17.57
512 1000 11.88 11.88 11.88
1024 1000 14.99 14.99 14.99
2048 1000 19.03 19.03 19.03
4096 1000 28.57 28.58 28.58
8192 1000 39.82 39.83 39.83
16384 1000 79.52 79.53 79.52
32768 1000 149.92 149.94 149.93
65536 640 571.96 572.32 572.14
131072 320 1458.69 1460.30 1459.26
262144 160 3216.54 3221.01 3218.76
524288 80 7369.38 7424.84 7392.90
1048576 40 13546.80 13617.37 13579.25
2097152 20 26296.54 26594.90 26427.49
4194304 10 51879.22 52716.18 52455.14

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.56 0.60 0.58
1 1000 45.75 45.76 45.75
2 1000 40.89 40.89 40.89
4 1000 40.86 40.87 40.87
8 1000 41.57 41.58 41.58
16 1000 40.98 40.99 40.99
32 1000 40.83 40.83 40.83
64 1000 42.10 42.10 42.10
128 1000 55.12 55.37 55.33
256 1000 50.42 50.43 50.42
512 1000 60.73 60.74 60.74
1024 1000 74.20 74.21 74.20
2048 1000 98.00 98.01 98.00
4096 1000 157.04 157.06 157.05
8192 1000 263.66 263.67 263.67
16384 1000 521.59 521.62 521.61
32768 1000 1021.39 1021.48 1021.44
65536 640 2232.21 2233.28 2232.60
131072 320 4311.24 4316.08 4313.50
262144 160 8072.79 8098.91 8082.18
524288 80 17462.66 17553.24 17492.19
1048576 40 34433.10 34577.00 34489.12
2097152 20 66361.11 67027.51 66763.79
4194304 10 127788.59 129800.89 128730.00

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.88 0.94 0.91
1 1000 940.80 941.22 941.02
2 1000 956.16 956.44 956.32
4 1000 942.71 943.18 943.00
8 1000 953.28 953.56 953.41
16 1000 942.49 942.77 942.69
32 1000 936.72 937.13 936.93
64 1000 930.03 930.48 930.20
128 1000 932.20 932.63 932.46
256 1000 929.26 929.70 929.43
512 1000 939.85 940.34 940.06
1024 1000 961.17 961.53 961.32
2048 1000 967.62 967.93 967.83
4096 1000 1056.08 1056.43 1056.22
8192 1000 1258.06 1258.67 1258.41
16384 1000 2116.43 2116.86 2116.68
32768 1000 3518.47 3519.57 3519.00
65536 470 21810.50 21840.87 21831.06
131072 320 24316.80 24329.92 24323.88
262144 160 32437.31 32472.92 32457.37
524288 80 67386.44 67580.14 67486.74
1048576 40 132011.18 132564.37 132332.56
2097152 20 240976.25 243098.29 242229.86
4194304 10 452767.68 464519.38 459906.83

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 1.49 1.64 1.54
1 1000 1955.35 1955.81 1955.63
2 1000 1959.13 1959.56 1959.35
4 1000 1958.37 1958.84 1958.58
8 1000 1981.20 1981.52 1981.33
16 1000 1977.63 1978.01 1977.94
32 1000 1977.68 1978.15 1977.93
64 1000 1966.78 1967.21 1967.02
128 1000 2007.08 2007.52 2007.35
256 1000 2086.10 2086.58 2086.41
512 1000 2288.37 2288.82 2288.61
1024 1000 2789.38 2789.80 2789.61
2048 1000 4017.38 4017.84 4017.67
4096 1000 5688.34 5688.76 5688.65
8192 1000 8701.10 8701.76 8701.53
16384 737 13765.73 13768.26 13767.34
32768 219 44820.56 44843.86 44833.35
65536 179 56467.64 56480.44 56474.54
131072 139 71177.75 71201.52 71187.79
262144 93 109272.07 109344.17 109306.68
524288 58 174634.09 174761.53 174702.91
1048576 31 329846.33 330314.06 330082.47
2097152 16 644511.06 646390.81 645490.59
4194304 8 1272650.99 1279374.36 1276327.50

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 2.68 6.59 2.91
1 1000 4866.58 4867.38 4867.02
2 1000 4859.57 4860.19 4859.89
4 1000 4859.57 4860.47 4860.11
8 1000 5070.82 5071.56 5071.24
16 1000 5079.80 5080.71 5080.29
32 1000 5071.93 5072.45 5072.26
64 1000 5067.65 5068.65 5068.24
128 1000 5228.70 5229.43 5229.04
256 1000 6263.54 6264.54 6264.09
512 1000 7106.51 7107.08 7106.82
1024 1000 9073.60 9074.17 9073.87
2048 713 14007.23 14008.28 14007.71
4096 433 23116.84 23119.84 23118.41
8192 220 44804.51 44820.53 44814.33
16384 96 104173.64 104297.01 104267.18
32768 73 138789.81 138831.26 138807.07
65536 63 157000.51 157047.73 157025.82
131072 46 220512.39 220631.13 220578.30
262144 27 370630.37 371424.07 371071.33
524288 16 661352.38 663602.01 662621.86
1048576 9 1222117.11 1229343.12 1226014.63
2097152 5 2305182.60 2334607.03 2323198.97
4194304 3 4200635.67 4280373.33 4243381.78

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 5.08 12.45 5.32
1 831 12009.17 12010.38 12009.78
2 831 11999.89 12000.61 12000.29
4 831 12033.07 12034.13 12033.71
8 780 12628.08 12629.45 12628.96
16 780 12614.45 12615.57 12615.11
32 780 12614.92 12615.82 12615.42
64 780 12611.93 12613.10 12612.60
128 771 13056.06 13057.16 13056.62
256 575 17484.65 17486.16 17485.50
512 517 19451.91 19453.35 19452.56
1024 417 24090.48 24092.50 24091.51
2048 275 36422.49 36429.19 36426.46
4096 164 60631.85 60644.63 60641.32
8192 90 111487.99 111546.96 111528.12
16384 44 229678.43 230045.09 229916.72
32768 21 505857.24 506821.62 506482.37
65536 18 555311.55 556066.27 555746.29
131072 16 653890.31 655323.31 654700.12
262144 12 893369.91 895448.41 894496.50
524288 7 1528429.71 1536000.56 1532797.97
1048576 4 2865392.03 2888218.76 2879990.71
2097152 2 5311508.06 5400061.49 5363975.18
4194304 out-of-mem.; needed X= 2.001 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 6.28 17.83 7.36
1 516 19609.39 19610.89 19610.37
2 515 19576.30 19578.43 19577.43
4 515 19638.37 19640.16 19639.25
8 491 20398.27 20400.17 20399.32
16 491 20400.95 20403.22 20402.33
32 474 20365.12 20367.24 20366.29
64 474 20423.96 20426.58 20424.99
128 474 20945.35 20947.58 20946.88
256 329 29295.40 29299.31 29296.97
512 310 32442.45 32445.34 32444.27
1024 255 39436.89 39442.68 39440.33
2048 173 58054.43 58072.21 58064.64
4096 108 91641.04 91676.66 91663.39
8192 62 163672.47 163835.63 163766.24
16384 32 319777.53 320605.56 320204.98
32768 14 748866.85 751862.92 750786.23
65536 13 783368.92 787356.16 785824.46
131072 9 1130091.32 1134801.89 1133040.95
262144 8 1371865.12 1377670.88 1374844.38
524288 5 2341002.18 2357234.76 2350617.14
1048576 3 4388300.34 4443909.33 4420488.61
2097152 out-of-mem.; needed X= 1.501 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)
4194304 out-of-mem.; needed X= 3.001 GB; use flag "-mem X" or MAX_MEM_USAGE>=X (IMB_mem_info.h)

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.05 0.05
1 1000 0.42 0.42 0.42
2 1000 0.44 0.44 0.44
4 1000 0.42 0.42 0.42
8 1000 0.44 0.44 0.44
16 1000 0.43 0.43 0.43
32 1000 0.48 0.48 0.48
64 1000 0.48 0.48 0.48
128 1000 0.52 0.52 0.52
256 1000 0.56 0.56 0.56
512 1000 0.62 0.62 0.62
1024 1000 0.77 0.77 0.77
2048 1000 2.17 2.17 2.17
4096 1000 1.43 1.43 1.43
8192 1000 2.40 2.40 2.40
16384 1000 10.28 10.28 10.28
32768 1000 13.74 13.74 13.74
65536 640 231.17 231.30 231.23
131072 320 278.42 278.67 278.55
262144 160 313.39 313.90 313.64
524288 80 285.30 287.29 286.29
1048576 40 649.57 651.62 650.60
2097152 20 1104.75 1108.75 1106.75
4194304 10 2074.60 2081.39 2078.00

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.05 0.05
1 1000 0.70 0.70 0.70
2 1000 3.95 3.95 3.95
4 1000 0.71 0.71 0.71
8 1000 0.72 0.72 0.72
16 1000 0.73 0.73 0.73
32 1000 0.77 0.77 0.77
64 1000 0.78 0.78 0.78
128 1000 0.85 0.85 0.85
256 1000 0.92 0.92 0.92
512 1000 1.01 1.01 1.01
1024 1000 1.15 1.15 1.15
2048 1000 1.54 1.54 1.54
4096 1000 6.38 6.38 6.38
8192 1000 3.97 3.98 3.97
16384 1000 21.41 21.43 21.42
32768 1000 32.08 32.10 32.09
65536 640 200.91 201.16 201.03
131072 320 222.03 222.53 222.28
262144 160 303.16 303.93 303.55
524288 80 454.49 456.46 455.97
1048576 40 1207.32 1212.75 1210.13
2097152 20 2190.54 2207.90 2198.81
4194304 10 4375.00 4394.91 4384.83

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.05 0.05
1 1000 0.84 0.84 0.84
2 1000 0.84 0.84 0.84
4 1000 0.84 0.84 0.84
8 1000 0.85 0.85 0.85
16 1000 0.85 0.86 0.86
32 1000 1.11 1.11 1.11
64 1000 1.12 1.12 1.12
128 1000 1.23 1.23 1.23
256 1000 1.31 1.32 1.31
512 1000 5.11 5.12 5.11
1024 1000 1.64 1.64 1.64
2048 1000 1.90 1.91 1.91
4096 1000 3.06 3.07 3.07
8192 1000 14.56 14.57 14.56
16384 1000 20.38 20.62 20.54
32768 1000 26.84 26.94 26.91
65536 640 54.96 55.61 55.21
131072 320 89.29 90.13 89.83
262144 160 171.00 175.14 174.22
524288 80 337.81 356.45 348.12
1048576 40 812.75 862.25 832.35
2097152 20 2462.65 2513.71 2506.45
4194304 10 5272.41 5625.10 5522.61

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.05 0.05
1 1000 1.93 1.93 1.93
2 1000 1.92 1.92 1.92
4 1000 1.95 1.95 1.95
8 1000 1.96 1.96 1.96
16 1000 1.97 1.97 1.97
32 1000 13.76 13.78 13.77
64 1000 1.89 1.90 1.90
128 1000 2.05 2.06 2.05
256 1000 2.10 2.11 2.10
512 1000 2.43 2.44 2.44
1024 1000 2.87 2.89 2.88
2048 1000 10.18 10.18 10.18
4096 1000 22.44 22.65 22.51
8192 1000 24.48 24.64 24.57
16384 1000 31.23 31.30 31.27
32768 1000 48.12 48.37 48.18
65536 640 81.92 83.02 82.47
131072 320 151.81 153.31 152.56
262144 160 293.16 304.61 299.31
524288 80 574.13 627.34 612.95
1048576 40 1187.28 1280.85 1200.36
2097152 20 2757.80 3251.49 3161.09
4194304 10 5630.42 7012.10 6441.69

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.05 0.05 0.05
1 1000 7.24 7.51 7.35
2 1000 7.28 7.48 7.32
4 1000 7.32 7.58 7.43
8 1000 7.20 7.62 7.33
16 1000 7.16 7.72 7.29
32 1000 7.46 7.87 7.58
64 1000 7.37 7.94 7.50
128 1000 7.91 8.12 7.96
256 1000 9.34 9.63 9.52
512 1000 8.07 8.29 8.13
1024 1000 9.42 9.86 9.54
2048 1000 12.05 12.39 12.22
4096 1000 12.21 12.88 12.38
8192 1000 16.07 16.41 16.18
16384 1000 26.91 27.28 27.07
32768 1000 44.43 44.88 44.62
65536 640 84.00 85.60 85.31
131072 320 163.99 167.96 166.84
262144 160 332.92 342.72 340.22
524288 80 669.91 763.43 723.93
1048576 40 1646.02 1831.22 1795.36
2097152 20 2729.11 4071.50 3674.59
4194304 10 6228.02 7878.49 7076.08

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.05 0.05
1 1000 8.08 8.47 8.25
2 1000 7.21 7.80 7.45
4 1000 7.61 8.02 7.79
8 1000 8.55 9.39 8.94
16 1000 7.37 8.05 7.69
32 1000 7.40 8.17 7.79
64 1000 8.49 9.07 8.74
128 1000 8.25 8.75 8.46
256 1000 9.65 10.55 9.94
512 1000 11.15 11.91 11.48
1024 1000 9.80 10.45 10.03
2048 1000 11.05 11.50 11.31
4096 1000 13.24 14.18 13.63
8192 1000 18.47 19.33 18.85
16384 1000 32.23 33.27 32.77
32768 1000 54.43 55.75 55.07
65536 640 99.40 106.21 103.58
131072 320 190.53 207.37 201.85
262144 160 358.41 417.73 401.00
524288 80 661.71 847.75 781.10
1048576 40 1709.97 2055.62 1913.75
2097152 20 2515.90 4926.20 4502.91
4194304 10 6529.78 9914.40 8567.04

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.08 0.05
1 1000 10.66 12.45 11.41
2 1000 9.98 11.56 10.84
4 1000 10.68 11.63 11.28
8 1000 12.06 13.30 12.66
16 1000 11.11 12.38 11.70
32 1000 11.18 12.66 11.70
64 1000 11.62 13.15 12.21
128 1000 11.85 13.41 12.64
256 1000 12.19 13.83 12.86
512 1000 12.23 14.17 13.05
1024 1000 15.44 17.31 16.34
2048 1000 16.94 18.83 17.83
4096 1000 20.56 22.38 21.53
8192 1000 30.52 32.79 31.76
16384 1000 60.65 64.12 62.60
32768 1000 108.36 114.17 111.68
65536 640 182.77 202.76 198.00
131072 320 311.10 374.48 359.94
262144 160 502.86 676.94 624.51
524288 80 911.96 1243.25 1131.35
1048576 40 1807.25 3453.62 3004.94
2097152 20 2638.45 8959.09 6891.48
4194304 10 6423.71 17148.90 12865.30

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.08 0.05
1 1000 14.64 18.31 16.49
2 1000 13.70 17.25 15.50
4 1000 14.57 17.79 16.27
8 1000 15.51 18.74 17.13
16 1000 15.26 18.42 16.87
32 1000 15.81 19.12 17.48
64 1000 14.57 18.67 16.73
128 1000 16.58 20.45 18.48
256 1000 17.75 21.98 20.04
512 1000 19.23 22.79 21.14
1024 1000 21.51 25.36 23.55
2048 1000 24.40 29.31 26.95
4096 1000 31.42 37.08 34.55
8192 1000 49.68 57.51 54.15
16384 1000 93.74 103.72 99.41
32768 1000 152.80 171.81 163.43
65536 640 239.52 288.15 270.57
131072 320 324.85 463.95 417.18
262144 160 540.21 829.64 708.38
524288 80 1008.76 1721.21 1462.63
1048576 40 1729.35 4892.45 3843.08
2097152 20 2417.70 11052.25 8878.33
4194304 10 6665.52 22143.60 17128.85

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 384
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.08 0.04
1 1000 16.20 21.68 19.03
2 1000 15.97 21.02 18.55
4 1000 16.19 21.49 19.04
8 1000 16.28 21.53 18.90
16 1000 17.00 22.51 19.70
32 1000 16.86 22.10 19.51
64 1000 16.12 21.49 18.95
128 1000 18.55 24.16 21.38
256 1000 19.35 25.42 22.35
512 1000 21.64 27.49 24.65
1024 1000 22.95 29.51 26.47
2048 1000 27.86 35.11 31.81
4096 1000 34.31 43.75 39.64
8192 1000 58.95 71.96 67.06
16384 1000 89.88 103.71 97.09
32768 1000 144.30 167.01 156.29
65536 640 218.14 269.99 246.91
131072 320 321.78 509.35 437.26
262144 160 509.18 848.36 718.09
524288 80 937.14 1944.57 1629.89
1048576 40 1640.12 5363.33 4457.50
2097152 20 2469.90 11684.45 9636.01
4194304 10 6795.50 22665.60 18732.61

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 2
# ( 382 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 0.57 0.57 0.57

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 4
# ( 380 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 3.07 3.07 3.07

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 8
# ( 376 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 1.44 1.44 1.44

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 16
# ( 368 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 10.44 10.44 10.44

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 32
# ( 352 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 175.85 176.07 175.92

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 64
# ( 320 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 188.77 189.11 188.86

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 128
# ( 256 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 416.43 416.66 416.53

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 256
# ( 128 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 475.82 476.15 476.03

#---------------------------------------------------
# Benchmarking Barrier
# #processes = 384
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 489.86 490.20 490.03

# All processes entering MPI_Finalize