On the Lomonosov-2 supercomputer (http://hpc.msu.ru/node/159, partition "Test"), with IMPI 2019u9 I get a significant difference in IMB-MPI1 results between the release and release_mt library kinds. I wonder whether this is an expected level of difference (roughly 2x) or whether something must be tuned to get better figures with release_mt?
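For reference, here is roughly how I switch between the two library kinds (a minimal sketch; <impi_install_dir> is a placeholder for the actual installation path):
source <impi_install_dir>/intel64/bin/mpivars.sh release      # default kind
source <impi_install_dir>/intel64/bin/mpivars.sh release_mt   # multi-threaded kind
(The kind can also be chosen via the I_MPI_LIBRARY_KIND environment variable.)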
On 2 nodes with 14 cores each (verbs provider), I see the following on the PingPong test:
# mpiexec.hydra -np 28 -ppn 14 IMB-MPI1 pingpong -multi 0 -map 14x2 -npmin 28
release:
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 0.99 1.85 1.19 0.00
1 1000 1.00 2.01 1.36 0.50
2 1000 1.01 2.03 1.39 0.98
4 1000 1.03 2.07 1.44 1.93
8 1000 1.02 2.09 1.44 3.82
16 1000 0.99 2.09 1.43 7.65
32 1000 1.00 2.11 1.44 15.19
64 1000 1.07 2.17 1.49 29.55
128 1000 1.09 2.25 1.55 56.99
256 1000 1.58 2.79 2.11 91.81
512 1000 1.69 2.96 2.26 172.76
1024 1000 1.93 3.16 2.48 324.01
2048 1000 2.40 3.77 3.06 543.28
4096 1000 3.73 5.31 4.62 771.92
8192 1000 7.05 9.27 8.17 883.79
16384 1000 14.28 15.84 15.02 1034.12
32768 1000 26.20 28.64 28.23 1144.21
65536 640 52.23 56.97 56.20 1150.41
131072 320 106.09 113.83 112.49 1151.49
262144 160 198.19 228.21 222.12 1148.68
524288 80 414.78 452.38 445.68 1158.94
1048576 40 852.25 910.42 895.15 1151.75
2097152 20 1674.94 1800.72 1771.30 1164.62
4194304 10 3328.46 3580.92 3500.30 1171.29
release_mt:
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 1.17 1.18 1.18 0.00
1 1000 1.04 1.05 1.05 1.91
2 1000 1.06 1.06 1.06 3.76
4 1000 1.05 1.06 1.06 7.55
8 1000 1.05 1.06 1.06 15.08
16 1000 1.07 1.08 1.07 29.74
32 1000 1.09 1.09 1.09 58.51
64 1000 1.09 1.11 1.10 115.46
128 1000 1.09 1.10 1.10 232.43
256 1000 1.16 1.17 1.17 437.35
512 1000 1.60 1.61 1.60 637.11
1024 1000 1.96 1.97 1.96 1039.45
2048 1000 2.33 2.35 2.34 1746.43
4096 1000 3.26 3.28 3.27 2499.51
8192 1000 9.00 9.10 9.06 1801.16
16384 1000 14.64 14.73 14.68 2224.33
32768 1000 18.79 18.89 18.84 3469.87
65536 640 33.53 33.82 33.70 3875.75
131072 320 46.41 47.26 46.91 5546.99
262144 160 170.80 190.11 181.74 2757.88
524288 80 298.61 342.64 324.36 3060.27
1048576 40 517.24 593.49 561.51 3533.58
2097152 20 1783.70 1903.42 1855.13 2203.56
4194304 10 2774.07 3524.14 3203.87 2380.33
On the SendRecv test:
# mpiexec.hydra -np 28 -ppn 14 IMB-MPI1 sendrecv -multi 0 -map 14x2 -npmin 28
release:
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 1.49 1.49 1.49 0.00
1 1000 1.58 1.59 1.59 1.26
2 1000 1.62 1.63 1.62 2.46
4 1000 1.66 1.67 1.66 4.80
8 1000 1.67 1.68 1.67 9.55
16 1000 1.67 1.67 1.67 19.14
32 1000 1.68 1.68 1.68 38.03
64 1000 1.72 1.73 1.72 74.09
128 1000 1.77 1.78 1.77 143.92
256 1000 2.34 2.34 2.34 218.43
512 1000 2.51 2.52 2.51 406.40
1024 1000 2.86 2.87 2.86 714.63
2048 1000 3.65 3.66 3.66 1118.13
4096 1000 8.22 8.25 8.24 992.80
8192 1000 19.13 19.24 19.20 851.43
16384 1000 33.33 33.51 33.45 977.78
32768 1000 57.93 58.25 58.16 1125.04
65536 640 115.23 115.76 115.58 1132.32
131072 320 228.99 229.29 229.16 1143.31
262144 160 456.14 462.42 459.60 1133.80
524288 80 888.48 916.33 907.38 1144.32
1048576 40 1791.33 1836.76 1819.39 1141.77
2097152 20 3621.92 3685.21 3663.56 1138.14
4194304 10 7400.42 7462.25 7447.10 1124.14
release_mt:
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 0.97 0.98 0.97 0.00
1 1000 1.01 1.02 1.02 1.96
2 1000 1.06 1.06 1.06 3.76
4 1000 1.05 1.06 1.06 7.54
8 1000 1.05 1.06 1.06 15.10
16 1000 1.05 1.06 1.05 30.25
32 1000 1.09 1.10 1.10 58.13
64 1000 1.09 1.10 1.10 116.52
128 1000 1.11 1.12 1.12 228.76
256 1000 1.13 1.13 1.13 451.59
512 1000 1.61 1.62 1.61 633.24
1024 1000 1.93 1.94 1.94 1054.86
2048 1000 2.33 2.35 2.34 1744.99
4096 1000 3.24 3.26 3.25 2514.90
8192 1000 8.87 8.99 8.93 1823.41
16384 1000 14.05 14.13 14.08 2319.07
32768 1000 18.73 18.83 18.77 3480.01
65536 640 33.52 33.98 33.74 3857.27
131072 320 76.41 77.55 76.98 3380.47
262144 160 167.88 185.72 177.44 2823.00
524288 80 279.85 333.56 307.80 3143.58
1048576 40 512.18 604.98 555.47 3466.50
2097152 20 1660.85 1872.98 1807.15 2239.38
4194304 10 2820.28 3385.53 3197.91 2477.78
On the Allreduce test:
# mpiexec.hydra -np 28 -ppn 14 IMB-MPI1 allreduce -npmin 28
release:
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.09 0.04
4 1000 2.54 3.10 2.85
8 1000 2.77 3.06 2.94
16 1000 2.29 3.03 2.69
32 1000 2.38 3.04 2.74
64 1000 2.94 3.83 3.12
128 1000 3.14 4.16 3.49
256 1000 3.73 5.41 4.45
512 1000 3.45 5.44 4.34
1024 1000 4.91 6.64 5.63
2048 1000 6.89 9.04 7.82
4096 1000 9.65 13.11 11.24
8192 1000 16.25 20.12 18.07
16384 1000 30.61 44.25 34.31
32768 1000 41.53 65.61 49.35
65536 640 68.45 101.01 82.05
131072 320 125.73 171.21 146.80
262144 160 255.96 322.65 289.61
524288 80 579.32 754.60 654.68
1048576 40 1296.14 1642.46 1473.95
2097152 20 3037.64 3842.48 3528.60
4194304 10 7157.87 8204.61 7885.93
release_mt:
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.04 0.09 0.05
4 1000 2.42 3.39 2.91
8 1000 2.50 4.17 3.49
16 1000 2.81 4.45 3.84
32 1000 2.87 4.43 3.88
64 1000 2.90 5.01 4.12
128 1000 4.24 5.65 4.94
256 1000 4.93 7.18 6.03
512 1000 4.10 6.96 5.67
1024 1000 5.40 8.31 7.00
2048 1000 8.27 11.40 9.98
4096 1000 12.66 17.31 15.15
8192 1000 37.85 52.78 44.80
16384 1000 64.47 85.19 77.01
32768 1000 104.25 133.76 126.37
65536 640 203.90 249.45 238.23
131072 320 398.31 485.98 466.45
262144 160 798.27 1023.07 959.00
524288 80 1656.68 2016.64 1929.01
1048576 40 3252.24 4003.58 3813.06
2097152 20 6578.58 8080.62 7716.11
4194304 10 14086.20 17127.67 16398.42
(Sorry, I mixed up the release/release_mt headers in the SendRecv and PingPong datasets.)
--
Regards,
Alexey
(One more mistake: for these PingPong and SendRecv runs, the correct IMB-MPI1 map argument is "-map 2x14", so this is intra-node communication, not cross-node.)
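For reference, the corrected invocation would then be (only the -map value differs from the commands quoted above):
# mpiexec.hydra -np 28 -ppn 14 IMB-MPI1 pingpong -multi 0 -map 2x14 -npmin 28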
Hi Alexey,
We have also observed similar differences in the times taken by the two versions.
We will investigate further and get back to you.
Regards
Prasanth
Hi Alexey,
After contacting the internal team, I got the following response:
"We shouldn't compare performances of release and release_mt. Release_mt is only for the advanced users who want to test and take advantage of the latest features and is currently not intended for public usage(general public)"
Also, the benchmarks you are testing don't have any release_mt features.
We believe this answers your question. Let us know if you have any other queries; otherwise, we can close this thread.
Regards
Prasanth
Hi Prasanth,
>> We believe this answers your question
Well, OK, the original question was: "I wonder whether this is an expected level of difference (roughly 2x) or whether something must be tuned to get better figures with release_mt?"
Shall I take this as the answer: "this IS an expected level of difference"?
A few more questions arise:
Could you point to where in the release documentation it is stated that "Release_mt is only for the advanced users" and "is currently not intended for public usage"? I searched the text documents in the installation package, as well as the "developer guide", "release notes", and "known issues" on the Web, but failed to find anything like this.
Then, in my humble opinion, I AM an advanced user; moreover, we've paid for 1 year of support. Am I allowed to use release_mt and get appropriate support, or is my case still a case of "public usage", which is not intended?
Since I_MPI_ASYNC_PROGRESS=1 is allowed only with the release_mt library kind, does that mean the asynchronous progress feature is also "currently not intended for public usage"? If so, when did this feature switch from being a normal feature (as it was in Intel MPI 2017 and 2018) to this state? Was this declared in the release notes?
Given the lack of documentation on this topic, could you please tell me whether I_MPI_ASYNC_PROGRESS and release_mt are still "not intended for public usage" in IMPI 2021 beta?
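For completeness, this is how I understand asynchronous progress is meant to be enabled (a minimal sketch; <impi_install_dir> and ./my_app are placeholders):
source <impi_install_dir>/intel64/bin/mpivars.sh release_mt   # async progress requires the release_mt kind
export I_MPI_ASYNC_PROGRESS=1
mpiexec.hydra -np 28 -ppn 14 ./my_app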
>> Also, the benchmarks you are testing don't have any release_mt features.
I'd like to comment on this: any typical large-scale HPC application uses both blocking and non-blocking communications of various kinds, so mixed micro-benchmarking of blocking and non-blocking MPI interfaces, with and without communication-computation overlap, is always appropriate.
--
Regards,
Alexey
Hi Alexey,
Sorry if there has been any miscommunication. When I mentioned "not for public usage", what I meant is that release_mt is for advanced users rather than general users.
==> Am I allowed to use release_mt and get appropriate support, or is my case still a case of "public usage", which is not intended?
Yes, there is support for release_mt.
As you have said, "I AM an advanced user; moreover, we've paid for 1 year of support", you can raise a ticket at https://software.intel.com/content/www/us/en/develop/support/priority-support.html and receive immediate support.
Also, I am escalating this thread to the internal team for better support.
Regards
Prasanth
Hi Prasanth,
OK, I think it is clear that in IMPI 2019 the "release_mt" kind may be slower than "release" in some cases, and that this is acceptable. What is not quite clear is the status of the I_MPI_ASYNC_PROGRESS feature, which is strongly tied to "release_mt": is it fully supported, a preview, experimental, recommended only for limited usage scenarios, intended for a limited subset of product users, or something else? Is this status changing in IMPI 2021? I have a feeling this topic should be clarified; it does not seem to be accurately and explicitly defined in the documentation.
I will also submit a question on this topic via https://software.intel.com/content/www/us/en/develop/support/priority-support.html a bit later.
Thanks for the help!
--
Regards,
Alexey