I ran an MPI program, NPB 3.4.2 (NAS Parallel Benchmarks), on 8 nodes.
Four of the nodes use 2nd Gen Xeon SP with a ConnectX-5 NIC, and the other four use 3rd Gen Xeon SP with a ConnectX-6 NIC. With I_MPI_OFI_PROVIDER=tcp, the run continued and I could see traffic between the nodes, but the benchmark never finished. With I_MPI_OFI_PROVIDER=mlx, the run stopped almost immediately, before it even completed initialization. However, with I_MPI_PLATFORM=auto or I_MPI_PLATFORM=clx, the run continued and finished without problems, regardless of the I_MPI_OFI_PROVIDER setting.
Also, when I did not mix processors and NICs (using only the four nodes with 2nd Gen Xeon SP and ConnectX-5, or only the four nodes with 3rd Gen Xeon SP and ConnectX-6), the run finished without any problem.
Could you tell me how the I_MPI_PLATFORM setting affects an environment that mixes processors and NICs like this?
Regards, K. Kunita
Hi,
Thanks for posting in the Intel forums.
Could you please let us know whether you are facing a similar issue with the IMB Sendrecv benchmark?
Thanks & Regards
Shivani
Yes, I observed the same thing with IMB Sendrecv. svr0-100g and svr1-100g use 2nd Gen Xeon SP with a ConnectX-5 NIC; svr4-100g and svr5-100g use 3rd Gen Xeon SP with a ConnectX-6 NIC. I ran the following three cases:
Case 1: "mpirun -n 2 -ppn 1 -host svr0-100g,svr1-100g ./IMB-MPI1 sendrecv"
Case 2: "mpirun -n 2 -ppn 1 -host svr0-100g,svr4-100g ./IMB-MPI1 sendrecv"
Case 3: "mpirun -n 2 -ppn 1 -host svr4-100g,svr5-100g ./IMB-MPI1 sendrecv"
In Case 1 and Case 3, the run continues and finishes. In Case 2, however, the run stops after printing the start of Sendrecv and never finishes. If I add the option "-genv I_MPI_PLATFORM clx" to Case 2, the run continues and finishes successfully.
Regards, K. Kunita
Hi,
The correct usage of Intel MPI in a heterogeneous environment is to set I_MPI_PLATFORM to either AUTO or the lowest common supported platform.
In the manual it is explicitly stated that one should set AUTO:
auto: Use only with heterogeneous runs to determine the appropriate platform across all nodes. This may slow down MPI initialization time due to collective operation across all nodes.
For more details please refer to the below link
It is not guaranteed to work without setting this variable.
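As a rough illustration of the advice above, the sketch below assembles the mixed-node Sendrecv command from this thread with I_MPI_PLATFORM passed through -genv. The helper name platform_cmd is hypothetical and only prints the command line (a dry run); it does not launch mpirun itself. The host names and the value clx are taken from the earlier posts, and clx is assumed to be the lowest common platform for these nodes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Intel MPI): prints the mpirun command
# line for a given I_MPI_PLATFORM value and host list, without running it.
platform_cmd() {
    platform="$1"   # e.g. auto, or the lowest common platform such as clx
    hosts="$2"      # comma-separated host list, as passed to -host
    printf 'mpirun -genv I_MPI_PLATFORM %s -n 2 -ppn 1 -host %s ./IMB-MPI1 sendrecv\n' \
        "$platform" "$hosts"
}

# Mixed pair from the thread: one 2nd Gen node and one 3rd Gen node.
platform_cmd auto svr0-100g,svr4-100g
platform_cmd clx svr0-100g,svr4-100g
```

Copying one of the printed lines into the shell runs the benchmark. Exporting I_MPI_PLATFORM=auto in the environment before calling mpirun should have the same effect as the -genv form.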
Thanks & Regards
Shivani
Ok, I see. Please close this.
Thanks, Kuni
Hi,
Thanks for the confirmation. If you need any additional information, please post a new question as this will no longer be monitored by Intel.
Thanks & Regards
Shivani