Running Intel's IMB benchmark (with Intel MPI version 4.1.0.024), I'm getting some strange results.
mpirun -genv I_MPI_FABRICS=shm:dapl -np 2 -ppn 1 -hosts mic0,mic1 ./IMB-MPI1 PingPong
36 us latency for 0-byte messages, max 868 MB/s for 4 MB messages.
Using TCP instead of DAPL (I have an external bridge configuration for the MICs' Ethernet ports with an MTU of 1500):
mpirun -genv I_MPI_FABRICS=shm:tcp -np 2 -ppn 1 -hosts mic0,mic1 ./IMB-MPI1 PingPong
496 us latency for 0 bytes and only 16 MB/s max throughput for 4 MB messages!
I expected much better numbers (especially for TCP). Does anyone have an idea what's wrong?
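For anyone reproducing this, a quick sanity check (a hedged suggestion, assuming the standard I_MPI_DEBUG variable available in these Intel MPI releases) is to rerun with debug output so the library reports which fabric and provider it actually selected:
mpirun -genv I_MPI_DEBUG=5 -genv I_MPI_FABRICS=shm:dapl -np 2 -ppn 1 -hosts mic0,mic1 ./IMB-MPI1 PingPong
If the startup lines show a different DAPL provider than expected, or a silent fallback to tcp, that would be the first thing to fix.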
I hit the same problem when I tested the bandwidth between the MIC and the host directly over a plain TCP socket: it was about 18 MB/s.
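For comparison, one way to take MPI and NFS out of the picture and measure the raw TCP path (a sketch, assuming iperf or a similar tool has been cross-compiled for the coprocessor's k1om environment and copied onto the card):
# on the coprocessor
iperf -s
# on the host
iperf -c mic0
If that already tops out around 15-20 MB/s, the bottleneck is in the virtual NIC / IP stack rather than in the MPI or NFS layers above it.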
Current releases of Intel MPI should improve DAPL performance over such older ones; the latest MVAPICH, with MIC-to-MIC communication routed through the host over QPI, may be better still.
I see the same problem, and it cripples NFS performance:
http://software.intel.com/en-us/forums/topic/404743#comment-1746053
I'm about to write a custom library for file access over SCIF. But if I had the time, the right way would be to fix the network driver or write an Ethernet-over-SCIF driver.
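Before going that far, it may be worth double-checking the NFS mount options on the card; nfsstat -m (if available) shows what is actually in effect. A rough sketch of remounting with TCP and larger buffers (host name, export path and mount point are placeholders, and the card's kernel may cap rsize/wsize lower than requested):
# on the coprocessor
umount /mnt/nfs
mount -t nfs -o tcp,rsize=1048576,wsize=1048576 host:/export /mnt/nfs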
Those latencies are too high and the bandwidths too low.
I get DAPL latency close to 10 us and bandwidth greater than 1300 MB/s.
For TCP, latency is around 300 us and bandwidth close to 80 MB/s.
I'm using Intel(R) MPI 4.1.1.036.
See this article for cluster configuration tips: http://software.intel.com/en-us/articles/configuring-intel-xeon-phi-coprocessors-inside-a-cluster
I switched to Intel(R) MPI 4.1.1.036 and the latest MPSS and got slightly better results for DAPL (16-20 usec and 885 MB/s), but TCP is still very slow.
Using dd to benchmark read/write speed from/to an NFS share (the filer is known to be able to stream > 800 MB/s) gives 20/21 MB/s.
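For reference, this is the kind of dd test I mean (file name and mount point are placeholders; clear the page cache between runs so you measure the wire and not RAM):
# write test
dd if=/dev/zero of=/mnt/nfs/testfile bs=1M count=1024 conv=fsync
# read test, after e.g. echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/nfs/testfile of=/dev/null bs=1M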
IMO something's wrong with the virtual NICs and/or the IP stack implementation in MPSS, or the MIC's cores are simply not powerful enough to handle more IP traffic.
IMB PingPong between two machines hosting MIC coprocessors over a plain 1 Gbit Ethernet (i350) connection gives a minimum latency of 50 usec and a maximum bandwidth of 112 MB/s (the 1 Gbit limit, as expected). That's roughly 10x faster than between two MIC cards connected through PCIe and MPSS's virtual network stack.
Gregg, what MTU size do you use in your environment? I've double-checked my config but can't get beyond 16 MB/s using TCP.
The article links to configuration notes directly from the administrator who set up the cluster whose latencies and bandwidth I quoted. It's good, first-hand information. From the notes, "The MTU in this network is generally set to 9000. Please adapt this to your settings."
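If it helps, here is a rough sketch of raising the MTU by hand (interface and bridge names are assumptions based on the bridged setup described above; every hop, including the host bridge and the physical NIC, has to carry 9000, and the MPSS network configuration should be updated as well so the change survives a restart):
# on the host
ifconfig br0 mtu 9000
ifconfig mic0 mtu 9000
# on the coprocessor
ifconfig mic0 mtu 9000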
Hi,
Regarding "I see the same problem which cripples NFS performance" and the mounted share:
I don't know whether it's possible for you to build CIFS (Samba) for the Phi, but it sometimes gives better results than NFS. The latest version can do deferred (asynchronous) reads and writes, and smb.conf offers many more parameters to tune when you run into weird performance. The downside of Samba is that it's a bit complex, with a gigantic number of options.
Personally, I use it over fiber, copper and wireless, and I'm very satisfied with it.
That said, I doubt that simply correcting the MTU will work miracles in your particular case.
Regards
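If anyone wants to try the CIFS route, a rough sketch of mounting a share from the card (server, share, credentials and mount point are all placeholders, and the coprocessor kernel needs CIFS support, which the stock MPSS image may not include):
# on the coprocessor
mount -t cifs //fileserver/share /mnt/cifs -o username=user,password=secret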
