Software Archive
Read-only legacy content

Configuration and Benchmarks of Peer-to-Peer Communication over Gigabit Ethernet and InfiniBand in a Cluster with Intel Xeon Phi

Vadim_Karpusenko
Beginner

Hi all,

We have just published a new white paper about point-to-point MPI communication between Intel Xeon Phi coprocessors in a cluster:

http://research.colfaxinternational.com/post/2014/03/11/InfiniBand-for-MIC.aspx


The abstract is below:

Intel Xeon Phi coprocessors allow symmetric heterogeneous clustering models, in which MPI processes are run fully on coprocessors, as opposed to offload-based clustering. These symmetric models are attractive, because they allow effortless porting of CPU-based applications to clusters with manycore computing accelerators.

However, with the default software configuration and without specialized networking hardware, peer-to-peer communication between coprocessors in a cluster falls short of the capabilities of Gigabit Ethernet networking hardware by orders of magnitude. This situation is remedied by InfiniBand interconnects and the software supporting them.

In this paper we demonstrate the procedures for configuring a cluster with Intel Xeon Phi coprocessors connected with Gigabit Ethernet as well as InfiniBand interconnects. We measure and discuss the latencies and bandwidths of MPI messages with and without the advanced configuration with InfiniBand support. The paper contains a discussion of MPI application tuning in an InfiniBand-enabled cluster with Intel Xeon Phi Coprocessors, a case study of the impact of InfiniBand protocol, and a set of recommendations for accommodating the non-uniform RDMA performance across the PCIe bus in high performance computing applications.
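For readers who want to experiment with the symmetric model discussed in the paper, fabric selection in the Intel MPI Library is controlled through environment variables. Below is a minimal sketch; the host names (`node1`, `node1-mic0`) and binary names are hypothetical placeholders, and the launch line is commented out because it requires a live cluster:

```shell
# Select communication fabrics for the Intel MPI Library:
# "shm" for intra-node shared memory, "dapl" for RDMA over InfiniBand.
export I_MPI_FABRICS=shm:dapl

# Without InfiniBand, fall back to TCP over Gigabit Ethernet:
# export I_MPI_FABRICS=shm:tcp

# Launch one MPI process on the host and one on the coprocessor
# (hypothetical host names and binaries; the coprocessor binary
# must be cross-compiled with -mmic):
# mpirun -host node1 -n 1 ./app.host : -host node1-mic0 -n 1 ./app.mic
```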

6 Replies
jimdempseyatthecove
Honored Contributor III

I read the white paper, nice work. I especially like the Figure 9 graphic.

Do you think there is room on the right node of Figure 9 to squeeze in, in a different color, the shm latencies and bandwidths?

Jim Dempsey

Vadim_Karpusenko
Beginner

Jim, thank you for your kind words and the suggestion.

We discussed your suggestion and decided that adding more information to Figure 9 would likely overcrowd it and make it confusing, especially since Figure 8 already contains the latency and bandwidth comparison between the dapl and shm fabrics.

Thank you!

jimdempseyatthecove
Honored Contributor III

For the connectivity overview, I think you could leave Figure 9 as-is, but add, somewhere later on, a more complete figure with not only shm added but also tcp. If need be, you can add another node.

Figure 4 could also have a companion graphic showing the pathway to the MIC attached to the second socket.

These graphics are very clear and self-explanatory. Congratulations to whoever drew them.

These additional graphics need not accompany this article; rather, they could be incorporated into a connectivity white paper. I think this would be helpful in deciding the costs and benefits of various interconnection schemes.

Jim Dempsey

Vadim_Karpusenko
Beginner

Thank you Jim,

I'm doubly flattered by your kind words, because I'm the one who made those plots.

I like your idea, and I will definitely create a combined plot with the tcp, shm, and dapl fabrics all together. It should fit nicely into my MIC Developer Training slides. Thank you!
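For what it's worth, the data for such a combined plot could be collected by running a ping-pong benchmark once per fabric, for example with the Intel MPI Benchmarks. This is only a rough sketch: the host names and binary path are hypothetical, and the mpirun line is commented out since it needs a live cluster:

```shell
# Hypothetical sketch: collect latency/bandwidth data for each fabric
# so the curves can later be plotted together.
for fabric in shm:tcp shm:dapl tcp dapl; do
    export I_MPI_FABRICS=$fabric
    echo "benchmarking fabric: $fabric"
    # mpirun -host node1-mic0 -n 1 ./IMB-MPI1.mic PingPong : \
    #        -host node2-mic0 -n 1 ./IMB-MPI1.mic PingPong \
    #        > pingpong.$(echo $fabric | tr ':' '_').log
done
```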

Filip_S_1
Beginner

Dear Vadim,

I wanted to look at the paper, but on the page you linked only "Colfax_File_IO_on_Intel_Xeon_Phi_Coprocessors1.pdf (2 MB)" is available, which is about file I/O.

Is the paper about MPI communication available? I would like to compare your results with my own.

Thanks! Filip

Vadim_K_1
Beginner

Filip S. wrote:

I wanted to look at the paper, but on the page you linked only "Colfax_File_IO_on_Intel_Xeon_Phi_Coprocessors1.pdf (2 MB)" is available, which is about file I/O.

Dear Filip,

Sorry about the inconvenience. During the migration to another platform, this link got broken. I've just fixed it, so it should be working now.
