I would like confirmation that I am doing this properly. Here is my situation. The new cluster at my institution has two Mellanox Connect-IB cards on each node. Each node is a dual-socket, six-core Ivy Bridge machine. The node architecture is such that each socket is connected by a direct PCIe lane to each IB card. What I want to do is assign a subset of the MPI processes on a node (e.g. the first 6) to the first IB card and the remaining MPI processes to the second IB card. No rail sharing: for both small and large messages, an MPI process should use only its single assigned IB card.
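
To make the intended mapping concrete, here is a minimal sketch of the rank-to-card assignment I have in mind. It assumes node-local ranks are numbered 0 to 11 and are split into contiguous blocks of six. The function name and the device names `mlx5_0` and `mlx5_1` are placeholders of mine, not something confirmed on this cluster, and the sketch only illustrates the mapping; it does not configure any MPI library.

```python
def hca_for_local_rank(local_rank, cores_per_socket=6, hcas=("mlx5_0", "mlx5_1")):
    """Return the IB card a node-local MPI rank should use exclusively.

    Assumptions (not confirmed for the real cluster):
      - local ranks are 0-based and contiguous on each node
      - the first `cores_per_socket` ranks go to the first card, the next
        block to the second card
      - the device names in `hcas` are hypothetical placeholders
    """
    total = cores_per_socket * len(hcas)
    if not 0 <= local_rank < total:
        raise ValueError(f"local_rank {local_rank} outside 0..{total - 1}")
    # Integer division selects the block of ranks, and thus the card.
    return hcas[local_rank // cores_per_socket]


if __name__ == "__main__":
    # Print the full mapping for one 12-rank node.
    for rank in range(12):
        print(rank, hca_for_local_rank(rank))
```

With these assumptions, local ranks 0 to 5 map to the first card and ranks 6 to 11 to the second, with no rank ever mapped to both.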