Hello,
our application ends its domain decomposition with essentially two sequences of actions. One set of tasks (subset a) calls

call mpi_win_lock(some_rank_from_subset_b)
call mpi_get(some_rank_from_subset_b)
call mpi_win_unlock(some_rank_from_subset_b)

while the others (subset b) wait in the MPI_Barrier at the end of the domain decomposition. This performs well (the domain decomposition completes within seconds) with MVAPICH on our new Intel Xeon machine and on another machine with IBM BlueGene/Q hardware.
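The access pattern described above can be sketched roughly as follows. This is an illustrative reconstruction, not the application's actual code: the even/odd split into subsets a and b, the buffer length `n`, and the variable names are all assumptions made for the example.

```fortran
program passive_target_sketch
  use mpi
  implicit none
  integer, parameter :: n = 4   ! hypothetical buffer length
  integer :: ierr, rank, nprocs, win, target_rank, dbl_size
  integer(kind=MPI_ADDRESS_KIND) :: win_size, disp
  double precision :: local_buf(n), remote_buf(n)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Every rank exposes local_buf through the window.
  local_buf = dble(rank)
  call MPI_Type_size(MPI_DOUBLE_PRECISION, dbl_size, ierr)
  win_size = int(n, MPI_ADDRESS_KIND) * dbl_size
  call MPI_Win_create(local_buf, win_size, dbl_size, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, win, ierr)

  ! Assumption: even ranks act as subset a, odd ranks as subset b.
  if (mod(rank, 2) == 0 .and. rank + 1 < nprocs) then
    target_rank = rank + 1
    disp = 0
    ! Passive target epoch: the target makes no matching RMA call.
    call MPI_Win_lock(MPI_LOCK_SHARED, target_rank, 0, win, ierr)
    call MPI_Get(remote_buf, n, MPI_DOUBLE_PRECISION, target_rank, &
                 disp, n, MPI_DOUBLE_PRECISION, win, ierr)
    call MPI_Win_unlock(target_rank, win, ierr)
  end if

  ! Subset b goes straight here and waits while subset a reads from it.
  call MPI_Barrier(MPI_COMM_WORLD, ierr)

  call MPI_Win_free(win, ierr)
  call MPI_Finalize(ierr)
end program passive_target_sketch
```

In this sketch the target ranks are inside `MPI_Barrier` while the origin ranks lock, read, and unlock their windows, which is the situation described in the question.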
Unfortunately, with Intel MPI on the same machine that breezes through with MVAPICH, the same code performs significantly worse: it stalls at this stage of the domain decomposition for approximately 10 minutes. The hardware in question is equipped with Mellanox ConnectX-3 IB HCAs, and we run RHEL 6.
How can I improve passive target performance with Intel MPI?
Regards, Thomas