Hi,
I use MPI (with InfiniBand RDMA enabled) over an mmapped region. The mmapped region is larger than physical memory, so I expect the TLB to be updated often, which may lead to 'undefined' behavior with InfiniBand RDMA. The application crashes the kernel (kernel panic), which is absolutely not acceptable (user application code should never be able to cause that).
I suspect that InfiniBand's RDMA capability, which bypasses the CPU's translation buffer and accesses physical memory directly, is one of the reasons the OS is 'somehow' being corrupted.
1. Can the RDMA capability cause the kind of problem I described?
2. I tried with 'I_MPI_DAPL_TRANSLATION_CACHE' disabled, but the issue is not resolved. (I don't see any message saying 'I_MPI_DAPL_TRANSLATION_CACHE=0' even with 'I_MPI_DEBUG=100'. Am I misusing the environment variables?)
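For what it's worth, one way to rule out a simple export or spelling problem is to print the variables from inside each launched process. This is only a minimal sketch: the helper name `report_env` and the script name `check_env.py` are made up for illustration, and it only shows whether the variables reach the process environment, not whether the MPI library actually honors them.

```python
import os


def report_env(names, environ=None):
    """Return {name: value or None} for each requested environment variable.

    `environ` defaults to the real process environment; passing a dict
    makes the helper easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    # Variable names taken from the question above.
    wanted = ["I_MPI_DAPL_TRANSLATION_CACHE", "I_MPI_DEBUG"]
    for name, value in report_env(wanted).items():
        if value is None:
            print(f"{name} is not set in this process")
        else:
            print(f"{name}={value!r}")
```

Starting this through the same launcher and with the same settings as the real job (for example `mpirun -n 2 python check_env.py`) should show on every rank whether the variables arrived. A variable reported as not set would point at the launch environment rather than at RDMA.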
Many thanks to all :D