Intel® oneAPI HPC Toolkit

MPI over an mmap'ed region causes a kernel panic?



I use MPI (with InfiniBand RDMA enabled) over an mmap'ed region. The mmap'ed region is larger than physical memory, so I expect the TLB to be updated often, which may lead to 'undefined' behavior in InfiniBand RDMA. The application causes a kernel panic, which is absolutely not acceptable (user application code should never be able to cause one).

I suspect that InfiniBand's RDMA capability, which bypasses the CPU's translation buffer and accesses physical memory directly, is one of the reasons the OS gets corrupted 'somehow'.

 1. Can the RDMA capability cause the problem I described?

 2. I tried with 'I_MPI_DAPL_TRANSLATION_CACHE' disabled, but the issue is not resolved. (I don't see any message saying 'I_MPI_DAPL_TRANSLATION_CACHE=0' even with 'I_MPI_DEBUG=100'. Am I misusing the environment variables?)
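
For reference, a minimal sketch of how the two variables from item 2 can be passed to a run. It assumes a bash shell and the Hydra `mpirun` launcher; the binary name `./my_app` and the process count are hypothetical placeholders, not taken from the actual setup.

```shell
# Sketch only: ./my_app and -n 2 are placeholders (assumptions).
# -genv sets a variable for every rank started by the launcher.
mpirun -genv I_MPI_DAPL_TRANSLATION_CACHE 0 \
       -genv I_MPI_DEBUG 100 \
       -n 2 ./my_app
```

Passing the variables with `-genv` is one way to rule out the case where they are set in the local shell but do not reach the ranks on other nodes.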

Many thanks to all :D
