Intel® oneAPI HPC Toolkit

Problem with the use of Intel MPI DAPL fabric

dingjun_chencmgl_ca

Hi, on our Windows PC cluster, whenever I run the Intel MPI Benchmarks 4.0 over the DAPL fabric, the following error occurs. Could you tell me the reason? Thanks in advance.

By the way, both WinOFED 3.2 and Mellanox WinOF Rev 4.4 are installed on the cluster.


C:\Users\dingjun\mpi5tests>mpiexec -n 4 -env I_MPI_FABRICS shm:dapl IMB-MPI1
dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.


job aborted:
rank: node: exit code[: error message]
0: drmswc4-1.cgy.cmgl.ca: 291: process 0 exited without calling finalize
1: drmswc4-1.cgy.cmgl.ca: 291: process 1 exited without calling finalize
2: drmswc4-1.cgy.cmgl.ca: 291: process 2 exited without calling finalize
3: drmswc4-1.cgy.cmgl.ca: 291: process 3 exited without calling finalize

James_T_Intel
Moderator

This seems like an InfiniBand* problem.  Are you able to run any non-MPI programs using IB?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

dingjun_chencmgl_ca

Thanks, James. InfiniBand works well except when the DAPL fabric is used.

Vijay_Amirtharaj
Beginner

Hi,

We also faced the same issue. We changed the permissions of the files under /dev/infiniband on all nodes, like this:

chmod 666 /dev/infiniband/*

After that it started working. Please try it; if it works, great :)

Regards,

Vijay Amirtharaj A

dingjun_chencmgl_ca

Our PC cluster runs Windows rather than Linux. How should we handle it in this case? I look forward to your reply.

James_T_Intel
Moderator

If your HCA doesn't support DAPL* (as you stated in another thread), then you'll need to use a different HCA.
