Problem with the use of Intel MPI DAPL fabric

Hi, on our Windows PC cluster, whenever I run Intel MPI Benchmarks 4.0 over the DAPL fabric, the following error occurs. Could you tell me the cause? Thanks in advance.

By the way, both WinOFED 3.2 and Mellanox WinOF Rev 4.4 are installed on the cluster.


C:\Users\dingjun\mpi5tests>mpiexec -n 4 -env I_MPI_FABRICS shm:dapl IMB-MPI1
dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.


job aborted:
rank: node: exit code[: error message]
0: drmswc4-1.cgy.cmgl.ca: 291: process 0 exited without calling finalize
1: drmswc4-1.cgy.cmgl.ca: 291: process 1 exited without calling finalize
2: drmswc4-1.cgy.cmgl.ca: 291: process 2 exited without calling finalize
3: drmswc4-1.cgy.cmgl.ca: 291: process 3 exited without calling finalize
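
When the same job is run across several nodes, it can help to summarize a log like the one above automatically. The following is a minimal Python sketch that counts the `NdStartup` failures and lists the ranks that exited. The function name `summarize_mpiexec_log` and the two regular expressions are hypothetical helpers written for this illustration; they assume the output has exactly the layout shown in this post and are not part of Intel MPI.

```python
import re

# Assumed layout (taken from the log in this post), one line per rank:
#   dapls_ib_init() NdStartup failed with NTStatus: <status text>
ND_FAILURE = re.compile(
    r"dapls_ib_init\(\) NdStartup failed with NTStatus: (?P<status>.+)"
)

# Assumed layout of the job summary lines:
#   <rank>: <node>: <exit code>: <message>
RANK_LINE = re.compile(
    r"^(?P<rank>\d+): (?P<node>[^:]+): (?P<code>\d+): (?P<msg>.+)$"
)


def summarize_mpiexec_log(text):
    """Collect NdStartup failure messages and per-rank exit records.

    Hypothetical helper: it only recognizes the two line shapes above
    and ignores everything else (including the summary header line).
    """
    failures = []
    ranks = []
    for raw in text.splitlines():
        line = raw.strip()
        match = ND_FAILURE.search(line)
        if match:
            failures.append(match.group("status").strip())
            continue
        match = RANK_LINE.match(line)
        if match:
            ranks.append(
                (int(match.group("rank")),
                 match.group("node"),
                 int(match.group("code")))
            )
    return {"nd_startup_failures": failures, "ranks": ranks}


if __name__ == "__main__":
    import sys

    # Usage: mpiexec ... 2>&1 | python summarize.py
    summary = summarize_mpiexec_log(sys.stdin.read())
    print("NdStartup failures:", len(summary["nd_startup_failures"]))
    for rank, node, code in summary["ranks"]:
        print(f"rank {rank} on {node} exited with code {code}")
```

If every rank reports the same `NdStartup` status, as in the log above, the failure happens while the DAPL provider is being initialized on that node, before any MPI communication takes place.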

James_T_Intel
Moderator

This seems like an InfiniBand* problem. Are you able to run any non-MPI programs using IB?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools


Thanks, James. InfiniBand works well otherwise; only the DAPL fabric fails.

Vijay_Amirtharaj
Beginner

Hi,

We faced the same issue. We changed the permissions on the files in the /dev/infiniband folder on all nodes, like this:

chmod 666 /dev/infiniband/*

After that it started working. Please give it a try; if it works, great :)

Regards,

Vijay Amirtharaj A


Our PC cluster runs Windows rather than Linux. How should we handle this in that case? I look forward to your reply.

James_T_Intel
Moderator

If your HCA doesn't support DAPL* (as you stated in another thread), then you'll need to use a different HCA.
