Fruchtl__Herbert
Beginner
90 Views

MPI problem: Assertion failed in dapl_module_util.c

Hello,

This question has been asked before, but I don't think the poster got an answer. I am trying to compile and run a user's code. It runs for a while, but then stops with the error message shown in the subject. Here is the full output:

[71:wardlaw187] unexpected disconnect completion event from [0:wardlaw182]
Assertion failed in file ../../dapl_module_util.c at line 2682: 0
internal ABORT - process 71

This is Intel MPI (impi) 4.0.1 with ifort 12.0.0 20101006 on RHEL/CentOS: the login/compile node runs RHEL 5.3, and the compute nodes run CentOS 5.3, slightly modified by the vendor, Bull. The hardware is a Westmere cluster with an InfiniBand interconnect.

Any ideas?

Thanks in advance,

Herbert


3 Replies
Dmitry_K_Intel2
Employee
90 Views

Hi Herbert,

This looks more like an application issue.
Probably one process dies and the other processes never receive the completion event they are waiting for. Could you please set I_MPI_DEBUG_COREDUMP=1 and start your application (preferably compiled with '-g')? It should generate a core file. Take a look at the backtrace and let me know if you think that this is an Intel MPI issue.
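
To narrow down which process may have died first, it can help to pull the rank and host names out of the disconnect messages. The following is a minimal sketch in Python, not an Intel MPI tool. It assumes that the bracketed fields in the log have the form `[rank:hostname]` (an interpretation of the output in the original post, not something documented here), and the helper names `parse_disconnect` and `suspect_peers` are hypothetical.

```python
import re

# Assumed message format, taken from the log in the original post:
#   [71:wardlaw187] unexpected disconnect completion event from [0:wardlaw182]
# The reading "[rank:hostname]" is an assumption, not a documented format.
DISCONNECT_RE = re.compile(
    r"\[(?P<rank>\d+):(?P<host>[^\]]+)\]\s+"
    r"unexpected disconnect completion event from\s+"
    r"\[(?P<peer_rank>\d+):(?P<peer_host>[^\]]+)\]"
)


def parse_disconnect(line):
    """Return (reporting_rank, reporting_host, peer_rank, peer_host), or None."""
    match = DISCONNECT_RE.search(line)
    if match is None:
        return None
    return (
        int(match.group("rank")),
        match.group("host"),
        int(match.group("peer_rank")),
        match.group("peer_host"),
    )


def suspect_peers(log_lines):
    """Count how often each (rank, host) is named as the disconnecting peer."""
    counts = {}
    for line in log_lines:
        parsed = parse_disconnect(line)
        if parsed is not None:
            peer = parsed[2:]  # (peer_rank, peer_host)
            counts[peer] = counts.get(peer, 0) + 1
    return counts
```

If many ranks name the same peer, that peer (here, rank 0 on wardlaw182) is a reasonable first place to look for a crash or a core file, though this alone does not prove where the fault lies.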

Regards!
Dmitry


L_4
Beginner
90 Views

Dear Dmitry,

I came across the same problem. I set I_MPI_DEBUG_COREDUMP=1, but no core file was dumped, so I could not find out any details about the abort. The application I ran was mpiBLAST, a widely used HPC application in biology.

Can you give any more hints? Thank you!
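
One possible reason for a missing core file on Linux, offered here as an assumption about the environment rather than a confirmed cause, is a core file size limit of 0, which prevents the kernel from writing a core at all. The sketch below uses Python's standard `resource` module to report the limit for the current process; the function names are hypothetical, and the limit seen by the MPI ranks on the compute nodes may differ from the one on the login node.

```python
import resource


def core_dumps_enabled():
    """Return True if the soft core-file size limit is non-zero."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def describe_core_limit():
    """Return a human-readable description of the core-file size limits."""
    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return "core limit: soft=%s, hard=%s" % (fmt(soft), fmt(hard))


if __name__ == "__main__":
    print(describe_core_limit())
    if not core_dumps_enabled():
        print("Soft limit is 0: no core file will be written by this process.")
```

To be meaningful, the check would have to run in the same environment as the MPI ranks, for example from the job script on a compute node.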
Dmitry_K_Intel2
Employee
90 Views


Could you please submit a ticket at premier.intel.com?
I'm not sure that this forum is the right place to investigate the issue.

Regards!
Dmitry
