Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Intel MPI issue

llodds
Beginner

I am running an MPI program on Skylake nodes. On a single node (np=2, ph=2) the program finishes and reports no error. However, when I run it across two Skylake nodes (np=2, ph=1), I get the following error:

rank = 1, revents = 8, state = 8
Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 2988: (it_plfd->revents & POLLERR) == 0
internal ABORT - process 0

Strangely, all of my colleagues who use csh can complete the run without hitting this assertion, while those of us who use bash (including me) always hit the failed assertion at the same line (2988). What could cause this type of error?
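
Since the only known difference between the working and failing runs is the login shell, one possible first diagnostic step is to compare the environment each shell hands to the job. The following is a minimal sketch, not a confirmed fix: it assumes the output of `env` has been saved from a bash session and from a csh session on the same node, and the file names and the helper names `parse_env`, `diff_env`, and `diff_env_files` are hypothetical.

```python
def parse_env(text):
    """Parse `env` output (one NAME=value per line) into a dict.

    Lines without '=' are skipped, so multi-line values are only
    partially captured; this is good enough for a first comparison.
    """
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            result[name] = value
    return result


def diff_env(a, b):
    """Return {name: (value_in_a, value_in_b)} for each variable that differs.

    A value of None means the variable is not set on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


def diff_env_files(path_a, path_b):
    """Convenience wrapper: diff two saved `env` dumps (hypothetical paths)."""
    with open(path_a) as fa, open(path_b) as fb:
        return diff_env(parse_env(fa.read()), parse_env(fb.read()))
```

For example, after running `env > env_bash.txt` in bash and `env > env_csh.txt` in csh, calling `diff_env_files("env_bash.txt", "env_csh.txt")` lists every variable that differs. Differences in `PATH`, `LD_LIBRARY_PATH`, or variables with the `I_MPI_` prefix would be the first places to look, though nothing in this thread confirms that any of them is the cause.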

 

1 Reply
PrasanthD_intel
Moderator

Duplicate thread. Please refer to this thread instead: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/851861
