Intel® Edge Software Hub

unexpected DAPL connection event 0x4006 from 169

kunfu
Beginner
805 Views

Hi,

I compiled my code with the following dependencies:

intel-2018, gcc-8.2.0, fftw-3.3.8, hdf5-1.10.5, zlib-1.2.11, szlib-2.1.1

My runs succeeded a few times, but now I am getting the error below.

cn044:UCM:e33e:aaaedcc0: 568142013 us(568142013 us!!!): DTO completion ERR: status 12, op OP_RDMA_READ, vendor_err 0x85 - 0.0.0.0
[165:cn044][../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_poll_rc.c:1374] Intel MPI fatal error: ofa-v2-mlx5_0-1u DTO operation posted for [169:cn130] completed with error. status=0x8. cookie=0x0
[165:cn044][../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_poll_rc.c:1374] Intel MPI fatal error: ofa-v2-mlx5_0-1u DTO operation posted for [169:cn130] completed with error. status=0x1. cookie=0x400a9
[165:cn044] unexpected DAPL connection event 0x4006 from 169
Fatal error in MPI_Waitany: Internal MPI error!, error stack:
MPI_Waitany(253).........: MPI_Waitany(count=50, req_array=0x7fffffff7180, index=0x7fffffff8398, status=0x18149e0) failed
PMPIDI_CH3I_Progress(850): fail failed
(unknown)(): Internal MPI error!
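
To narrow down which nodes are involved, the host names can be pulled out of the `[rank:host]` tags in a log like the one above. The following is a minimal sketch, not an Intel-provided tool: the function name `list_hosts` is hypothetical, and it assumes the combined output was saved to a file (the job script later in this thread redirects to `orb5.out`).

```shell
#!/bin/bash
# Hypothetical helper (not part of Intel MPI): print each host name that
# appears in a "[rank:host]" tag of the given log file, once per host.
list_hosts() {
    grep -oE '\[[0-9]+:[A-Za-z0-9_.-]+\]' "$1" \
        | sed -E 's/^\[[0-9]+:(.*)\]$/\1/' \
        | sort -u
}

# Example: inspect the output file named in the job script, if it exists.
if [ -f orb5.out ]; then
    list_hosts orb5.out
fi
```

For the log above, this lists cn044 and cn130, the two nodes named in the DTO errors, which gives a starting point for checking those nodes specifically.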

I would really appreciate any suggestions.

I'm using PBS Pro 2021.1.0.20210303161351.

Thanks in advance!

10 Replies
Athirah_Intel
Moderator
759 Views

Hi kunfu,


Thanks for reaching out. We are checking on this and will get back to you soon.



Regards,

Athirah


kunfu
Beginner
712 Views

Hi Athirah,

I'll wait for your response. I just want to know the possible ways to resolve this.

Thanks,

kunfu

 

 

Iffa_Intel
Moderator
706 Views

Hi,


To investigate further, could you share:

  1. The references you followed for those steps (URL, etc.)
  2. Your full steps, with commands, up to the point of error
  3. Your system details (OS, hardware, etc.)



Cordially,

Iffa


kunfu
Beginner
699 Views

Hi Iffa, 

1. The references you followed for those steps (URL, etc.)

https://orb5.epfl.ch/

 

2. Your full steps, with commands, up to the point of error

#!/bin/bash

## JOB NAME
#PBS -N n19
## QUEUE NAME
#PBS -q mediumq
## COMPUTE RESOURCES REQUESTED FOR THE JOB
#PBS -l select=32:ncpus=32
## SPECIFY THE EXECUTION TIME LIMIT FOR THE CODE/APPLICATION IN HRS:MINS:SECS FORMAT
#PBS -l walltime=48:00:00
## JOIN THE OUTPUT AND ERROR FILES INTO A SINGLE FILE WITH NAME <JOBNAME>.O<JOBID>
#PBS -j oe
## EXPORT ALL ENVIRONMENT VARIABLES
#PBS -V
## EMAIL IS SENT WHEN THE JOB STARTS, TERMINATES AND ABORTS
#PBS -m bea
## SPECIFY EMAIL ADDRESS FOR NOTIFICATIONS


## WORKING DIRECTORY OF CODE/APPLICATION
cd "$PBS_O_WORKDIR"
mpirun -np 1024 --machinefile $PBS_NODEFILE /home/user/Programs/orb5_intel/bin/orb5 >& orb5.out
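
One thing that can be checked from inside the job is how many entries PBS actually wrote to `$PBS_NODEFILE`, compared with the 1024 ranks passed to `mpirun -np`. This is a minimal sketch, not a diagnosis of the DAPL error: the helper names `count_slots` and `ranks_match_slots` are hypothetical, and whether the node file holds one line per rank depends on how the `select` statement (for example its `mpiprocs` value) and the site are configured.

```shell
#!/bin/bash
# Hypothetical helper: count the non-empty lines in a PBS node file.
# Assumption: each line stands for one MPI slot.
count_slots() {
    grep -c . "$1"
}

# Hypothetical helper: succeed only when the requested rank count
# equals the number of entries in the node file.
ranks_match_slots() {
    local ranks="$1" nodefile="$2"
    [ "$ranks" -eq "$(count_slots "$nodefile")" ]
}

# Run the check only inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    if ranks_match_slots 1024 "$PBS_NODEFILE"; then
        echo "rank count matches node file"
    else
        echo "node file lists $(count_slots "$PBS_NODEFILE") entries, not 1024" >&2
    fi
fi
```

Placed before the `mpirun` line, this only reports a mismatch and does not stop the job; what to do with the result is left to the cluster's own conventions.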

 

3. Your system details (OS, hardware, etc.)

We are running this code on a high-performance computing cluster with RHEL 7.5 installed, W2000h-W370h F4 chassis servers, and an InfiniBand EDR switch (100 Gb/s).

 

Thanks
kunfu

Iffa_Intel
Moderator
665 Views

Hi,


Thank you for your patience.

This issue might be related to the MPI library, or it could be something else.

We'll get back to you ASAP.



Cordially,

Iffa


kunfu
Beginner
636 Views

Thanks, Iffa!

I would appreciate any solutions or suggestions.

Iffa_Intel
Moderator
614 Views

Hi,



Could you confirm which Intel software you are using, check whether it is actually Intel Edge Controls for Industrial, and share that with us?

As mentioned previously, this might be related to the Intel MPI Library rather than Intel Edge Controls for Industrial.


Cordially,

Iffa


kunfu
Beginner
600 Views

 

Yes, Iffa, I think you are right; it might be related to Intel MPI. Where can I get help for this?

 

Thanks

kunfu

Iffa_Intel
Moderator
569 Views

 

You can reach the right experts for MPI here: Intel® HPC Toolkit

 

 

Cordially,

Iffa

Iffa_Intel
Moderator
484 Views

Hi,


Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question. 


Cordially,

Iffa

