Intel® MPI Library

dapl async_event CQ (0x1fdb8e0) ERR 0

Ambika1
Beginner

Dear Support official,

 

I am facing a problem while submitting jobs on an HPC cluster. We submit jobs through Slurm, but the MPI processes are launched through mpiexec.hydra.

When we request a large number of tasks, the simulation stops midway and prints the following messages in the log, after which it does not progress any further.

For example, when submitting a job with 240 tasks and more than 17 GB of memory per task (run.script is attached for your reference), we get the following error:

 

"hm010:UCM:dec1:1f7d2700: 781258369 us(781258369 us!!!): dapl async_event CQ (0x1fdb8e0) ERR 0
hm010:UCM:dec1:1f7d2700: 781258411 us(42 us): -- dapl_evd_cq_async_error_callback (0x1f7d700, 0x18171f0, 0x2b351f7d1d30, 0x1fdb8e0)
hm010:UCM:dec1:1f7d2700: 781258429 us(18 us): dapl async_event QP (0x55c0cf0) Event 1"

 

Can you please help us resolve this problem? I am a novice in this field.

 

Thank you.

 

Sincerely,

Ambika

 

 

RabiyaSK_Intel
Employee

Hi,


Thanks for posting in Intel Communities.


Could you please provide the following information so that we can reproduce your issue on our end:

1. Are you using Intel oneAPI toolkits? If so, please specify the version of the toolkit.

2. OS, CPU, and hardware details.

3. Could you please provide a sample reproducer along with the steps to reproduce?


Thanks & Regards,

Shaik Rabiya


Ambika1
Beginner

Dear M(r)s Rabiya,

 

Please find my response to your queries below:

 

1. I am using QuantumATK, which uses Intel's mpiexec.hydra to launch MPI processes. I am not sure whether it comes under oneAPI.

2. I am trying to run simulations on an HPC cluster running CentOS 7. The system has 384 nodes, each with the following configuration:

    2 × Intel Xeon SKL G-6148, Cores = 40, 2.4 GHz, Memory = 192 GB, DDR4 2666 MHz.

3. Since I am using proprietary software, I won't be able to send you a sample script. However, I am willing to share my screen/computer through a suitable app such as AnyDesk and collaborate so that you can look into the issue.
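
As a quick sanity check on the numbers quoted in this thread (240 tasks, more than 17 GB per task, 192 GB and 40 cores per node), the arithmetic can be sketched as below. This is only a rough estimate under stated assumptions: it ignores memory used by the operating system, treats 17 GB as a lower bound, and assumes a hypothetical "one task per core" placement for comparison. Whether memory pressure has anything to do with the DAPL error is not established anywhere in this thread.

```python
import math

# Figures quoted in this thread.
NODE_MEMORY_GB = 192   # memory per node
CORES_PER_NODE = 40    # cores per node
TOTAL_TASKS = 240      # MPI tasks in the failing job
MEM_PER_TASK_GB = 17   # lower bound: the job needs "more than 17GB per task"


def max_tasks_per_node(node_mem_gb: float, mem_per_task_gb: float) -> int:
    """Largest number of tasks whose combined memory fits on one node."""
    return int(node_mem_gb // mem_per_task_gb)


def min_nodes_needed(total_tasks: int, tasks_per_node: int) -> int:
    """Smallest number of nodes that can hold all tasks at that density."""
    return math.ceil(total_tasks / tasks_per_node)


tasks_per_node = max_tasks_per_node(NODE_MEMORY_GB, MEM_PER_TASK_GB)
nodes = min_nodes_needed(TOTAL_TASKS, tasks_per_node)

# Assumption for comparison only: one task placed on every core of a node.
full_node_demand_gb = CORES_PER_NODE * MEM_PER_TASK_GB

print(f"Tasks that fit in memory per node: {tasks_per_node}")
print(f"Minimum nodes for {TOTAL_TASKS} tasks: {nodes}")
print(f"Memory needed if all {CORES_PER_NODE} cores run a task: {full_node_demand_gb} GB")
```

With these figures, at most 11 tasks fit in one node's memory, so 240 tasks need at least 22 nodes, while filling all 40 cores of a node would require 680 GB against the 192 GB available. It may be worth comparing this with the tasks-per-node setting in the run script.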

 

I have no experience or knowledge of MPI programming and am therefore depending entirely on your help.

 

Thank you very much.

Sincerely

Ambika.

 

RabiyaSK_Intel
Employee

Hi,

 

>>>I am using QuantumATK, which uses Intel's mpiexec.hydra to launch MPI processes. I am not sure whether it comes under oneAPI.

Could you please confirm which versions of the Intel oneAPI HPC Toolkit and the Intel MPI Library you are using?

 

>>>I am trying to run simulations on an HPC cluster running CentOS 7.

If you are using the latest Intel MPI Library for mpiexec.hydra, CentOS 7 is not a supported operating system. Please go through the links below, try on a supported system, and let us know if you face any issues:

 

MPI Library System Requirements:

https://www.intel.com/content/www/us/en/developer/articles/system-requirements/mpi-library-system-requirements.html

 

HPC Toolkit System Requirements:

https://www.intel.com/content/www/us/en/developer/articles/system-requirements/intel-oneapi-hpc-toolkit-system-requirements.html

 

>>>Since I am using proprietary software, I won't be able to send you a sample script. However, I am willing to share my screen/computer through a suitable app such as AnyDesk and collaborate so that you can look into the issue.

For Community support, we require a minimal reproducer specific to your issue that gives us the most relevant background information for triage. If you need privacy and are unable to share the issue or sample with us publicly, and you are a licensed oneAPI product customer and/or a member of Intel's oneAPI Academic Program, please submit a ticket for Priority Support so that your application can be handled in line with the required data protection and privacy regulations.

 

Thanks & Regards,

Shaik Rabiya

 

Ambika1
Beginner

Dear M(r)s Rabiya,

 

Thank you very much. I will contact the HPC administrator to find out about the Intel oneAPI subscription and let you know as soon as possible.

RabiyaSK_Intel
Employee

Hi,


We haven't heard back from you. We are waiting for your response to our previous reply. 


Could you please provide all the necessary details that we requested?


Thanks & Regards,

Shaik Rabiya


Ambika1
Beginner

Dear M(r)s Rabiya,

 

Our QuantumATK is the 2019 version, so it is compatible with CentOS 7.

 

I have attached the run script. To run it, you need a QuantumATK license, which is available at our institute and is not accessible from outside. You can access it by using our system.

 

Initially we had an academic subscription from Intel, but its contract term is now over. However, the HPC administrator will probably initiate a new subscription.

Ambika1
Beginner

Dear M(r)s Rabiya,

We are not using Intel's mpiexec.hydra. QuantumATK has its own mpiexec.hydra.

If you are able to, kindly help us resolve this problem.

 

RabiyaSK_Intel
Employee

Hi,


We can only offer direct support for Intel hardware platforms that the Intel® oneAPI product supports. 


Could you please confirm which Intel component is being used here?


Thanks & Regards,

Shaik Rabiya


RabiyaSK_Intel
Employee

Hi,


We haven't heard back from you. Could you please respond to my previous post?


Thanks & Regards,

Shaik Rabiya


RabiyaSK_Intel
Employee

Hi,


We haven't heard back from you. If you need any additional information, you can raise a new question as this thread will no longer be monitored by Intel.


Thanks & Regards,

Shaik Rabiya

