Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Frozen Job in DevCloud

davidcastells
New Contributor I
449 Views

I have a job in DevCloud that has been frozen for 81 hours.

Apparently it was submitted to node s005-n005, but this node does not appear in the output of the pbsnodes command.

I tried to kill the job with:

qdel 18216.v-qsvr-fpga.aidevcloud

but I got this response:
qdel: Server could not connect to MOM 18216.v-qsvr-fpga.aidevcloud

What can I do?

7 Replies
davidcastells
New Contributor I
424 Views

My job is still frozen after 100 hours (4 days).

I've seen others having similar problems in the past.

As you can see (https://helpful.knobs-dials.com/index.php/PBS_notes#qdel:_Server_could_not_connect_to_MOM), this is a known PBS problem. That makes sense, since the node assigned to my job no longer appears in the pbsnodes output.

It looks like it could be solved, with admin privileges, by simply executing:

  qdel -p 18216.v-qsvr-fpga.aidevcloud

Is there a sysadmin who can execute this command for me?

Can anyone help me? @Gopika_Intel? @RaeesaM_Intel? @AnilErinch_A_Intel?

Gopika_Intel
Moderator
415 Views

Hi,

Thank you for posting your query in Intel DevCloud.

Sorry for the inconvenience. From your log, we understand that you are working in the FPGA DevCloud. We have a dedicated team that handles FPGA-related issues, and we are forwarding this query to that team for a faster response. This forum handles queries and issues related to the oneAPI DevCloud.

Regards,

Gopika


davidcastells
New Contributor I
404 Views

I very much appreciate your answer.

In fact, the only relation between this issue and FPGAs is that the PBS queue is targeting a node with FPGA support.

The problem is with the PBS software.

I also posted to the "Application Acceleration With FPGAs" group, and in this group, "Intel High Level Design", some days ago, but no one answered.

There is an answer from @AnilErinch_A_Intel to the same question from @Ziaul some time ago, but it refers to a broken link.

Hazlina_R_Intel
Moderator
392 Views

Hi,

I have forwarded your issue to the owner of this DevCloud platform and am waiting to hear back. I will ask them to reply to your post directly. Please give us a couple of days on this.


-Hazlina


davidcastells
New Contributor I
372 Views

I'm sorry to insist, but do you have any news, @Hazlina_R_Intel?

231 hours, and going...

 

v-qsvr-fpga.aidevcloud:
                                                                                      Req'd     Req'd        Elap
Job ID                  Username    Queue    Jobname          SessID   NDS    TSK    Memory      Time S      Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
18216.v-qsvr-fpga.aide  u57927      batch    build_stratix10. 116536     1      2        --  06:00:00 R 231:41:49
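
For reference, here is a minimal Python sketch of how far the job in the listing above has run past its requested walltime. It assumes that the last three whitespace-separated fields of a data row are the requested time, the state, and the elapsed time, as they appear in that listing. The helper names hms_to_seconds and overrun_seconds are hypothetical and are not part of PBS.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def overrun_seconds(row: str) -> int:
    """Return elapsed time minus requested time, in seconds, for one data row.

    Assumption: the row ends with "<requested time> <state> <elapsed time>".
    """
    requested, _state, elapsed = row.split()[-3:]
    return hms_to_seconds(elapsed) - hms_to_seconds(requested)


# Data row copied from the listing above.
row = ("18216.v-qsvr-fpga.aide u57927 batch build_stratix10. "
       "116536 1 2 -- 06:00:00 R 231:41:49")

over = overrun_seconds(row)
print(f"{over // 3600}:{over % 3600 // 60:02d}:{over % 60:02d} past the requested time")
```

A positive result for a job still in state R suggests the scheduler is no longer enforcing the walltime limit for that job, which is consistent with the node having dropped out of the pbsnodes output.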

 

Hazlina_R_Intel
Moderator
365 Views

Hi David,

The platform owner has been responding to you on your other thread. He also asked that you send an email directly to this address in case you did not get a response: fpgauniversity@intel.com


I know you mentioned that you have sent a couple of emails to that inbox; please follow up from there. It is the direct inbox for FPGA DevCloud platform support. Sorry for the delay here. The platform owner is doing their best to resolve this issue.


-Hazlina

