I have a job that has been frozen in DevCloud for 81 hours.
Apparently it was submitted to node s005-n005, but this node does not appear in the output of the pbsnodes command.
I tried to kill the job with:
qdel 18216.v-qsvr-fpga.aidevcloud
but I got this response:
qdel: Server could not connect to MOM 18216.v-qsvr-fpga.aidevcloud
What can I do?
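The check described above (whether the node assigned to the job still appears in the pbsnodes output) can be sketched as a small shell helper. This is only an illustration: the function name `node_listed` is hypothetical, and it assumes that node names are the non-indented lines of the pbsnodes listing, with each node's attributes indented below its name.

```shell
#!/bin/sh
# Sketch: report whether a node name appears in a pbsnodes-style listing.
# Assumption: node names are the non-indented lines of the listing and
# each node's attributes are indented below its name.

# node_listed NODE: reads a node listing on stdin; succeeds if NODE is in it.
node_listed() {
    awk -v node="$1" '
        /^[^ \t]/ && $1 == node { found = 1 }   # non-indented line naming the node
        END { exit !found }                     # exit status 0 only if it was seen
    '
}

# Intended use on the cluster (not run here):
#   pbsnodes | node_listed s005-n005 || echo "s005-n005 is not listed"
```

If the helper reports that the node is missing, the server has no node through which to reach the job, which would be consistent with the "could not connect to MOM" response.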
My job is still frozen after 100 hours (4 days).
I've seen that others have had similar problems in the past.
As you can see (https://helpful.knobs-dials.com/index.php/PBS_notes#qdel:_Server_could_not_connect_to_MOM), this is a known problem with PBS. This makes sense, since the node that was assigned to my job no longer appears in the pbsnodes output.
It looks like it could be solved simply by executing, with admin privileges:
qdel -p 18216.v-qsvr-fpga.aidevcloud
Is there a sysadmin who can execute this command for me?
Can anyone help me? @Gopika_Intel? @RaeesaM_Intel? @AnilErinch_A_Intel?
Hi,
Thank you for posting your query in Intel DevCloud.
Sorry for the inconvenience. From your log, we understand that you are working in the FPGA DevCloud. We have a dedicated team that handles FPGA-related issues, and we are forwarding this query to that team for a faster response. This forum handles queries and issues related to the oneAPI DevCloud.
Regards
Gopika
I very much appreciate your answer.
In fact, the only relation between the issue and FPGAs is that the PBS queue is targeting a node with FPGA support.
The problem is with the PBS software.
I also posted to the "Application Acceleration With FPGAs" group, and had already posted to the "Intel High Level Design" group some days ago, but no one answered.
There is an answer from @AnilErinch_A_Intel to the same question from @Ziaul some time ago, but it refers to a broken link.
Hi,
I have forwarded your issue to the owner of this DevCloud platform and am waiting to hear back. I will ask them to reply to your post directly. Please give us a couple of days on this.
-Hazlina
I'm sorry to insist, but do you have any news, @Hazlina_R_Intel?
231 hours and counting...
v-qsvr-fpga.aidevcloud:
                                                                                  Req'd     Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID NDS   TSK    Memory    Time      S Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
18216.v-qsvr-fpga.aide  u57927      batch    build_stratix10. 116536 1     2      --        06:00:00  R 231:41:49
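The listing shows a job in state R with 231:41:49 elapsed against a requested time of 06:00:00. As a rough sketch, that comparison can be automated with two small shell helpers. The function names `hms_to_seconds` and `over_walltime` are hypothetical, and the sketch assumes both times are given in the HH:MM:SS form shown in the listing.

```shell
#!/bin/sh
# Sketch: flag a job whose elapsed time exceeds its requested time.
# Assumption: both values use the HH:MM:SS form shown in the listing.

# hms_to_seconds HH:MM:SS: print the duration in seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# over_walltime REQUESTED ELAPSED: succeeds if ELAPSED is longer than REQUESTED.
over_walltime() {
    [ "$(hms_to_seconds "$2")" -gt "$(hms_to_seconds "$1")" ]
}

# Values taken from the listing above.
if over_walltime 06:00:00 231:41:49; then
    echo "job has run past its requested time"
fi
```

A job that is still marked as running long after its requested time has passed is a sign that the server has lost contact with the node executing it.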
Hi David,
The platform owner has been responding to you on your other thread. He also asked that you send an email directly to fpgauniversity@intel.com in case you did not get a response.
I know you mentioned that you have sent a couple of emails to that inbox; please do follow up from there. It is the direct inbox for FPGA DevCloud platform support. Sorry for the delay here. The platform owner is doing their best to resolve this issue.
-Hazlina