Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

Ice Lake nodes unavailable

venovako
Novice
1,891 Views

It seems that the Ice Lake nodes (with properties icx,plat8380,batch) are down and unavailable to run jobs.

When are they expected to become available again?

0 Kudos
10 Replies
VaradJ_Intel
Moderator
1,850 Views

Hi,


Thanks for posting in Intel communities.


Thanks for reporting this issue. We have informed the development team about it.


Thank You.


0 Kudos
VaradJ_Intel
Moderator
1,836 Views

Hi,


The issue is resolved now. Could you please check and confirm from your side?


Thank You.



0 Kudos
venovako
Novice
1,831 Views

Hi,

 

There are still some Ice Lake nodes with status=down.

Furthermore, when a job is submitted with "-l nodes=1:plat8380:ppn=2", it is queued but quickly disappears from the qstat output without being launched. No job output or error files are generated.

 

I hope this helps; please let me know if you need more details about the job.
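
To see which nodes carrying a given property are down, the pbsnodes listing can be filtered per node. The following is only a sketch: the helper name nodes_with_property is made up for illustration, and it assumes the usual Torque layout in which each record starts with the node name in the first column, followed by indented "state = ..." and "properties = ..." lines with a comma-separated property list.

```shell
# Hypothetical helper: reads pbsnodes-style text on stdin and prints
# "<node> <state>" for every node whose property list contains $1.
nodes_with_property() {
    awk -v prop="$1" '
        # Print the record collected so far if it carries the property.
        function emit() {
            if (name != "" && has) print name, state
            name = ""; state = ""; has = 0
        }
        # A line starting in column one is assumed to be a node name.
        /^[^[:space:]]/ { emit(); name = $1; next }
        $1 == "state" && $2 == "=" { state = $3 }
        $1 == "properties" && $2 == "=" {
            n = split($3, p, ",")
            for (i = 1; i <= n; i++) if (p[i] == prop) has = 1
        }
        END { emit() }
    '
}
```

On a login node this would be used as "pbsnodes | nodes_with_property plat8380", which should list each matching node together with its current state.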

0 Kudos
venovako
Novice
1,818 Views

P.S. More info if you find it useful:

When, instead of submitting a script, an interactive session is requested:

qsub -I -l nodes=1:plat8380:ppn=2

the job gets deleted from the queue quickly:

qsub: waiting for job 1918161.v-qsvr-1.aidevcloud to start
qsub: job 1918161.v-qsvr-1.aidevcloud apparently deleted

But if "icx" is requested instead of the property "plat8380",

qsub -I -l nodes=1:icx:ppn=2

the output is:

qsub: waiting for job 1918162.v-qsvr-1.aidevcloud to start
qsub: job 1918162.v-qsvr-1.aidevcloud ready

########################################################################
#      Date:           Wed 01 Jun 2022 03:46:21 PM PDT
#    Job ID:           1918162.v-qsvr-1.aidevcloud
#      User:           u153810
# Resources:           neednodes=1:icx:ppn=2,nodes=1:icx:ppn=2,walltime=06:00:00
########################################################################

and control is not returned to the shell, i.e., there is no further response until whatever is preventing the shell prompt from appearing is killed with Ctrl+C.

After that, the shell on the node s002-n001 displays the prompt and my executables can be launched normally.

P.P.S. However, that particular node has Platinum 8358, not 8380 CPUs... will the latter also become available?

0 Kudos
VaradJ_Intel
Moderator
1,798 Views

Hi,


Thank you for giving us feedback. We are working on this internally. We will get back with an update soon.


Thank You.


0 Kudos
venovako
Novice
1,735 Views

Are there any updates on this?

 

Thank you.

0 Kudos
VaradJ_Intel
Moderator
1,728 Views

Hi,

 

At this moment there is no visibility into when it will be implemented and available for use.

We apologize for the inconvenience.

 

Thank You

 

0 Kudos
VaradJ_Intel
Moderator
1,351 Views

Hi,

 

Good day to you

 

We currently have only one Ice Lake node with the plat8380 property working: node s042-n005.

 

Before connecting to it you can first check its state using the following command:

 

pbsnodes | grep -B4 plat8380 | grep s042-n005 -A2

 

If it reports 'state = free' and 'power_state = Running', you can connect to it in interactive mode with the following command:

 

qsub -I -l nodes=s042-n005:ppn=2 -d .
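
The state check and the interactive login can be chained so that qsub only runs when the node looks usable. This is a minimal sketch, not an official DevCloud script: the function name node_is_usable is hypothetical, and it assumes that pbsnodes prints the two attributes as indented "state = ..." and "power_state = ..." lines.

```shell
# Hypothetical helper: reads pbsnodes-style text for ONE node on stdin and
# succeeds only if it reports state = free and power_state = Running.
node_is_usable() {
    awk '
        /^[[:space:]]*state = /       { state = $3 }
        /^[[:space:]]*power_state = / { power = $3 }
        END { exit !(state == "free" && power == "Running") }
    '
}
```

A possible use, assuming pbsnodes accepts a node name as its argument: "pbsnodes s042-n005 | node_is_usable && qsub -I -l nodes=s042-n005:ppn=2 -d ."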

 

You can also submit a batch job to it with the following command:

 

qsub -l nodes=s042-n005:ppn=2 <job file>
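
As an illustration of what the job file might contain, here is a small sketch that reports the host name and CPU model, which is one way to confirm that the job really landed on a Platinum 8380 node. The job name cpu-check and the function report_node are made up for this example; the #PBS lines are standard Torque directives and repeat the resource request from the command above.

```shell
#!/bin/bash
#PBS -l nodes=s042-n005:ppn=2
#PBS -N cpu-check
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the submission directory; fall back to the
# current directory when the script is run outside of a job.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Print the node name and the first CPU model line the kernel reports.
report_node() {
    echo "Host: $(uname -n)"
    grep -m1 'model name' /proc/cpuinfo 2>/dev/null || echo "CPU model not reported"
}

report_node
```

Saved under a name of your choice, for example cpu-check.sh, it could be submitted with "qsub cpu-check.sh"; with the -j oe directive, standard output and standard error should end up in a single output file.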

 

 

Please do let us know if you still face any issues.

 

If this resolves your issue, make sure to accept this as a solution. This would help others with similar issues. 

 

Thank You!

 

0 Kudos
venovako
Novice
1,341 Views

Thank you, interactive login to that node works.

0 Kudos
VaradJ_Intel
Moderator
1,336 Views

Hi,


Good day to you


Thanks for accepting our solution. 


If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Thank You!


0 Kudos