If I request a node with a GPU on the DevCloud using the command
qsub -I -l nodes=1:gpu:ppn=2 -d .
and the first node in the list is not available, the qsub command fails and I get no access, even though other nodes with a GPU are actually available.
The list of nodes with a GPU can be obtained with the following command:
pbsnodes | grep -B4 "gen9"
It would be better to improve the node selection logic: if the first node is not accessible for some reason, the search should continue until an available node is found.
Thank you for reaching out to us.
Please note that on DevCloud, a request for a node with a GPU is assigned whichever node from the list is free at that moment. When the qsub command fails, kindly check whether any node with a GPU is free (using the pbsnodes command).
You can also request a specific node using the command below:
qsub -I -l nodes=s001-n158:gpu:ppn=2 -d .
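As a rough sketch of the check described above, the two steps (find a free GPU node, then request it by name) could be combined with a small shell helper. The function name free_gpu_nodes is made up for illustration, and the parsing assumes the usual pbsnodes layout: an unindented node name followed by indented "state = ..." and "properties = ..." lines, with GPU nodes carrying the "gen9" property as in the grep command earlier in this thread.

```shell
# Hypothetical helper: reads pbsnodes output on stdin and prints the names
# of nodes whose state is exactly "free" and whose properties mention gen9.
# Assumes each node entry starts with an unindented node name, followed by
# indented "state = ..." and "properties = ..." lines.
free_gpu_nodes() {
  awk '
    /^[^[:space:]]/ { node = $1; state = "" }          # new node entry
    /^[[:space:]]+state = / { state = $3 }             # remember its state
    /^[[:space:]]+properties = / && /gen9/ && state == "free" { print node }
  '
}

# Possible usage on a DevCloud login node (not executed here):
#   node=$(pbsnodes | free_gpu_nodes | head -n 1)
#   qsub -I -l nodes="$node":gpu:ppn=2 -d .
```

If the helper prints nothing, no GPU node was reported as free at that moment, which would be consistent with the qsub request failing.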
I apologize for the delay. It is fine if we really do get the first free node. But last week, when I tried the command from my first post, it failed. When I requested the list of nodes with a GPU using the pbsnodes command, I saw that the first node in the list was not available, but the others were. After that, I was able to get a particular node with a GPU as you described above.
Possibly there is a bug that prevents the request from getting the first available node when the first node in the list is not available. This could cause difficulties for other users requesting nodes.
Thanks for the update, Micheal. We haven't observed any such issue on our end. The next time you observe such an issue, please provide relevant screenshots so that we can report it to the concerned team.
Kindly let us know if the solution provided helped so that we can close this case.