I'm getting logged out of an interactive job almost immediately (in less than a minute), which makes the DevCloud impossible to use.
uXXXXX@login-2:~$ qsub -lnodes=1:gpu -lwalltime=02:00:00 -d. -I
qsub: waiting for job 490970.v-qsvr-1.aidevcloud to start
qsub: job 490970.v-qsvr-1.aidevcloud ready
########################################################################
# Date: Thu Feb 13 11:32:39 PST 2020
# Job ID: 490970.v-qsvr-1.aidevcloud
# User: uXXXXX
# Resources: neednodes=1:gpu,nodes=1:gpu,walltime=02:00:00
########################################################################
uXXXXX@s001-n141:~$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
490970.v-qsvr-1           STDIN            uXXXXX                 0 R batch
uXXXXX@s001-n141:~$ qstat -l -f
Job Id: 490970.v-qsvr-1.aidevcloud
Job_Name = STDIN
Job_Owner = uXXXXX@login-2.aidevcloud
job_state = R
queue = batch
server = v-qsvr-1.aidevcloud
Checkpoint = u
ctime = Thu Feb 13 11:32:37 2020
Error_Path = /dev/pts/0
exec_host = s001-n141/0
Hold_Types = n
interactive = True
Join_Path = n
Keep_Files = n
Mail_Points = n
mtime = Thu Feb 13 11:32:39 2020
Output_Path = /dev/pts/0
Priority = 0
qtime = Thu Feb 13 11:32:37 2020
Rerunable = False
Resource_List.nodect = 1
Resource_List.nodes = 1:gpu
Resource_List.walltime = 02:00:00
session_id = 9193
Variable_List = PBS_O_QUEUE=batch,PBS_O_HOME=/home/uXXXXX,
PBS_O_LOGNAME=uXXXXX,
PBS_O_PATH=/glob/development-tools/versions/intel-parallel-studio/com
pilers_and_libraries_2019.3.199/linux/bin/intel64:/glob/development-to
ols/versions/intel-parallel-studio/compilers_and_libraries_2019.3.199/
linux/mpi/intel64/libfabric/bin:/glob/development-tools/versions/intel
-parallel-studio/compilers_and_libraries_2019.3.199/linux/mpi/intel64/
bin:/glob/development-tools/versions/intel-parallel-studio/debugger_20
19/gdb/intel64/bin:/glob/intel-python/python3/bin/:/glob/intel-python/
python2/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/
bin:/usr/games:/usr/local/games:/home/uXXXXX/.local/bin:/home/uXXXXX/b
in:/bin,PBS_O_MAIL=/var/mail/uXXXXX,PBS_O_SHELL=/bin/bash,
PBS_O_LANG=en_US,
PBS_O_SUBMIT_FILTER=/usr/local/sbin/torque_submitfilter,
PBS_O_INITDIR=/home/uXXXXX/.,PBS_O_WORKDIR=/home/uXXXXX,
PBS_O_HOST=login-2.aidevcloud,PBS_O_SERVER=v-qsvr-1.aidevcloud
euser = uXXXXX
egroup = uXXXXX
queue_type = E
etime = Thu Feb 13 11:32:37 2020
submit_args = -lnodes=1:gpu -lwalltime=02:00:00 -d. -I
start_time = Thu Feb 13 11:32:39 2020
Walltime.Remaining = 7187
start_count = 1
fault_tolerant = False
job_radix = 0
submit_host = login-2.aidevcloud
Hi,
Thanks for reaching out to us.
Could you please try logging in to a GPU node with the command below and see whether the issue persists:
qsub -I -l nodes=1:gpu:ppn=2 -d.
You can also request a specific node with the command below:
qsub -I -l nodes=s001-n208:gpu:ppn=2 -d.
To list the nodes that have a GPU, run:
pbsnodes | grep -B4 "gen9"
Hope this helps. If you face any further issues, please let us know.
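The `grep -B4` filter above prints each matching properties line plus the four lines before it, which is where the node name sits. As a rough sketch of an alternative that prints only the node names, assuming `pbsnodes` output has the node name on an unindented line followed by indented `key = value` attributes including a `properties = ...` line. The function name `list_nodes_with_property` and the sample node names are hypothetical, for illustration only:

```shell
#!/bin/sh
# Hypothetical helper: reads pbsnodes-style text on stdin and prints the names
# of nodes whose "properties" line contains the given string (default: gen9).
# Assumes node names are unindented and attributes are indented.
list_nodes_with_property() {
    prop=${1:-gen9}
    awk -v prop="$prop" '
        /^[^ \t]/ { node = $1 }                                  # unindented line: remember node name
        /^[ \t]+properties = / && index($0, prop) { print node } # properties line mentioning prop
    '
}

# Made-up sample input in the assumed layout (not real DevCloud output).
printf 'node-a\n     state = free\n     properties = xeon,gpu,gen9\n\nnode-b\n     state = free\n     properties = xeon\n' |
    list_nodes_with_property gen9
```

On a login node the same function would be fed from the real command, for example `pbsnodes | list_nodes_with_property gen9`.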
Thanks, specifying ppn=2 prevents the logout. It seems odd that ppn is required for an interactive login session; I'm not sure how to interpret that parameter in this context.
kevin
Hi,
Glad to know that your issue got resolved.
qsub -I -l nodes=1:gpu:ppn=2 -d.
In the above command, ppn=2 (processors per node) requests 2 processors on the one requested node.
Please refer to the link below for more details on job submission.
http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php
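To make the pieces of the resource request explicit, here is a minimal sketch that assembles the `-l nodes=...` string from its three parts: node count, node property, and processors per node. The helper name `build_resource_request` is hypothetical, and the script only prints the resulting `qsub` command rather than running it. The walltime value is taken from the original submission in this thread.

```shell
#!/bin/sh
# Hypothetical helper: builds a Torque resource request of the form
#   nodes=<count>:<property>:ppn=<processors per node>
build_resource_request() {
    nodes=$1      # number of nodes to request
    property=$2   # node property, e.g. gpu
    ppn=$3        # processors per node
    printf 'nodes=%s:%s:ppn=%s' "$nodes" "$property" "$ppn"
}

request=$(build_resource_request 1 gpu 2)

# Print the command instead of executing it, so this runs without a scheduler.
echo "qsub -I -l $request -l walltime=02:00:00 -d ."
```

Copying the printed line into a login-node shell would submit the same kind of interactive request that resolved the logout in this thread.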
Hi,
Could you please confirm whether the solution provided was helpful?
Yes, setting ppn=2 prevents the logout and I can use the compute node normally.
kevin
Hi,
Thanks for your confirmation.
We are closing this thread. Please feel free to open a new thread if you have any further queries.
