Kevin_H_3
Beginner
120 Views

Immediate Logout

I'm getting logged out of an interactive job almost immediately (in less than a minute). This makes the DevCloud impossible to use.

 

uXXXXX@login-2:~$ qsub -lnodes=1:gpu -lwalltime=02:00:00 -d. -I

qsub: waiting for job 490970.v-qsvr-1.aidevcloud to start

qsub: job 490970.v-qsvr-1.aidevcloud ready

 

 

########################################################################

#      Date:           Thu Feb 13 11:32:39 PST 2020

#    Job ID:           490970.v-qsvr-1.aidevcloud

#      User:           uXXXXX

# Resources:           neednodes=1:gpu,nodes=1:gpu,walltime=02:00:00

########################################################################

 

uXXXXX@s001-n141:~$ qstat

Job ID                    Name             User            Time Use S Queue

------------------------- ---------------- --------------- -------- - -----

490970.v-qsvr-1            STDIN            uXXXXX                 0 R batch          

uXXXXX@s001-n141:~$ qstat -l -f

Job Id: 490970.v-qsvr-1.aidevcloud

    Job_Name = STDIN

    Job_Owner = uXXXXX@login-2.aidevcloud

    job_state = R

    queue = batch

    server = v-qsvr-1.aidevcloud

    Checkpoint = u

    ctime = Thu Feb 13 11:32:37 2020

    Error_Path = /dev/pts/0

    exec_host = s001-n141/0

    Hold_Types = n

    interactive = True

    Join_Path = n

    Keep_Files = n

    Mail_Points = n

    mtime = Thu Feb 13 11:32:39 2020

    Output_Path = /dev/pts/0

    Priority = 0

    qtime = Thu Feb 13 11:32:37 2020

    Rerunable = False

    Resource_List.nodect = 1

    Resource_List.nodes = 1:gpu

    Resource_List.walltime = 02:00:00

    session_id = 9193

    Variable_List = PBS_O_QUEUE=batch,PBS_O_HOME=/home/uXXXXX,

PBS_O_LOGNAME=uXXXXX,

PBS_O_PATH=/glob/development-tools/versions/intel-parallel-studio/com

pilers_and_libraries_2019.3.199/linux/bin/intel64:/glob/development-to

ols/versions/intel-parallel-studio/compilers_and_libraries_2019.3.199/

linux/mpi/intel64/libfabric/bin:/glob/development-tools/versions/intel

-parallel-studio/compilers_and_libraries_2019.3.199/linux/mpi/intel64/

bin:/glob/development-tools/versions/intel-parallel-studio/debugger_20

19/gdb/intel64/bin:/glob/intel-python/python3/bin/:/glob/intel-python/

python2/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/

bin:/usr/games:/usr/local/games:/home/uXXXXX/.local/bin:/home/uXXXXX/b

in:/bin,PBS_O_MAIL=/var/mail/uXXXXX,PBS_O_SHELL=/bin/bash,

PBS_O_LANG=en_US,

PBS_O_SUBMIT_FILTER=/usr/local/sbin/torque_submitfilter,

PBS_O_INITDIR=/home/uXXXXX/.,PBS_O_WORKDIR=/home/uXXXXX,

PBS_O_HOST=login-2.aidevcloud,PBS_O_SERVER=v-qsvr-1.aidevcloud

    euser = uXXXXX

    egroup = uXXXXX

    queue_type = E

    etime = Thu Feb 13 11:32:37 2020

    submit_args = -lnodes=1:gpu -lwalltime=02:00:00 -d. -I

    start_time = Thu Feb 13 11:32:39 2020

    Walltime.Remaining = 7187

    start_count = 1

    fault_tolerant = False

    job_radix = 0

    submit_host = login-2.aidevcloud

6 Replies
AthiraM_Intel
Moderator

Hi,

Thanks for reaching out to us.

Could you please try logging in to a GPU node using the command below and see if the issue persists:

qsub -I -l nodes=1:gpu:ppn=2 -d.

You can also request a specific node using the command below:

qsub -I -l nodes=s001-n208:gpu:ppn=2 -d.

The list of nodes with a GPU can be obtained with the following command:

pbsnodes | grep -B4 "gen9"
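If the four lines of context from `-B4` are too noisy, a small awk filter can print only the node names. This is a minimal sketch, assuming the usual Torque `pbsnodes` layout (an unindented node name followed by indented `key = value` lines, one of which is `properties`); `gpu_nodes` is a hypothetical helper name, not a DevCloud or Torque command.

```shell
# Hypothetical helper: print the names of nodes whose properties include a tag.
# Assumes pbsnodes-style input: node name at column 0, attributes indented.
gpu_nodes() {
    tag="${1:-gen9}"
    awk -v tag="$tag" '
        /^[^ \t]/ { node = $1 }                 # unindented line: remember the node name
        /^[ \t]+properties = / {
            sub(/^[ \t]+properties = /, "")     # keep only the comma-separated list
            n = split($0, props, ",")
            for (i = 1; i <= n; i++)
                if (props[i] == tag) { print node; break }
        }
    '
}

# Usage on a login node (not run here):
#   pbsnodes | gpu_nodes gen9
```

Matching the tag against whole comma-separated properties avoids false hits on node names or status strings that merely contain the text `gen9`.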

Hope this helps. If you face any further issues, please let us know.

Kevin_H_3
Beginner

Thanks, specifying ppn=2 prevents the logout. It seems odd that ppn is required for an interactive login session. I'm not sure how to interpret that parameter in this context.

kevin

 

AthiraM_Intel
Moderator

Hi,

Glad to know that your issue is resolved.

qsub -I -l nodes=1:gpu:ppn=2 -d.  

In the above command, ppn=2 requests 2 processors per node; with nodes=1, that is 2 processors on a single node.
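One way to see what the request actually granted is to count the entries in the job's node file from inside the interactive session. This is a sketch, assuming Torque's standard behaviour of writing one line per allocated processor slot to the file named by `$PBS_NODEFILE`; `count_slots` is a hypothetical helper name.

```shell
# Hypothetical helper: count the processor slots allocated to the current job.
# Torque lists one line per allocated slot in the file named by $PBS_NODEFILE.
count_slots() {
    nodefile="${1:-${PBS_NODEFILE:-}}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "no readable node file (not inside a job?)" >&2
        return 1
    fi
    # Each non-empty line is one slot.
    grep -c . "$nodefile"
}

# Usage inside an interactive job (not run here):
#   count_slots
```

With `nodes=1:gpu:ppn=2` the count would be expected to be 2. This only confirms the allocation; it does not explain why the session without ppn was being logged out.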

Please refer to the link below for more details on job submission.

http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php

AthiraM_Intel
Moderator

Hi,

Could you please confirm whether the solution provided was helpful?

Kevin_H_3
Beginner

Yes, setting ppn=2 prevents the logout and I can use the compute node normally.

kevin

 

AthiraM_Intel
Moderator

Hi,

Thanks for your confirmation.

We are closing this thread. Please feel free to open a new thread if you have any further queries.
