Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

limit qsub tasks running simultaneously

tripto__eitamar
Beginner

Hello,

I have a few hundred jobs that I want to run on the DevCloud.

Every job creates temporary files, which causes the storage to reach its maximum and then kills all the processes (when 10 jobs run simultaneously).

I wanted to know if there is a way to ensure that only 2-3 jobs (submitted via qsub) run at a time.

Thank you,

Eitamar

JEYANTHKRI_N_Intel
Hi, thanks for reaching out to us. Your jobs are presumably storing files in your home folder, and that is why your storage space is being consumed: the home folder is NFS-shared between the login node and the compute nodes. We recommend saving temporary files in the /tmp folder of the compute node instead, since /tmp is cleared at the end of every job. Please let us know if this resolves your issue.
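
For example, a minimal job script along these lines (a sketch only; the job name, script and result file are placeholders) keeps the temporary files on the compute node's local /tmp and copies only the results back home:

#!/bin/bash
#PBS -N tmp_example
# work in a per-job scratch directory on the compute node's local disk
SCRATCH=/tmp/$PBS_JOBID
mkdir -p $SCRATCH
cd $SCRATCH
# run the workload (placeholder script); temporary files land in $SCRATCH
python $PBS_O_WORKDIR/script.py
# copy results back to the NFS-shared submission directory, then clean up
cp results.txt $PBS_O_WORKDIR/
rm -rf $SCRATCH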
tripto__eitamar
Beginner

Thank you for your quick reply.

I have two follow-up questions:

Does the /tmp folder of the compute node have a storage limit?

And is that storage part of the 200 GB assigned to every account?

JEYANTHKRI_N_Intel

Hi,

The /tmp folder on a compute node holds 28 GB. It is not part of the 200 GB of storage assigned to each account; it is shared across all compute nodes, and 90% of it is already in use.

To make better use of the storage assigned to you, please include file deletion as part of your script. To restrict the number of jobs that run at a time, check out the -t option described at the following link:
http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/commands/qsub.htm
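
With -t, the whole set goes in as a single array job, and the %n slot limit caps how many sub-jobs run concurrently. A sketch, assuming a single script launch.sh that picks its per-task input from the $PBS_ARRAYID index Torque sets for each sub-job (the --task flag is hypothetical; how your script consumes the index is up to you):

# submit sub-jobs 1..20 as one array, at most 5 running at a time
qsub -t 1-20%5 launch.sh

where launch.sh might look like:

#!/bin/bash
cd $PBS_O_WORKDIR
# $PBS_ARRAYID is 1..20 here; --task is a hypothetical flag for your script
python script.py --task $PBS_ARRAYID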

tripto__eitamar
Beginner

Thanks!
I'll try the -t option once the current jobs end.

JEYANTHKRI_N_Intel

Hi,

Could you please let us know if the solution provided helped? 

tripto__eitamar
Beginner

It seems I don't understand the syntax of these commands; it didn't help me. I'll try to be clearer about my problem:

I have ~20 files named "launch{i}" (i = 1, 2, 3, ..., 20), each looking like this:

#PBS
cd /home/u31124/
/usr/bin/time python script.py -p TARDBP -c 5 -A E

 

I am trying to run them using this loop script:

#!/bin/bash
for i in $(seq 1 26); do
        qsub launch$i
done

 

I would like to submit all 20 jobs but limit them to running 5 at a time.

When I changed the loop script to use the "-t" option like this:

#!/bin/bash
for i in $(seq 1 26); do
        qsub launch$i -t 0-10%4
done

I can submit only 9 jobs, and then I get this response:

qsub: submit error (Maximum number of jobs already in queue for user MSG=total number of current user's jobs exceeds the queue limit: user u31124@login-1.aidevcloud, queue batch)
 

Lakshmi_U_Intel
Employee

Hi,

Could you please try the snippet below? We have verified it for a single chain of jobs, and it works as per your requirement.

You just need to modify this code to run 5 jobs at a time.

 

#!/bin/bash
# submit the first job and capture its job ID
one=$(qsub script.sh)
echo $one
# chain each following job so it starts only after the previous one finishes
for id in $(seq 2 4); do
        two=$(qsub -W depend=afterok:$one script.sh)
        one=$two
done
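
To reach 5 jobs running at a time with the same idea, one possible extension (a sketch only, assuming your launch files are named launch1 ... launch20 as above) is to round-robin the 20 jobs over 5 independent dependency chains, so at most one job per chain, i.e. 5 in total, runs at once:

#!/bin/bash
NCHAINS=5
NJOBS=20
declare -a last                      # last[c] holds the newest job ID in chain c
for i in $(seq 1 $NJOBS); do
        c=$(( (i - 1) % NCHAINS ))   # round-robin jobs over the chains
        if [ -z "${last[$c]}" ]; then
                last[$c]=$(qsub launch$i)   # first job of a chain: no dependency
        else
                last[$c]=$(qsub -W depend=afterok:${last[$c]} launch$i)
        fi
done

Note that afterok releases the next job only if the previous one exits successfully; depend=afterany would keep a chain moving even when a job fails.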

Hope this helps. Thanks.

JEYANTHKRI_N_Intel

Hi,

Could you please let us know whether the issue is resolved now? 
