Please, I need help. I have a problem with job assignment on the command line. This is the error I get:
/var/spool/torque/mom_priv/jobs/566941.v-qsvr-1.aidevcloud.SC: line 2: violenceData03.py: command not found
Thanks for reaching out to us.
We are having some trouble understanding your query. Could you please give more details about your issue, such as the steps you followed, the error logs obtained, and the kind of workload you are trying to run? Kindly attach screenshots, if possible, so that we can help you better.
I have been running my Python script for two days already, with the command line crashing at times with the following errors:
client_loop: send disconnect: Connection reset by peer
client_loop: send disconnect: Broken pipe
Could you please confirm whether you are still facing the issue below?
"/var/spool/torque/mom_priv/jobs/566941.v-qsvr-1.aidevcloud.SC: line 2: violenceData03.py: command not found". If not, please follow the steps below to solve the issue shown in the attached screenshot (cloud.PNG).
Step 1: Create a job file.
Step 2: Add the commands below to the job script, each on its own line:
#PBS -l walltime=24:00:00
echo python violenceData.py
Step 3: Save the file.
Press [Esc] to switch to command mode, then type :wq! and press [Enter] to save the file.
Step 4: Run the job script as:
nohup qsub myjob &
You can view the output of your script in the nohup.out file, which is created while running the job.
Hope this helps.
Since the job script you submitted includes the command "echo python violenceData.py", it will simply print that text to the console, like a print function, instead of running the script. So, in order to train your deep learning model, edit the job script as follows:
#PBS -l walltime=24:00:00
python violenceData.py
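For reference, here is a minimal sketch of creating that job file from the shell instead of an editor. The file name myjob comes from Step 4. The cd line is an assumption: Torque sets PBS_O_WORKDIR to the directory qsub was run from, and the sketch assumes violenceData.py lives there. The original "command not found" error is what the shell reports when a line contains only the script name without python in front of it, so the last line matters.

```shell
# Write the job script to a file named 'myjob' (the name used in Step 4).
cat > myjob <<'EOF'
#PBS -l walltime=24:00:00
# Assumption: the Python script is in the directory where qsub is run.
cd "$PBS_O_WORKDIR"
# Run the script through the interpreter; use your actual script name here.
python violenceData.py
EOF

# Show what was written.
cat myjob
```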
Then, submit the job script as mentioned above (Step 4). Once your job has been submitted, you will get a Job ID, which is the tracking number for your job. You can track the status of the job with:
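On Torque/PBS clusters the standard command for checking job status is qstat, which is presumably the one meant here. A minimal sketch, with a guard so that it only prints a note when run somewhere qstat is not installed:

```shell
# qstat lists your queued and running jobs on a Torque/PBS system.
# Assumption: it is on the PATH of the login node you submit from.
if command -v qstat >/dev/null 2>&1; then
    qstat
else
    echo "qstat not found; run this on the cluster login node"
fi
```

A job that no longer appears in the list has normally finished.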
Once it is complete, the output will be in the files:
cat myjob.oXXXXXX
cat myjob.eXXXXXX
Here ‘XXXXXX’ is the Job ID. The .o file contains the standard output stream, and the .e file contains the error stream.
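As a small sketch of how those file names are formed: qsub prints the full Job ID (566941.v-qsvr-1.aidevcloud is used here only as an example, taken from the error message at the top of this thread), and the numeric part before the first dot is what replaces XXXXXX. The variable names are illustrative, and the job is assumed to have been submitted from a file called myjob.

```shell
# Example Job ID as printed by qsub; replace with your own.
job_id="566941.v-qsvr-1.aidevcloud"

# Keep only the part before the first dot.
job_num="${job_id%%.*}"

out_file="myjob.o${job_num}"
err_file="myjob.e${job_num}"
echo "$out_file $err_file"

# Print each file only if it already exists (the job may still be running).
for f in "$out_file" "$err_file"; do
    if [ -f "$f" ]; then
        cat "$f"
    fi
done
```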
Since you are running the job with the nohup command, it logs everything to an output file, nohup.out, even if you exit the shell or terminal. That way, you can view the status of the DL model training, or any error logs obtained, in the nohup.out file.
Hope this helps.