Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

Basekit vector-add example fails on epilogue

Victor_E_1
Beginner
395 Views
On https://devcloud.intel.com/oneapi/get-started/base-toolkit/

the build and run scripts both give a PBS error:

/var/spool/torque/mom_priv/epilogue.parallel: line 12: /var/spool/torque/mom_priv/epilogue.d//95-nvdir.epilogue: Permission denied

 

7 Replies
RahulV_intel
Moderator

Hi Victor,

We are unable to reproduce the issue. The vector-add sample works fine for us on DevCloud. Did you follow the exact steps given at https://devcloud.intel.com/oneapi/get-started/base-toolkit/ ?

Please find attached a screenshot for reference: vec-add-success.png

Victor_E_1
Beginner

You are running interactively. Please read my question: I am making PBS scripts as requested. Those don't work.

 

%%%%%%%%%

 

Build and run the sample in batch mode

The following describes the process of submitting build and run jobs to PBS.

A job is a script that is submitted to PBS through the qsub utility. By default, the qsub utility does not inherit the current environment variables or your current working directory. For this reason, it is necessary to submit jobs as scripts that handle the setup of the environment variables. In order to address the working directory issue, you can either use absolute paths or pass the -d <dir> option to qsub to set the working directory.
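As a rough illustration of the working-directory point above, the sketch below shows a job script that changes directory itself instead of relying on -d. PBS_O_WORKDIR and PBS_JOBID are variables that Torque sets inside a job; the function names, the fallback to $PWD, and the guard on PBS_JOBID are illustrative assumptions and not part of the walkthrough.

```shell
#!/bin/bash
# Sketch of a build job that handles its own environment and working directory.

# Print the directory the job should run in.
# PBS_O_WORKDIR is the directory qsub was invoked from; the $PWD fallback
# is an assumption so the script can also be tried outside PBS.
job_workdir() {
    printf '%s\n' "${PBS_O_WORKDIR:-$PWD}"
}

# Load the oneAPI environment. The path is the one used in the walkthrough;
# it is skipped if the file is not present on this machine.
setup_env() {
    local setvars=/opt/intel/inteloneapi/setvars.sh
    if [ -f "$setvars" ]; then
        source "$setvars"
    fi
}

run_build() {
    cd "$(job_workdir)" || return 1
    setup_env
    make clean && make all
}

# Only build when running as a PBS job (PBS_JOBID is set by the scheduler).
if [ -n "${PBS_JOBID:-}" ]; then
    run_build
fi
```

With a script like this, a plain `qsub -l nodes=1:gpu:ppn=2 build.sh` from the sample directory should behave like the `-d .` form, under the assumptions above.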

Create the job scripts

Create a build.sh script with the following contents.

#!/bin/bash
source /opt/intel/inteloneapi/setvars.sh
make clean
make all

Create a run.sh script with the following contents for executing the sample.

#!/bin/bash
source /opt/intel/inteloneapi/setvars.sh
make run

Build and run

Jobs submitted in batch mode are placed in a queue, where they wait for the necessary resources (compute nodes) to become available. The jobs are executed on a first-come, first-served basis on the first available node(s) having the requested property or label.

Build the sample on a gpu node.

qsub -l nodes=1:gpu:ppn=2 -d . build.sh


Note: -l nodes=1:gpu:ppn=2 (lower case L) is used to assign one full GPU node to the job.
Note: The -d . is used to configure the current folder as the working directory for the task.

In batch mode, the commands return immediately; however, the job itself may take longer to complete. In order to inspect the job progress, use the qstat utility.

watch -n 1 qstat -n -1


Note: The watch -n 1 command is used to run qstat -n -1 and display its results every second.
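Where watch is not convenient, for example inside another script, the same polling could be sketched roughly as follows. This assumes the default `qstat` column layout (job ID in the first column, state in the fifth, with C meaning completed), which differs from the `qstat -n -1` layout used above; `job_active` and `wait_for_job` are hypothetical helper names.

```shell
#!/bin/bash
# Read default `qstat` output on stdin and succeed while job "$1" is listed
# in any state other than C (completed). The job ID may be given with or
# without the server suffix (for example "94" or "94.someserver").
job_active() {
    awk -v id="$1" '
        ($1 == id || index($1, id ".") == 1) && $5 != "C" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Poll qstat until the job is no longer active.
# Usage: wait_for_job <job-id> [seconds-between-polls]
wait_for_job() {
    local job_id=$1 interval=${2:-5}
    while qstat 2>/dev/null | job_active "$job_id"; do
        sleep "$interval"
    done
}
```

A call such as `wait_for_job 94` would then return once job 94 has left the queue or is shown as completed.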

Run the sample on a gpu node after the build job completes successfully.

qsub -l nodes=1:gpu:ppn=2 -d . run.sh

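Instead of waiting by hand for the build job before submitting the run job, the two submissions could be chained with a Torque job dependency. The following is only a sketch: it assumes that qsub prints the new job's ID on standard output and that `-W depend=afterok:<jobid>` is enabled on the cluster; `submit_build_then_run` is a hypothetical helper name.

```shell
#!/bin/bash
# Submit build.sh, then submit run.sh so that it starts only after the
# build job has finished successfully. Prints the run job's ID.
submit_build_then_run() {
    local resources="nodes=1:gpu:ppn=2"
    local build_id

    # qsub prints the ID of the submitted job; capture it for the dependency.
    build_id=$(qsub -l "$resources" -d . build.sh) || return 1

    # afterok: run.sh becomes eligible only if the build job exits without error.
    qsub -l "$resources" -d . -W depend=afterok:"$build_id" run.sh
}
```

Calling `submit_build_then_run` from the sample directory would queue both jobs at once; progress can still be checked with qstat.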

The best way to determine whether a job completed or not is by using the qstat utility. When a job terminates, a couple of files are written to the disk:

RahulV_intel
Moderator

Hi,

We are able to build and run the vector-add sample in non-interactive (PBS batch) mode. Please find attached screenshots for reference (zip file). We suggest that you try running the sample in interactive mode and let us know whether it works. Also, make sure that you are following the exact steps from this link: https://devcloud.intel.com/oneapi/get-started/base-toolkit/#cpu-gpu-vector-add-sample-walkthrough.

GouthamK_Intel
Moderator

Hi Victor,

Could you please let us know whether you were able to run the sample code using the steps Rahul suggested?

Please let us know whether your issue is resolved or not.

 

Regards

Goutham

Victor_E_1
Beginner

Batch jobs still give an error:

 

```

u37709@login-2:~/sycltrain/9_sycl_of_hell$ pwd
/home/u37709/sycltrain/9_sycl_of_hell

u37709@login-2:~/sycltrain/9_sycl_of_hell$ cat build.sh
#!/bin/bash
source /opt/intel/inteloneapi/setvars.sh
make
u37709@login-2:~/sycltrain/9_sycl_of_hell$ cat *e*94
/var/spool/torque/mom_priv/epilogue.parallel: line 12: /var/spool/torque/mom_priv/epilogue.d//95-nvdir.epilogue: Permission denied

```
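The path in the message above points at a Torque epilogue script, which runs after the job's own commands, so the error file may contain nothing from build.sh itself. One rough way to check is to filter the epilogue lines out of the error file. This is only a sketch; `job_stderr_only` is a hypothetical helper name, and the prefix it filters on is taken from the message above.

```shell
#!/bin/bash
# Print the contents of one or more job error files, leaving out lines
# written by the Torque epilogue scripts under mom_priv.
job_stderr_only() {
    grep -v '^/var/spool/torque/mom_priv/epilogue' "$@"
}
```

For example, `job_stderr_only *e*94` (the same glob as above) printing nothing would suggest that the build commands themselves wrote no errors.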

 

(how does this markup system work? is there a way to mark a block of text as literal/monospace/no-html?)

 

RahulV_intel
Moderator

Hi Victor,

The epilogue error, which had been occurring on a few of the nodes, has now been fixed. Could you try the sample again and check whether it works? Let us know if you face any issues.

 

-Rahul

 

 

RahulV_intel
Moderator

Hi,

We are closing this thread, assuming that your issue is resolved. Feel free to post a new question if your issue still persists.

 

--Rahul
