Beginner

Trouble selecting GPU node on DevCloud

Hi, I have written a very simple DPC++ program. With the default device selection (sycl::queue myQueue;) I am given the Intel FPGA device and the code works fine. However, if I use sycl::queue myQueue( sycl::gpu_selector{} ); and submit with ./q run.sh, where q is a script that calls qsub -l nodes=1:gpu:ppn=2 -d ., I get the following:

u40402@s001-n049:~/cppVecAdd$ ./q run.sh
Submitting job:
589866.v-qsvr-1.aidevcloud
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
589795.v-qsvr-1            ...ub-singleuser u40402          00:00:44 R jupyterhub     
589866.v-qsvr-1            run.sh           u40402                 0 Q batch          
Waiting for Output.......
########################################################################
#      Date:           Tue May  5 00:07:23 PDT 2020
#    Job ID:           589866.v-qsvr-1.aidevcloud
#      User:           u40402
# Resources:           neednodes=1:gpu:ppn=2,nodes=1:gpu:ppn=2,walltime=06:00:00
########################################################################

:: setvars has already been run. Skipping any further invocation.  To force its re-execution, pass --force
########## Executing the run
########## Done with the run

########################################################################
# End of output for job 589866.v-qsvr-1.aidevcloud
# Date: Tue May  5 00:07:27 PDT 2020
########################################################################

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  No device of requested type available. Please check https://software.intel.com/en-us/articles/intel-oneapi-dpcpp-compiler-system-requirements-beta -1 (CL_DEVICE_NOT_FOUND)
/var/spool/torque/mom_priv/jobs/589866.v-qsvr-1.aidevcloud.SC: line 5: 13969 Aborted                 ./bin/cppVecAdd
 

How do I choose a GPU device instead of an FPGA device? Thanks

9 Replies
Moderator

Hi,

Thanks for reaching out to us.

Note that s001-n049 is not a GPU node.

You can request a GPU node on the DevCloud with the following command:

qsub -I -l nodes=1:gpu:ppn=2 -d .

Please try this and let us know if you have any further issues.

 

Beginner

Hi

Thank you. OK, I re-ran my executable with the extra -I parameter:

    qsub -I -l nodes=1:gpu:ppn=2 -d .

first with this in the code:

    sycl::queue myQueue( sycl::gpu_selector{} );

and then with:

    sycl::queue myQueue; // default = FPGA

Both times I ran the executable with

    ./q run.sh

and both runs timed out after 40 seconds.
If I do not use the q script (which now uses 'qsub -I -l nodes=1:gpu:ppn=2 -d . $script'), the SYCL code works, but on the FPGA.
The extra parameter has not worked here.
Is there anything else to try?
Thanks

Beginner

Hi,

I also tried the following. I typed this on the command line:

    qsub -I -l nodes=1:gpu:ppn=2 -d . run.sh

The terminal responded with

    qsub: waiting for job 590100.v-qsvr-1.aidevcloud to start

and it does not complete; it appears to hang.

Checking again without qsub, running just ./run.sh, the program runs and prints its output to the command line.

 

 

Beginner

Hi, I am assuming this is a bug in DevCloud. Intel support has gone quiet. I am trying to find instructions for qsub, but 'man qsub' does not work. Is there any information on how to use qsub?

Beginner

I tried again: I went into the q script and edited the qsub line, replacing -1/l with -l (lowercase L), and retyped nodes=1:gpu:ppn=2 -d . $script.

It worked!?! I do not quite understand whether I actually changed anything.

########################################################################
#      Date:           Thu May  7 05:07:25 PDT 2020
#    Job ID:           591911.v-qsvr-1.aidevcloud
#      User:           u40403
# Resources:           neednodes=1:gpu:ppn=2,nodes=1:gpu:ppn=2,walltime=06:00:00
########################################################################

:: setvars has already been run. Skipping any further invocation.  To force its re-execution, pass --force
########## Executing the run
Device name         : Intel(R) Gen9 HD Graphics NEO
Vendor name         : Intel(R) Corporation
Driver version      : 20.12.16259
Address bits        : 64
Max work-group size : 256
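
Since the fix above came down to retyping the -l flag, a small check for look-alike characters may help catch the same slip. This is only a sketch: check_qsub_flags is a hypothetical helper name, not part of Torque or DevCloud, and it looks only for a literal -1 (digit one) or -I (capital i) argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque or DevCloud): scan a qsub
# command line for flags that are easy to confuse with -l (lowercase L).
check_qsub_flags() {
    rc=0
    for arg in "$@"; do
        case "$arg" in
            -1) echo "warning: '-1' (digit one) found; did you mean '-l' (lowercase L)?"
                rc=1 ;;
            -I) echo "note: '-I' (capital i) requests an interactive session" ;;
        esac
    done
    return "$rc"
}

# The first command line passes; the second is flagged.
check_qsub_flags qsub -l nodes=1:gpu:ppn=2 -d . run.sh && echo "ok"
check_qsub_flags qsub -1 nodes=1:gpu:ppn=2 -d . run.sh || echo "needs fixing"
```

The check reports -I as a note rather than an error, because -I is a valid flag; it is flagged only so that it is not mistaken for -l.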

Moderator

Hi,

Glad to know that your issue is resolved.

qsub -I -l nodes=1:gpu:ppn=2 -d .

This command requests an interactive session on a GPU node on the DevCloud.

qsub -l nodes=1:gpu:ppn=2 -d . run.sh

This command runs a shell script on a GPU node from the login node.

Sometimes the job may show as waiting because no GPU node is available. Retrying after some time should resolve this.
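
The two forms above can be wrapped in a small helper so that the batch form is used whenever a script is given. This is a rough sketch, not the q script from this thread (which was never posted): build_qsub_cmd and the SUBMIT variable are hypothetical names, and by default the helper only prints the command instead of submitting it.

```shell
#!/bin/sh
# Sketch of a submission helper. build_qsub_cmd and SUBMIT are
# hypothetical names; the resource string is the one quoted in this thread.
GPU_RESOURCES="nodes=1:gpu:ppn=2"

# Print the qsub command: batch form when a script is given,
# interactive form (-I) when no script is given.
build_qsub_cmd() {
    if [ -n "${1:-}" ]; then
        echo "qsub -l $GPU_RESOURCES -d . $1"
    else
        echo "qsub -I -l $GPU_RESOURCES -d ."
    fi
}

cmd=$(build_qsub_cmd "${1:-}")
echo "would run: $cmd"

# Submit only when explicitly requested and qsub is on the PATH
# (for example on a DevCloud login node). Script names containing
# spaces are not handled by this simple word-split invocation.
if [ "${SUBMIT:-}" = "1" ] && command -v qsub >/dev/null 2>&1; then
    $cmd
fi
```

Saved under a name of your choice, running it with run.sh as the argument prints the batch command, and setting SUBMIT=1 submits it.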

Since your issue is resolved, can we close this thread?

 

Thanks

 

 

 

Beginner

OK, after using it on and off throughout yesterday, this is still problematic. More often than not I still get timeouts using the same qsub command and parameters you gave. It does not work 100% of the time; it is a gamble. Is there a way to guarantee that this works every time?

Moderator

Hi,

When the qsub command fails, please check whether any GPU node is free, using the pbsnodes command.

The list of GPU nodes can be obtained with the following command:

pbsnodes | grep -B4 "gen9"

You can request a specific free node with the following command:

qsub -I -l nodes=s001-nxxx:gpu:ppn=2 -d .

Note: replace nxxx with the number of an available free node.

For example:

qsub -I -l nodes=s001-n233:gpu:ppn=2 -d .
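
To pick out only the free GPU nodes rather than reading through the grep output by eye, the pbsnodes listing can be filtered with awk. This is a sketch under assumptions: it assumes each record starts with an unindented node name followed by indented "key = value" lines, with the state line before the properties line (consistent with the -B4 in the command above). free_gpu_nodes is a hypothetical function name, and the sample text is made up for illustration.

```shell
#!/bin/sh
# List nodes whose state is "free" and whose properties mention "gen9".
# Reads pbsnodes-style text on stdin. The record layout is an assumption.
free_gpu_nodes() {
    awk '
        /^[^ \t]/ { node = $1; state = "" }          # unindented line starts a new node record
        $1 == "state" && $2 == "=" { state = $3 }    # remember the state of the current node
        $1 == "properties" && /gen9/ && state == "free" { print node }
    '
}

# Made-up sample input for illustration only.
sample='s001-n233
     state = free
     power_state = Running
     np = 2
     properties = core,gpu,gen9
s001-n234
     state = job-exclusive
     power_state = Running
     np = 2
     properties = core,gpu,gen9
s001-n049
     state = free
     power_state = Running
     np = 2
     properties = core,fpga'

printf '%s\n' "$sample" | free_gpu_nodes    # prints: s001-n233
```

On the DevCloud the real listing would be piped in with pbsnodes | free_gpu_nodes, and a node name from the output substituted into the nodes= part of the qsub command above. A node that is free when listed can still be taken before the job starts, so this reduces the waiting but does not guarantee a slot.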

Hope this helps.

 

Thanks

 

 

Moderator

Hi,

We are closing this case assuming the solution provided was helpful. Please feel free to open a new thread in case of further queries.

 

Thanks

 
