Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

How can I check whether the GPU is used during training, and what the GPU usage rate is?

ywang273
Beginner

Hello, when I train a model on the COCO dataset with more than 10,000 images, it shows that I need to train for more than 100 hours. How can I check whether the training is running on the CPU or the GPU, and how can I find out the GPU usage rate? At the same time, I use "#PBS -l walltime=24:00:00" in the run.sh file, but I still can't change the walltime. What should I do? I look forward to your reply. Thank you!

Srilekha_P_Intel
Employee

Hi,

Thank you for reaching out to us.

We don't have discrete GPU nodes on DevCloud, but we do have iGPU nodes. To request one of those nodes, add the command below to the job script that you are submitting:

#PBS -l nodes=1:gpu

We are sorry to inform you that 24 hours is the maximum walltime possible on DevCloud.
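
For reference, a minimal run.sh could look like the sketch below; this is only an illustrative outline, and train.py is a placeholder for your actual training command:

    #!/bin/bash
    #PBS -l nodes=1:gpu              # request one iGPU node
    #PBS -l walltime=24:00:00        # 24:00:00 is the maximum allowed on DevCloud
    cd $PBS_O_WORKDIR                # run from the directory the job was submitted from
    python train.py                  # placeholder for your training command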

However, you can try optimizations on the CPU itself to get improved performance.

Please follow the URLs below for more details on optimizing TensorFlow workloads on the CPU:

https://software.intel.com/en-us/articles/maximize-tensorflow-performance-on-cpu-considerations-and-recommendations-for-inference

https://software.intel.com/en-us/articles/tips-to-improve-performance-for-popular-deep-learning-frameworks-on-multi-core-cpus

  • To submit a job on DevCloud:

               qsub <job_script>.job

  • Once the job is submitted, you can track it using the command below:

             qstat

  • To read the output and error streams of a running job, you can use the qpeek command as shown below:

           qpeek -o <job_id>

           qpeek -e <job_id>

Please note that the output and error files will be created once the execution is completed. A combined example of this workflow is given below.
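
For example, putting these commands together, a typical submit-and-monitor session might look like the following sketch (run.sh is the job script from your question, and <job_id> stands for the ID that qsub prints):

    qsub run.sh          # submit the job script; qsub prints the job ID
    qstat                # check whether the job is queued (Q) or running (R)
    qpeek -o <job_id>    # peek at the standard output stream while the job runs
    qpeek -e <job_id>    # peek at the standard error stream while the job runs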

Hope this clarifies your query. Please feel free to reach out to us if you have any further queries. Thank you.

ywang273
Beginner

Okay, thank you. I also have another question: when I use #PBS -l nodes=1:gpu to compute on the iGPU node, why does the speed feel the same as running directly in the Jupyter notebook without requesting a CPU/GPU node? Also, can I speed up the computation by changing the number of nodes? And can I learn more about how fast the iGPU is? Thank you for your reply!

Srilekha_P_Intel
Employee

Hi,

#PBS -l nodes=1:gpu requests an iGPU node, hence your code will be running on the iGPU.

You can try to optimize your code and increase the speedup by tweaking the OMP/KMP parameters to improve performance:

export OMP_NUM_THREADS="6"

export KMP_BLOCKTIME="0"

export KMP_SETTINGS="1"

export KMP_AFFINITY="granularity=fine,verbose,compact,1,0"
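
If it helps, these settings can go in the job script right before the training command, for example (train.py is again just a placeholder):

    # inside run.sh, before launching training
    export OMP_NUM_THREADS="6"                                  # number of OpenMP threads
    export KMP_BLOCKTIME="0"                                    # ms a thread waits after a parallel region before sleeping; 0 = sleep immediately
    export KMP_SETTINGS="1"                                     # print the OpenMP runtime settings at startup
    export KMP_AFFINITY="granularity=fine,verbose,compact,1,0"  # pin threads to cores, packed close together
    python train.py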

Refer:

https://software.intel.com/en-us/articles/tips-to-improve-performance-for-popular-deep-learning-frameworks-on-multi-core-cpus

Increasing the number of nodes may or may not increase the speed, depending on your application. You could try running in a distributed way and see if that works.

ChithraJ_Intel
Moderator

Hi,

Could you please confirm whether the solution provided helped you?

ywang273
Beginner

After the improvements, the speed is a little faster, but the training timeout problem still hasn't been solved. I will improve the code further. Thank you. The topic can be closed.

ChithraJ_Intel
Moderator

Hi,

Thanks for the confirmation. We are closing the case.

Please raise a new thread if you have further issues.
