Could I get some help with how to analyze the performance of my code? I know I can make reports for FPGAs, but what do I do for GPUs to analyze my code?
Is there a better way to get performance output other than reports?
Please try this and let us know whether it helps: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance/code-profiling-scenarios/gpu-application-analysis.html
Thanks for reaching out to us.
Here are some more links that you could refer to:
Thank you so much guys for the quick reply! VTune seems like a very helpful tool! Unfortunately, I am unable to run it.
I am programming on DevCloud. I went to install-dir/vtune/version/env/vars.sh and ran the command source vars.sh. After this, I expected the vtune-gui command to run, but it does not. I get the following error:
vtune-gui: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
Could you help me figure this out?
EDIT: I get this error when I try to run the matrix multiply application. What can I do here?
vtune: Warning: Analysis result will not show the detailed GPU Utilization. See GPU Utilization help topic for more details.
Events for detailed GPU utilization analysis can be collected for Intel GPUs only.
vtune: Error: Cannot collect GPU hardware metrics due to a lack of permissions. Use root privileges (recommended) or re-configure your current permissions to make sure you are a member of the video user group and /proc/sys/dev/i915/perf_stream_paranoid value is set to 0.
vtune: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel module symbols.
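The messages above name three things: the perf_stream_paranoid value, the kptr_restrict value, and membership in the video group. A minimal read-only sketch for inspecting them on the current node might look like the following. The helper names read_setting and in_group are made up for illustration, and the script only reports values; changing them normally requires root, which a shared cluster may not grant.

```shell
#!/usr/bin/env bash
# Report the settings mentioned in the VTune messages above.
# Read-only: nothing is modified and no root privileges are needed.

# Print the contents of the file in $1, or "unavailable" if it cannot be read.
read_setting() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unavailable"
    fi
}

# Succeed if the space-separated group list in $1 contains the group in $2.
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

echo "perf_stream_paranoid: $(read_setting /proc/sys/dev/i915/perf_stream_paranoid)"
echo "kptr_restrict: $(read_setting /proc/sys/kernel/kptr_restrict)"

if in_group "$(id -nG)" video; then
    echo "video group: member"
else
    echo "video group: not a member"
fi
```

If perf_stream_paranoid shows as "unavailable", the i915 driver is probably not loaded on that node, which would suggest the node has no Intel GPU.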
To collect GPU hotspots, please use the syntax below:
vtune -collect gpu-hotspots [-knob <knobName=knobValue>] -- <target> [target_options]
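As a rough illustration of that syntax, the sketch below collects GPU hotspots into a named result directory and then prints a text summary in the terminal, which may be a convenient alternative when no GUI is available. TARGET and RESULT_DIR are placeholder names chosen for this example, and the run helper is hypothetical: it echoes each command and only executes it when vtune is actually on the PATH.

```shell
#!/usr/bin/env bash
# Placeholders for illustration; adjust to your own binary and output location.
TARGET="${TARGET:-./matrix.dpcpp}"
RESULT_DIR="${RESULT_DIR:-gpu_hotspots_result}"

# Echo the command, then execute it only if vtune is available on this machine.
run() {
    echo "+ $*"
    if command -v vtune >/dev/null 2>&1; then
        "$@"
    fi
}

# Step 1: collect GPU hotspots for the target into RESULT_DIR.
run vtune -collect gpu-hotspots -result-dir "$RESULT_DIR" -- "$TARGET"

# Step 2: print a text summary of the collected result.
run vtune -report summary -result-dir "$RESULT_DIR"
```

Keeping the result directory explicit makes it easier to find the data later, for example to open it from another machine.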
Please refer to the links below for more information.
If you face any further issues, please let us know.
Regarding your first query, the Intel(R) DevCloud is a cluster with no windowing system, so by design GUI applications won't work.
You could use VTune as a web server in addition to the classic desktop usage model.
Please see the link below for more information.
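A minimal sketch of starting the web server from a terminal could look like this. It assumes the vtune-backend launcher that ships with recent VTune releases and its --web-port and --data-directory options; please confirm both against the documentation for your installed version. The port 8080 and the vtune_results directory are arbitrary placeholders, not DevCloud defaults, and start_backend is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Placeholders for illustration only.
WEB_PORT="${WEB_PORT:-8080}"
DATA_DIR="${DATA_DIR:-$HOME/vtune_results}"

# Echo the server command, then start it only if vtune-backend is installed.
# $1 is the port to listen on, $2 is the directory holding collected results.
start_backend() {
    echo "+ vtune-backend --web-port=$1 --data-directory=$2"
    if command -v vtune-backend >/dev/null 2>&1; then
        vtune-backend --web-port="$1" --data-directory="$2"
    fi
}

start_backend "$WEB_PORT" "$DATA_DIR"
```

Once the server is running, it prints a URL to open in a browser. Reaching it from outside the cluster typically needs some form of port forwarding, which depends on how you connect to DevCloud.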
Hope this helps.
That was very helpful! Thank you so much!
EDIT: I went over the links; they have the same steps for the matrix multiply code. They want me to run the command: vtune -collect gpu-hotspots -- ./matrix.dpcpp. However, when I run it, I get this error: vtune: Error: Cannot collect GPU hardware metrics due to a lack of permissions. Is there something I can do about this?
Make sure that you are logged in to a node with a GPU on DevCloud.
You can request a node with a GPU on DevCloud using the command below:
qsub -I -l nodes=1:gpu:ppn=2 -d .
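Once the interactive session starts, a quick way to confirm that the node really exposes a GPU is to look for a DRM render node under /dev/dri. This is only a sketch: has_render_node is a hypothetical helper, and a render node shows that some GPU driver is loaded, not necessarily that VTune has permission to read its hardware metrics.

```shell
#!/usr/bin/env bash
# Succeed if the directory in $1 contains at least one renderD* device node.
has_render_node() {
    for node in "$1"/renderD*; do
        # When the glob matches nothing it stays literal, so test for existence.
        if [ -e "$node" ]; then
            return 0
        fi
    done
    return 1
}

if has_render_node /dev/dri; then
    echo "GPU render node found on $(uname -n)"
else
    echo "no GPU render node on $(uname -n); request a GPU node with qsub first"
fi
```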
Hope this helps. If you face any issues, please let us know.