I have two questions about what I see in the profile log.
I'm using the DPC++ compiler from Intel oneAPI.
OS: Windows 10 Home (64bit)
CPU: Intel Core i7-1065G7 1.3GHz
GPU: Intel Iris Plus Graphics
I have installed the Intel oneAPI Base Toolkit beta Update 8.
I ran the following command.
test_matrix.exe is my executable, built with the DPC++ compiler.
advixe-cl --collect=roofline --profile-gpu --project-dir=C:\Test\Release --search-dir src:r=C:\Test\src -- C:\Test\Release\test_matrix.exe
The following warning is displayed during the survey analysis.
advixe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: advixe-cl -r C:\Test\Release\e000\hs008 -command stop.
advixe: Warning: [Instrumentation Engine]: GTPin: GTPin didn't find any kernels... Exiting without doing anything.
advixe: Collection stopped.
・What is the cause of this warning?
・In my executable, the matrix operation runs on the GPU using DPC++ parallel processing.
Is it not being profiled correctly?
The GFLOPS value shown in the log is 0
for both the survey analysis and the trip counts analysis.
Output log example:
Elapsed Time: 5.23s
Total CPU time: 3.83101
Time in 1 vectorized loop: 0.298428
GFLOPS: 0
・Does a GFLOPS value of 0 mean the GPU kernel is not being profiled correctly?
Is there a way to verify that the result is correct?
Regarding your first question: this warning is expected during the first 'survey' step of the collection and can be ignored there.
For the second one: what is the size of the multiplied matrices? Please note that the kernel has to run for at least 10 ms (longer is better).
Elapsed Time: 14.74s
Total CPU time: 12.5268
Time in 2 vectorized loops: 12.02
GFLOPS: 0
Can you share your source code?
Advisor needs the kernel to run for at least 10 ms so that at least a couple of time-sampling hits land inside the kernel. In other words, a longer-running kernel gives more reliable results.
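As a rough way to check the 10 ms guideline before collecting, the kernel can be timed by hand and the matrix size increased until it runs long enough. The sketch below is plain C++ and makes several assumptions: `time_ms`, `long_enough_for_sampling`, `matmul_gflops`, and `matmul` are hypothetical helper names, `matmul` is only a CPU stand-in for the GPU kernel, and in a DPC++ program the callable passed to `time_ms` would be expected to wrap the queue submission followed by a wait. The GFLOPS estimate assumes a naive n x n multiplication with 2·n³ floating-point operations, so it is only a ballpark figure to compare against what the tool reports.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Minimum kernel duration mentioned in this thread (10 ms).
constexpr double kMinKernelMs = 10.0;

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// Assumption: for a GPU kernel, `work` submits the kernel and waits for it.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// True if the measured time meets the minimum duration.
bool long_enough_for_sampling(double elapsed_ms, double min_ms = kMinKernelMs) {
    return elapsed_ms >= min_ms;
}

// Hand estimate of GFLOPS for a naive n x n matrix multiplication:
// 2 * n^3 operations divided by the elapsed time in seconds, scaled by 1e9.
double matmul_gflops(std::size_t n, double elapsed_ms) {
    const double flops = 2.0 * static_cast<double>(n) * n * n;
    return flops / (elapsed_ms * 1.0e6);
}

// Hypothetical CPU stand-in for the GPU kernel: naive row-major matmul.
std::vector<float> matmul(const std::vector<float>& a,
                          const std::vector<float>& b,
                          std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}
```

A typical use would be to call `time_ms` around one multiplication, pass the result to `long_enough_for_sampling`, and, if it returns false, increase the matrix size before running the roofline collection again.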
I attach the source code zip file. (TestCodeDCP_IntelAdvisor.zip)
Development: Microsoft Visual Studio Professional 2019 Version 16.5.5
I want to measure the performance of the following GPU parallel processing part.