Measuring time inside a kernel on an Intel GPU (P630)

PcDack1
New Contributor I

I have a fair amount of experience with GPU programming in CUDA. I used to call the clock function inside CUDA kernels (as mentioned here) to measure the ticks taken by certain operations inside the kernel. I wrote a simple oneAPI kernel that does the same thing and tried to run it on the Intel integrated GPU, but it fails with the error shown below.

__kernel void testVecAdd(__global const int *a, __global const int *b,
                         __global int *c, __global float *t) {
    clock_t start = clock();

    int gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];

    t[gid] = (float)(clock() - start) / CLOCKS_PER_SEC;
}

error info:

Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)
Exception caught at file:latency.cpp, line:500

How can I fix it? Thanks.

NoorjahanSk_Intel
Moderator

Hi,


Thanks for reaching out to us.


You can measure the actual kernel execution time on the device using the DPC++ built-in profiling API.


Please refer to the below link for more details:


https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top/kernels/kernel-launch.html


If you face any issues, please get back to us with a sample reproducer.


Thanks & Regards,

Noorjahan.


PcDack1
New Contributor I

Hi Noorjahan,

 

Thanks for your answer. I would like to use in-kernel timing to measure memory latency, so a profiler that only reports whole-kernel time does not seem suitable. Is there a tool in oneAPI for measuring memory latency?

 

Thanks & Regards,

Dack.

NoorjahanSk_Intel
Moderator

Hi,


You can try using the Intel® Memory Latency Checker v3.9a tool to get memory latency.


Please refer to the link below for more details about the MLC tool.

https://www.intel.com/content/www/us/en/developer/articles/tool/intelr-memory-latency-checker.html


You can also use the VTune Profiler, which is included in the oneAPI Base Toolkit, to get memory latency.


Please refer to the below link for more details:

https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/memory-bound/dram-bound/memory-latency.html


Please let us know whether this tool matches your use case.


Thanks & Regards,

Noorjahan.


PcDack1
New Contributor I

Hi Noorjahan,

 

Thank you for your answer. First, I think the Intel® Memory Latency Checker v3.9a tool only works on the CPU, and I want to test the GPU's memory latency. Second, the VTune Profiler can only report L3 misses for the GPU; I can't get L1 and L2 misses. Is there a tool that lets me get L1 and L2 misses for Intel GPUs (HD series)?

 

Thanks & Regards,

Dack.

NoorjahanSk_Intel
Moderator

Hi,


We are checking with the concerned team and we will get back to you soon.


Thanks & Regards,

Noorjahan.


NoorjahanSk_Intel
Moderator

Hi,


>> Is there a tool that lets me get L1 and L2 misses for Intel GPUs (HD series)?


In the Gen9 compute architecture, the L1 and L2 caches belong to the sampler, so L1 and L2 cache misses are reported for sampler/image accesses, while buffer accesses are covered by L3 cache misses.

You can get the L1 and L2 misses using the VTune Profiler if you use SYCL images/samplers in your application.


Please refer to the below link for more details:

https://www.intel.com/content/dam/develop/external/us/en/documents/the-compute-architecture-of-intel-processor-graphics-gen9-v1d0.pdf#page=10


Thanks & Regards,

Noorjahan.



NoorjahanSk_Intel
Moderator

Hi,


We haven't heard back from you. Could you please provide an update on your issue?


Thanks & Regards,

Noorjahan


NoorjahanSk_Intel
Moderator

Hi,


Thanks for accepting our solution.


As this issue has been resolved, we will no longer respond to this thread. If you need any additional information, please submit a new question.



Thanks & Regards,

Noorjahan.

