
Measuring time inside a kernel on an Intel GPU (P630)

PcDack1
New Contributor I

I have a fair amount of experience with GPU programming in CUDA. I used to call the clock() function inside CUDA kernels (as mentioned here) to measure the ticks taken by certain operations inside the kernel. I wrote a simple oneAPI kernel that does the same and tried to run it on the Intel integrated GPU, but it fails with an error.

__kernel void testVecAdd(__global const int *a, __global const int *b,
                         __global int *c, __global float *t) {
    clock_t start = clock();

    int gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];

    t[gid] = (float)(clock() - start) / CLOCKS_PER_SEC;
}

Error info:

Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)
Exception caught at file:latency.cpp, line:500

How can I fix it? Thanks.

8 Replies
NoorjahanSk_Intel
Moderator

Hi,


Thanks for reaching out to us.


You can measure the actual kernel execution time on the device using the DPC++ built-in profiling API.


Please refer to the below link for more details:


https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top/kernels/kernel-launch.html


Please get back to us if you face any issues with a sample reproducer.


Thanks & Regards,

Noorjahan.


PcDack1
New Contributor I

Hi Noorjahan,

 

Thanks for your answer. I would like to use this method (timing inside the kernel) to measure memory latency, so a profiler tool does not seem well suited to it. Is there a tool in oneAPI for measuring memory latency?

 

Thanks & Regards,

Dack.

NoorjahanSk_Intel
Moderator

Hi,


You can try using the Intel® Memory Latency Checker v3.9a tool to get memory latency.


Please refer to the link below for more details about the MLC tool.

https://www.intel.com/content/www/us/en/developer/articles/tool/intelr-memory-latency-checker.html


You can also use the VTune profiler, which is included in the oneAPI Base Toolkit, to get memory latency.


Please refer to the below link for more details:

https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/memory-bound/dram-bound/memory-latency.html


Please let us know whether this tool matches your use case.


Thanks & Regards,

Noorjahan.


PcDack1
New Contributor I

Hi Noorjahan,

 

Thank you for your answer. First, I think the Intel® Memory Latency Checker v3.9a tool only works on the CPU, whereas I want to measure the GPU's memory latency. Second, the VTune profiler only reports L3 misses for the GPU; I can't get L1 and L2 misses. Is there a tool that reports L1 and L2 misses for Intel HD series GPUs?

 

Thanks & Regards,

Dack.

NoorjahanSk_Intel
Moderator

Hi,


We are checking with the concerned team and we will get back to you soon.


Thanks & Regards,

Noorjahan.


NoorjahanSk_Intel
Moderator

Hi,


>> Is there a tool that reports L1 and L2 misses for Intel HD series GPUs?


In the Gen9 compute architecture, image accesses go through the sampler, which has its own L1 and L2 caches, so L1 and L2 cache misses are reported for samplers/images, while buffer accesses are reported as L3 cache misses.

You can get the L1 and L2 misses using the VTune profiler if you use SYCL images/samplers in your application.


Please refer to the link below for more details:

https://www.intel.com/content/dam/develop/external/us/en/documents/the-compute-architecture-of-intel-processor-graphics-gen9-v1d0.pdf#page=10


Thanks & Regards,

Noorjahan.



NoorjahanSk_Intel
Moderator

Hi,


We haven't heard back from you. Could you please provide an update on your issue?


Thanks & Regards,

Noorjahan


NoorjahanSk_Intel
Moderator

Hi,


Thanks for accepting our solution.


As this issue has been resolved, we will no longer respond to this thread. If you need any additional information, please submit a new question.



Thanks & Regards,

Noorjahan.

