Solved: Measuring time inside a kernel on an Intel GPU (P630)

PcDack1 · ‎05-16-2022

I have a fair amount of experience with GPU programming in CUDA, where I used the clock() function inside a kernel to measure how many ticks certain operations take. I wrote a simple oneAPI kernel that does the same thing and tried to run it on the Intel integrated GPU, but it fails with the error shown below.

__kernel void testVecAdd(__global const int *a, __global const int *b,
                         __global int *c, __global float *t) {
    clock_t start = clock();

    int gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];

    t[gid] = (float)(clock() - start) / CLOCKS_PER_SEC;
}

error info:

Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)Exception caught at file:latency.cpp, line:500

How can I fix it? Thanks.

NoorjahanSk_Intel · ‎05-18-2022

Hi,

Thanks for reaching out to us.

You can measure the actual kernel execution time on the device using the DPC++ built-in profiling API.

Please refer to the below link for more details:

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top/kernels/kernel-launch.html

Please get back to us if you face any issues with a sample reproducer.

Thanks & Regards,

Noorjahan.

PcDack1 · ‎05-18-2022

Hi Noorjahan,

Thanks for your answer. I would like to use in-kernel timing to measure memory latency, so a profiling tool does not seem suitable for this. Is there a tool in oneAPI for measuring memory latency?

Thanks & Regards,

Dack.

NoorjahanSk_Intel · ‎05-24-2022

Hi,

You can try using the Intel® Memory Latency Checker v3.9a tool to get memory latency.

Please refer to the link below for more details about the MLC tool.

https://www.intel.com/content/www/us/en/developer/articles/tool/intelr-memory-latency-checker.html

You can also use the VTune profiler, which is included in the oneAPI Base Toolkit, to get memory latency.

Please refer to the below link for more details:

https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/memory-bound/dram-bound/memory-latency.html

Please let us know whether this tool matches your use case.

Thanks & Regards,

Noorjahan.

PcDack1 · ‎05-25-2022

Hi Noorjahan,

Thank you for your answer. First, I think the Intel® Memory Latency Checker v3.9a tool only works on the CPU, whereas I want to measure the GPU's memory latency. Second, the VTune profiler can only report L3 misses for the GPU; I can't get L1 and L2 misses. Is there a tool that reports L1 and L2 misses for these GPUs (HD series)?

Thanks & Regards,

Dack.

NoorjahanSk_Intel · ‎06-06-2022

Hi,

We are checking with the concerned team and we will get back to you soon.

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel · ‎06-17-2022

Hi,

>> Does the tool exist for me to get GPUs(HD series) L1 and L2 Miss?

In the Gen9 compute architecture, the L1 and L2 caches belong to the sampler (image) path, so L1 and L2 miss metrics apply to sampler/image accesses, while buffer accesses go through the L3 cache and are covered by the L3 miss metric.

You can get the L1 and L2 misses using the VTune profiler if you use SYCL images/samplers in your application.

Please refer to the below link for more details:

https://www.intel.com/content/dam/develop/external/us/en/documents/the-compute-architecture-of-intel-processor-graphics-gen9-v1d0.pdf#page=10

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel · ‎06-27-2022

Hi,

We haven't heard back from you. Could you please provide an update on your issue?

Thanks & Regards,

Noorjahan

NoorjahanSk_Intel · ‎06-27-2022

Hi,

Thanks for accepting our solution.

As this issue has been resolved, we will no longer respond to this thread. If you need any additional information, please submit a new question.

Thanks & Regards,

Noorjahan.