OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.
Announcements
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be ask in the FPGA Intel® High Level Design forum.

Modification of gen assembly and querying the time stamp register

Dutta__Sankha
Beginner
2,940 Views

Hello

I am using OpenCL on Gen9.5 architecture and I am using i7-7700k processor. I want to have something similar to clock function inside my kernel. What I want to do is something similar to

__kernel void TimeMeasurementKernel(){
  //get current clock value
  t0 = clock();
	//someoperations
  t1 = clock();
  time_elapsed_to_execute_operations = t1 - t0;
}

But it seems that intel doesn't provide any means to get the clock value from inside the OpenCL kernel. But there is a timestamp architecture register tm0 as mentioned in here. My goal is to query this timestamp register from inside the kernel. So one of the way that I thought of, if possible, then to generate the .gen assembly file first by using ioc64 offline compiler using the -asm option, modify the assembly file by introducing the timestamp register opcode and then generate the binary from the modified assembly. I would be able to load the modified binary through the  clCreateProgramWithBinary. I am not sure if this is possible. But if it is doable then I would very much appreciate if someone could provide some sort of working example as to how I can do this, specially introducing the timestamp register by modifying the generated assembly from ioc64 and then creating the binary file from the modified assembly. 

Also as I have stated my goal is to read the timestamp register, if this similar thing can be achieved by any other method then that would do as well. All I want to gather is the execution time of a portion of my code during runtime. Please let me know if any further information is required. Thank you.

0 Kudos
1 Solution
Ben_A_Intel
Employee
2,946 Views

Dutta, Sankha wrote:

I changed to tag igc_release_2018-12-12 which compiled and installed fine. On the other hand, I have also followed the ubuntu installation procedure of 19.11.12599 release version of opencl runtime and installed all the debian packages.

...

when I am trying to execute my opencl program using clCreateProgramWithSource() then I am getting JIT compilation error.

My guess is that the compiler from December is incompatible with the driver from March and the incompatibility is being flagged as a JIT compilation error.  Since it sounds like you're using an older compiler due to compilation errors with newer code, could you please try a corresponding older driver?

To be on the safe side, try to get as close to December 12th as you can, for example:

https://github.com/intel/compute-runtime/releases/tag/18.51.12049

It looks like any driver older than February 19th may work, however:

https://github.com/intel/intel-graphics-compiler/commits/master/IGC/AdaptorOCL/ocl_igc_shared/executable_format/patch_list.h

View solution in original post

0 Kudos
24 Replies
Ben_A_Intel
Employee
2,947 Views

Dutta, Sankha wrote:

I changed to tag igc_release_2018-12-12 which compiled and installed fine. On the other hand, I have also followed the ubuntu installation procedure of 19.11.12599 release version of opencl runtime and installed all the debian packages.

...

when I am trying to execute my opencl program using clCreateProgramWithSource() then I am getting JIT compilation error.

My guess is that the compiler from December is incompatible with the driver from March and the incompatibility is being flagged as a JIT compilation error.  Since it sounds like you're using an older compiler due to compilation errors with newer code, could you please try a corresponding older driver?

To be on the safe side, try to get as close to December 12th as you can, for example:

https://github.com/intel/compute-runtime/releases/tag/18.51.12049

It looks like any driver older than February 19th may work, however:

https://github.com/intel/intel-graphics-compiler/commits/master/IGC/AdaptorOCL/ocl_igc_shared/executable_format/patch_list.h

0 Kudos
Dutta__Sankha
Beginner
541 Views

@Ben Ashbaugh

Thank you so much for your reply. That did the job. That was really really helpful and I cannot thank you enough. Now I am able to use the overloaded functions without any compilation error. I have few follow up questions though which are as follows.

1. There is an overloaded function intel_get_eu_id which I am guessing is used to get the EU id. My GPU is gen9 which has got only 8 EUs/subslice. However, I can see EU id values of like 8, 9, 10 and 11 as I thought it should be up to 7 (0 - 7). So I am confused how to interpret this value and was wondering if you could help me with that.

2. What is the difference between  intel_get_eu_idintel_get_eu_thread_id and intel_get_hw_thread_id?

3. How can I view the .gen file of the kernel ISA that would include this overloaded functions as well. I would need to make sure that the cycle value that I am getting is indeed through tm0 register.

4. So In the Ibif_impl.cl file there are some more overloaded functions that I am intending to use if it comes out to be the same as I think it is. I was wondering if you could give me some idea about the purpose of the following overloaded functions and how can I use them.

__attribute__((always_inline)) int OVERLOADABLE intel_get_active_channel_mask( void )
__attribute__((always_inline)) uint OVERLOADABLE intel_set_dbg_register(uint dbg0_0)
__attribute__((always_inline)) uint OVERLOADABLE intel_get_grf_register( uint value )
__attribute__((always_inline)) uint OVERLOADABLE intel_get_flag_register( uint flag )
__attribute__((always_inline)) uint OVERLOADABLE intel_get_control_register

Thank you again for for all the help.

0 Kudos
Ben_A_Intel
Employee
541 Views

Dutta, Sankha wrote:

Now I am able to use the overloaded functions without any compilation error.

Excellent, very happy to hear it worked!

Dutta, Sankha wrote:

I have few follow up questions though which are as follows.

1. There is an overloaded function intel_get_eu_id which I am guessing is used to get the EU id. My GPU is gen9 which has got only 8 EUs/subslice. However, I can see EU id values of like 8, 9, 10 and 11 as I thought it should be up to 7 (0 - 7). So I am confused how to interpret this value and was wondering if you could help me with that.

2. What is the difference between  intel_get_eu_idintel_get_eu_thread_id and intel_get_hw_thread_id?

These values come straight from the HW "state registers".  For some of these IDs, they're guaranteed to be unique, but they may not necessarily be contiguous.  There's a bit more information in the state register description in the programmer's reference manual:

https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf (around page 750)

Dutta, Sankha wrote:

3. How can I view the .gen file of the kernel ISA that would include this overloaded functions as well. I would need to make sure that the cycle value that I am getting is indeed through tm0 register.

Can you use the Kernel ISA feature from the Intercept Layer?

https://github.com/intel/opencl-intercept-layer/blob/master/docs/kernel_isa_gpu.md

Dutta, Sankha wrote:

4. So In the Ibif_impl.cl file there are some more overloaded functions that I am intending to use if it comes out to be the same as I think it is. I was wondering if you could give me some idea about the purpose of the following overloaded functions and how can I use them.

We mostly added these to help with debugging.  They map more-or-less directly to the EU HW registers.  As a side note, I recall that get_grf_register and get_flag_register may not have been implemented, so I'd advise checking before relying on them.

0 Kudos
Dutta__Sankha
Beginner
541 Views

@Ben Ashbaugh

Thank you so much for all the details. I will close the issue in IGC git repo as well. Thank you again. 

0 Kudos
Reply