Community
cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Beginner
230 Views

Modification of gen assembly and querying the time stamp register

Jump to solution

Hello

I am using OpenCL on Gen9.5 architecture and I am using i7-7700k processor. I want to have something similar to clock function inside my kernel. What I want to do is something similar to

__kernel void TimeMeasurementKernel(){
  //get current clock value
  t0 = clock();
	//someoperations
  t1 = clock();
  time_elapsed_to_execute_operations = t1 - t0;
}

But it seems that intel doesn't provide any means to get the clock value from inside the OpenCL kernel. But there is a timestamp architecture register tm0 as mentioned in here. My goal is to query this timestamp register from inside the kernel. So one of the way that I thought of, if possible, then to generate the .gen assembly file first by using ioc64 offline compiler using the -asm option, modify the assembly file by introducing the timestamp register opcode and then generate the binary from the modified assembly. I would be able to load the modified binary through the  clCreateProgramWithBinary. I am not sure if this is possible. But if it is doable then I would very much appreciate if someone could provide some sort of working example as to how I can do this, specially introducing the timestamp register by modifying the generated assembly from ioc64 and then creating the binary file from the modified assembly. 

Also as I have stated my goal is to read the timestamp register, if this similar thing can be achieved by any other method then that would do as well. All I want to gather is the execution time of a portion of my code during runtime. Please let me know if any further information is required. Thank you.

0 Kudos
24 Replies
Highlighted
Beginner
38 Views

@ben ashbaugh 

Thank you so much for clearing up the usage methodology. Your comment was very clear and really helpful. At the very beginning let me mention that I have privilege access and I have tried what you have mentioned but there is a JIT compilation error that is showing up. Let me give you some insight on what I am doing. So I have download the IGC from source and since the current version was giving (which made me to post the issue on the first place), I changed to tag igc_release_2018-12-12 which compiled and installed fine. On the other hand, I have also followed the ubuntu installation procedure of 19.11.12599 release version of opencl runtime and installed all the debian packages.

Now there are 2 approaches that I have taken as I have discussed below:

1. The first approach is to use the libigc from the IGC  that I built:

So as per your comment, I have copied the libigc.so from my igc built (which is in dump64/igc/ directory )to /usr/lib and /usr/local/lib. After copying, when I am trying to execute my opencl program using clCreateProgramWithSource() then I am getting JIT compilation error. However, there are no details about the error that could be obtained by clGetProgramBuildInfo. 

This is the outpur from clGetProgramBuildInfo 

==========ERROR=========

=======================
2. The second approach is to install the debian files in opencl runtime release and use the prepackaged igc.

So when I am using the ubuntu installation procedure after downloading the debian packages (specially intel-igc-core_19.11.1622_amd64.deb) as mentioned in here (sudo dpkg -i intel-igc-core_19.11.1622_amd64.deb), my opencl program got compiled without any error and ran fine (without having the overloaded function inside opencl kernel). But if I include the overloaded functions as mentioned below

ulong __attribute__((overloadable)) intel_get_cycle_counter( void );
__kernel void vec_add(__global int *out, __global const int *in1, __global const int *in2) {
  int i = get_global_id(0);
  ulong dst =  intel_get_cycle_counter();
  out = in1 + in2;
}

Then the error in JIT is 

unknown mangling!
UNREACHABLE executed at /home/duttasankha/Desktop/SANKHA_ALL/INTEL_GRAPHICS_COMPILER/llvm_source/projects/llvm-spirv/lib/SPIRV/OCLUtil.cpp:178!
Aborted (core dumped)

The reason for this error is understandable as here the IGC that came with the debian package is installed, this error is showing which would not be there if I use the IGC that I built. 

But again if I copy the libigc.so to /usr/lib or do sudo make install from the build folder from IGC workspace folder then it is not compiling. So I can see that whenever I am trying to use the IGC that I built is throwing a JIT compilation error, but when I am using the debian packages, it compiles file. I am not sure what is the issue in here that when I am using the libigc from my built is failing but using the debian packages from the opencl release is successful. It would be very much helpful if you could provide me some sort of idea as for the reason of this. Thank you.

 

0 Kudos
Highlighted
Employee
236 Views

Dutta, Sankha wrote:

I changed to tag igc_release_2018-12-12 which compiled and installed fine. On the other hand, I have also followed the ubuntu installation procedure of 19.11.12599 release version of opencl runtime and installed all the debian packages.

...

when I am trying to execute my opencl program using clCreateProgramWithSource() then I am getting JIT compilation error.

My guess is that the compiler from December is incompatible with the driver from March and the incompatibility is being flagged as a JIT compilation error.  Since it sounds like you're using an older compiler due to compilation errors with newer code, could you please try a corresponding older driver?

To be on the safe side, try to get as close to December 12th as you can, for example:

https://github.com/intel/compute-runtime/releases/tag/18.51.12049

It looks like any driver older than February 19th may work, however:

https://github.com/intel/intel-graphics-compiler/commits/master/IGC/AdaptorOCL/ocl_igc_shared/execut...

View solution in original post

0 Kudos
Highlighted
Beginner
38 Views

@Ben Ashbaugh

Thank you so much for your reply. That did the job. That was really really helpful and I cannot thank you enough. Now I am able to use the overloaded functions without any compilation error. I have few follow up questions though which are as follows.

1. There is an overloaded function intel_get_eu_id which I am guessing is used to get the EU id. My GPU is gen9 which has got only 8 EUs/subslice. However, I can see EU id values of like 8, 9, 10 and 11 as I thought it should be up to 7 (0 - 7). So I am confused how to interpret this value and was wondering if you could help me with that.

2. What is the difference between  intel_get_eu_idintel_get_eu_thread_id and intel_get_hw_thread_id?

3. How can I view the .gen file of the kernel ISA that would include this overloaded functions as well. I would need to make sure that the cycle value that I am getting is indeed through tm0 register.

4. So In the Ibif_impl.cl file there are some more overloaded functions that I am intending to use if it comes out to be the same as I think it is. I was wondering if you could give me some idea about the purpose of the following overloaded functions and how can I use them.

__attribute__((always_inline)) int OVERLOADABLE intel_get_active_channel_mask( void )
__attribute__((always_inline)) uint OVERLOADABLE intel_set_dbg_register(uint dbg0_0)
__attribute__((always_inline)) uint OVERLOADABLE intel_get_grf_register( uint value )
__attribute__((always_inline)) uint OVERLOADABLE intel_get_flag_register( uint flag )
__attribute__((always_inline)) uint OVERLOADABLE intel_get_control_register

Thank you again for for all the help.

0 Kudos
Highlighted
Employee
38 Views

Dutta, Sankha wrote:

Now I am able to use the overloaded functions without any compilation error.

Excellent, very happy to hear it worked!

Dutta, Sankha wrote:

I have few follow up questions though which are as follows.

1. There is an overloaded function intel_get_eu_id which I am guessing is used to get the EU id. My GPU is gen9 which has got only 8 EUs/subslice. However, I can see EU id values of like 8, 9, 10 and 11 as I thought it should be up to 7 (0 - 7). So I am confused how to interpret this value and was wondering if you could help me with that.

2. What is the difference between  intel_get_eu_idintel_get_eu_thread_id and intel_get_hw_thread_id?

These values come straight from the HW "state registers".  For some of these IDs, they're guaranteed to be unique, but they may not necessarily be contiguous.  There's a bit more information in the state register description in the programmer's reference manual:

https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf (around page 750)

Dutta, Sankha wrote:

3. How can I view the .gen file of the kernel ISA that would include this overloaded functions as well. I would need to make sure that the cycle value that I am getting is indeed through tm0 register.

Can you use the Kernel ISA feature from the Intercept Layer?

https://github.com/intel/opencl-intercept-layer/blob/master/docs/kernel_isa_gpu.md

Dutta, Sankha wrote:

4. So In the Ibif_impl.cl file there are some more overloaded functions that I am intending to use if it comes out to be the same as I think it is. I was wondering if you could give me some idea about the purpose of the following overloaded functions and how can I use them.

We mostly added these to help with debugging.  They map more-or-less directly to the EU HW registers.  As a side note, I recall that get_grf_register and get_flag_register may not have been implemented, so I'd advise checking before relying on them.

0 Kudos
Highlighted
Beginner
38 Views

@Ben Ashbaugh

Thank you so much for all the details. I will close the issue in IGC git repo as well. Thank you again. 

0 Kudos