GPU Compute Software
Ask questions about Intel® Graphics Compute software technologies, such as OpenCL* GPU driver and oneAPI Level Zero

Driver problem with OpenCL kernel

Jinchuan_Tang
Beginner
1,821 Views

Dear Intel OpenCL GPU team,

I would like to ask for your help with using your OpenCL driver on the Xe iGPU (Intel Core i7-1165G7).

I was a minor contributor to octave-ocl, through some testing and coding, and I keep my own fork (https://sourceforge.net/u/tangjinchuan/octave-ocl-gzu/ci/default/tree/). The original octave-ocl project, maintained by Matt, aims to bring gpuArray support to Octave, the open-source alternative to MATLAB.

About a year ago, I found that the Intel GPU driver (on a 7200U CPU with an iGPU) produces wrong results with octave-ocl release-1.1.0 when trying to invert a gpuArray in Octave; this did not happen with an Nvidia GPU. I reported the problem to Matt, who later added a workaround, shown in https://sourceforge.net/p/octave-ocl/code/ci/aeb2f8f98555564f896e4d8fb1bf979eb5fc0397/. In the fixed code after 1.1.1, the % operations in the switch statement were replaced with a >> operation with different parameters, which makes the compare kernel work correctly.

But today, I tried the older version 1.1.0, which contains the problem, on both the Intel CPU and GPU OpenCL drivers, in the belief that the drivers may have improved. I found that the CPU driver produces correct results without the workaround, while the GPU driver still has the problem. So I would like to know: is there something wrong with the OpenCL GPU driver that prevents us from using the previous compare kernel, i.e. the one that the workaround replaced?

 

Best wishes,

Jinchuan Tang

To reproduce the problem:

1. Please download octave-ocl release-1.1.0 from https://sourceforge.net/projects/octave-ocl/files/, as well as GNU Octave.

2. Then, from the download directory, install the package at the Octave command line:

>>pkg install ocl-1.1.0.tar.gz 

3. Load the package and select the GPU:

>> pkg load ocl % load octave package 
>> ocl_context('device_selection','GPU0') % select GPU0 as the OpenCL working device for octave ocl.
You can use ocl_context('get_resources') to list all available devices. GPUn refers to the GPU device with index n when multiple GPUs exist.
 

The problem is that when an Intel GPU device is used to invert a matrix, it produces wrong results: strangely, it only inverts elements from 1 to 0, and not from 0 to 1.

>> pkg load ocl
>> ocl_context('device_selection','GPU0')
>> B = [1 1 0; 1 0 1; 1 0 0]
 
B =
 
  1 1 0
  1 0 1
  1 0 0
 
>> C=gpuArray(B)
 
C = 2-dimensional OCL array (3x3) of class double (double)
>> D = single(B);
 
>> E = gpuArray(D);
>> gather(E)
ans =
 
  1 1 0
  1 0 1
  1 0 0
 
>> !E
ans = 2-dimensional OCL array (3x3) of class single (float)
>> gather(ans)
ans =
 
  0 0 0
  0 0 0
  0 0 0
 

>>
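For reference, the expected result of !E here is [0 0 1; 0 1 0; 0 1 1]. A minimal plain-C sketch of element-wise logical NOT follows; the helper name logical_not is hypothetical and this is not code from octave-ocl, only a CPU-side illustration of what the kernel should compute.

```c
#include <stddef.h>

/* Element-wise logical NOT: out[i] is 1 where in[i] is zero, otherwise 0.
   Hypothetical helper for illustration only; not part of octave-ocl. */
void logical_not(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = (in[i] == 0.0f) ? 1.0f : 0.0f;
    }
}
```

Applied to the nine elements of B above, every 1 becomes 0 and every 0 becomes 1, whereas the Intel GPU run returned all zeros.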

 

 

 

10 Replies
NoorjahanSk_Intel
Moderator
1,773 Views

Hi,

Thanks for reaching out to us.


We are also able to reproduce the issue from our end with OpenCL.


Could you please try with Level Zero and let us know if you still face the same issue?

Please also provide the steps you followed to try Level Zero.


Thanks & Regards,

Noorjahan.


Jinchuan_Tang
Beginner
1,704 Views

Dear Noorjahan,

Many thanks for recommending the Level Zero API!

It would take time for us to learn a new API, and we are currently interested in OpenCL only. The project chose OpenCL 1.1 so that it could cover more manufacturers and new players. I do not have the resources for vendor-specific APIs such as Metal, CUDA, or HIP unless they are fully adopted by other vendors. We know Apple has dropped support for OpenCL. However, macOS on the M1 still supports OpenCL 1.2, and I am not really worried about Apple's plans because it has almost no HPC market. Furthermore, I can share with you that OpenCL could be the compute API of the future Chinese market once the Chinese GPU makers come into play, whether they are home-grown companies or use IP from companies such as Imagination Tech. or others.

Best wishes,

Jinchuan

Ben_A_Intel
Employee
1,648 Views

Hi Jinchuan, thanks for reaching out!  I've also been able to reproduce your issue and I've passed it along to the compiler team for additional debug.

The issue seems to be related to the second switch statement:

  switch (fcn / 10) {                                        
    case 0: res = (o1 && o2); break;                         
    case 1: res = (o1 || o2); break;                         
    case 2: res = (!o1); break;                              
  }  

Our GPUs do not have a native 64-bit divide and it looks like there's an interaction between the emulated 64-bit divide and the switch statement.  If it's helpful, casting to a 32-bit value before dividing (switch ((uint)fcn / 10)) seems to generate the correct results.
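To illustrate the suggested cast, here is a minimal plain-C sketch that runs on the host, not on the GPU. It assumes fcn is an unsigned 64-bit code and that o1 and o2 are already reduced to 0 or 1; the function names are hypothetical. It only shows that narrowing fcn to 32 bits before the divide selects the same case as the original 64-bit divide for the small codes involved, so the workaround does not change the intended logic.

```c
#include <stdint.h>

/* Original form: a 64-bit divide feeds the switch. */
int logic_op_div64(uint64_t fcn, int o1, int o2)
{
    int res = 0;
    switch (fcn / 10) {
        case 0: res = (o1 && o2); break;
        case 1: res = (o1 || o2); break;
        case 2: res = (!o1); break;
    }
    return res;
}

/* Workaround form: narrow to 32 bits first, so the divide is 32-bit. */
int logic_op_div32(uint64_t fcn, int o1, int o2)
{
    int res = 0;
    switch ((uint32_t)fcn / 10) {
        case 0: res = (o1 && o2); break;
        case 1: res = (o1 || o2); break;
        case 2: res = (!o1); break;
    }
    return res;
}

/* Count inputs where the two forms disagree, over fcn 0..29
   (an assumed range) and all 0/1 operand pairs. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (uint64_t fcn = 0; fcn < 30; ++fcn) {
        for (int o1 = 0; o1 <= 1; ++o1) {
            for (int o2 = 0; o2 <= 1; ++o2) {
                if (logic_op_div64(fcn, o1, o2) != logic_op_div32(fcn, o1, o2)) {
                    ++mismatches;
                }
            }
        }
    }
    return mismatches;
}
```

On a conforming compiler the two forms agree for every fcn that fits in 32 bits, so count_mismatches() returns 0. The cast would only change behaviour if fcn could exceed the 32-bit range.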

Jinchuan_Tang
Beginner
1,627 Views

Dear Ben, 

Thank you very much for sharing this important info. I hope your compiler team will manage to fix the emulation someday.

We will be careful with future implementations. As I remember, the i5 7200U does have the FP64 extension, but it still produced wrong results. I realize that since Gen 11, your iGPUs no longer have OpenCL's FP64 extension. This is a pity, since I have bought many 11th- and 12th-gen PCs for my lab since last year. Recently, I also read news that your new Arc discrete cards will not support FP64; is this true? I have plans to purchase more computing equipment for my lab. By the way, I notice that our AMD APU-based PCs, such as the 5700G, have the FP64 extension, and their double precision works fine. Nvidia's 3800/3900 seem to have no such problem. On the other hand, Apple's M1 also does not have the FP64 extension.
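Whether a device exposes FP64 can be checked at run time by looking for cl_khr_fp64 in the space-separated extension string that clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS. Below is a minimal sketch of just the token check in plain C; the helper name has_extension is hypothetical, and the OpenCL query itself is left out so the snippet needs no OpenCL headers.

```c
#include <string.h>

/* Return 1 if `name` appears as a whole space-separated token in
   `extensions`, otherwise 0. Hypothetical helper for illustration. */
int has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0) {
        return 0;
    }
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or string ends. */
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token) {
            return 1;
        }
        ++p;
    }
    return 0;
}
```

A caller would pass the string obtained from the device and "cl_khr_fp64" as the name. Matching whole tokens avoids a false positive on a longer extension name that merely contains the same prefix.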

 

Best wishes,

Jinchuan

Ben_A_Intel
Employee
1,561 Views

Hi Jinchuan, it appears that this issue is already fixed in our very latest drivers - always nice when that happens!

Can you please try with:

https://github.com/intel/compute-runtime/releases/tag/22.09.22577

The previous release 22.08.22549 may be OK too, but 22.05.22297 is too old.

Thanks!

Jinchuan_Tang
Beginner
1,544 Views

Dear Ben,

This is fantastic! You have increased one customer's loyalty!

I will try it soon. In the meantime, could you please also push the Windows driver team to include the latest OpenCL driver in their graphics driver? My 1165G7 with the latest graphics driver, 30.0.101.1340 DCH / Windows 11 64-bit (released on Feb 03 2022), still has an OpenCL runtime with version 2021.13.11.0.23_160000.

Thank you very much!

 

Best wishes,

Jinchuan

Jinchuan_Tang
Beginner
1,388 Views

Dear Hogan,

Thank you for your beautiful polishing. As an engineer/computer scientist, I ask you to forgive my English: I have to supervise postgraduate students during the day and then work on problems in the open-source project and elsewhere, often late in the evening until 2 am, so I really have no energy to care about anything but the problem itself.

I remember visiting Shakespeare's hometown in the UK, which I had long admired, and I admire people who can use a language beautifully. Interestingly, in Shakespeare's hometown there is a Peony Pavilion, donated by the people of the hometown of Tang Xianzu, who has been called China's Shakespeare. We inherited the same ancient surname Tang (汤), so I guess my genes for languages have given way to working on scientific problems that could bring better living conditions and understanding to humanity.

Best wishes,

Jinchuan Tang

NoorjahanSk_Intel
Moderator
1,270 Views

Hi,


>>could you please also push the Windows driver team to have the latest OpenCL driver in their Graphic driver.


It’s the same driver codebase so the fix will be part of the Windows driver eventually.

As of now, we do not have a good estimate of when the fix will make it into a production Windows release.


As your issue is resolved, could you please confirm whether we can go ahead and close this issue?



Thanks & Regards,

Noorjahan.


Jinchuan_Tang
Beginner
1,207 Views
Dear all,
Sure, many thanks! Please close it as solved.
Good luck with your upcoming Arc products!
Best wishes,
Jinchuan
NoorjahanSk_Intel
Moderator
1,158 Views

Hi,


Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.



Thanks & Regards,

Noorjahan

