The kernel runs correctly when the program is built with clCreateProgramWithSource, but it deadlocks when the program is created with clCreateProgramWithBinary. The work-group size is (8,8). I can reproduce the deadlock on Intel Graphics: Windows 8.1 with HD 4600/530 and Windows 10 with HD 630.
local int SumAlpha[16*16/4];
int index = 1*get_local_id(0) + 16/2*get_local_id(1);
......
SumAlpha[index] = sum_alpha;
barrier(CLK_LOCAL_MEM_FENCE);
// do reduction in shared mem
for(int i = (16*16/8); i > 0; i >>= 1)
{
    if(index < i)
    {
        SumAlpha[index] += SumAlpha[index + i];
    }
    barrier(CLK_LOCAL_MEM_FENCE); // unroll loop show deadlock when i==8
}
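To rule out an indexing or bounds problem in the reduction itself, the loop can be emulated sequentially on the host. This is only a sketch in plain C, not the kernel: the helper names `flat_index` and `emulate_reduction` are made up for illustration, and the input array stands in for the per-work-item `sum_alpha` values, whose real computation is elided above. Each pass of the inner loop plays the role of all work-items running between two barriers.

```c
#include <assert.h>

#define LOCAL_X 8               /* work-group size from the post: (8,8) */
#define LOCAL_Y 8
#define SUM_LEN (16 * 16 / 4)   /* same length as SumAlpha in the kernel */

/* Same mapping as the kernel: 1*get_local_id(0) + 16/2*get_local_id(1). */
static int flat_index(int lx, int ly)
{
    return 1 * lx + 16 / 2 * ly;
}

/* Hypothetical host-side emulation of the reduction loop.
 * sum_alpha_per_item[index] is a placeholder for each work-item's sum_alpha.
 * Returns the value the kernel would leave in SumAlpha[0]. */
static int emulate_reduction(const int *sum_alpha_per_item)
{
    int SumAlpha[SUM_LEN];

    /* Store phase: every work-item writes its own slot. */
    for (int ly = 0; ly < LOCAL_Y; ++ly) {
        for (int lx = 0; lx < LOCAL_X; ++lx) {
            int index = flat_index(lx, ly);
            assert(index >= 0 && index < SUM_LEN);
            SumAlpha[index] = sum_alpha_per_item[index];
        }
    }

    /* Reduction phase: one outer iteration per barrier-separated step.
     * Within a step, reads come from [i, 2i) and writes go to [0, i),
     * so running the work-items one after another gives the same result. */
    for (int i = 16 * 16 / 8; i > 0; i >>= 1) {
        for (int index = 0; index < LOCAL_X * LOCAL_Y; ++index) {
            if (index < i) {
                assert(index + i < SUM_LEN);
                SumAlpha[index] += SumAlpha[index + i];
            }
        }
    }
    return SumAlpha[0];
}
```

With an (8,8) work-group the index covers 0..63 exactly once and no access leaves the array. Every work-item also reaches the barrier inside the loop, because the loop bounds do not depend on the work-item. On that reading the source itself looks well-formed, so the difference would have to come from how the binary path compiles it.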
The command used to create the SPIR binary is:
"..\vendors\OpenCL\Intel\bin\ioc32.exe -cmd=build -input="xxx" -device=gpu -spir32="FileName%" -bo="-cl-std=CL1.2 -I %ProjetcDir%\cl"
I also tried '-cl-opt-disable' and got the same deadlock. I tried the equivalent code below as well, and it also deadlocks:
a = SumAlpha[index];
b = SumAlpha[index + i];
SumAlpha[index] = a + b;
I can use the dot built-in to avoid the reduction, but I would like to understand what is going on. Did I miss something? Is there any way to dump the Gen assembly for both paths so I can compare them? Many thanks.