I'm attempting to implement a simple barrel shifter. I can get it to work using a task-based (single work-item) kernel, but not using an NDRange kernel. I've posted both kernels below. I'd appreciate some insight into how the implementation of the NDRange approach differs from the task.
__kernel
__attribute__((reqd_work_group_size(360,1,1)))
__attribute__((max_work_group_size(360)))
void barrelShifter_nd( __global unsigned char * restrict dataIn,
                       const uint shift,
                       __global unsigned char * restrict dataOut )
{
    char __attribute((register)) din[720];
    uint gid = get_global_id(0);

    din[gid]     = dataIn[gid] & 0x1;
    din[360+gid] = dataIn[gid] & 0x1;

    dataOut[gid] = din[shift+gid];
}

__kernel
__attribute__((task))
void barrelShifter_task( __global unsigned char * restrict dataIn,
                         const uint shift,
                         __global unsigned char * restrict dataOut )
{
    char __attribute((register)) din[720];

    for(uint i=0; i<360; i++)
    {
        din[i]     = dataIn[i] & 0x1;
        din[360+i] = dataIn[i] & 0x1;
    }

    #pragma unroll
    for(uint i=0; i<360; i++)
    {
        dataOut[i] = din[shift+i];
    }
}
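For checking hardware output, the intended behavior of the task kernel can be written as a plain C reference model. This is only a sketch: the helper name `barrel_shift_ref` and the macro `N` are hypothetical and not part of the original post, and it assumes `shift` never exceeds 360 so that `shift + i` stays inside the doubled buffer.

```c
#include <assert.h>
#include <stddef.h>

#define N 360  /* element count, taken from the kernels above */

/* Host-side reference model of barrelShifter_task (hypothetical helper).
 * Assumes shift <= N so that shift + i stays inside the 2*N buffer. */
void barrel_shift_ref(const unsigned char *data_in, unsigned shift,
                      unsigned char *data_out)
{
    unsigned char din[2 * N];

    assert(shift <= N);

    /* Store the least significant bit of each input twice, back to back. */
    for (size_t i = 0; i < N; i++) {
        din[i]     = data_in[i] & 0x1;
        din[N + i] = data_in[i] & 0x1;
    }

    /* Reading the doubled buffer at an offset gives a rotation by shift. */
    for (size_t i = 0; i < N; i++) {
        data_out[i] = din[shift + i];
    }
}
```

Under that assumption, `data_out[i]` equals the low bit of `data_in[(i + shift) % 360]`, which is the result the NDRange kernel would also need to produce.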
1 Reply
Shift registers of any type cannot be inferred in NDRange kernels. Per the OpenCL specification, this kernel type gives no guarantee of work-item (thread) execution order, and local memory consistency is guaranteed only at barriers. A shift register cannot be described in an NDRange kernel while conforming to both of these conditions.
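A related way to see why the NDRange kernel in the question behaves differently: an array declared inside a kernel without an address-space qualifier is private, so each work-item gets its own copy of `din` and only ever writes two of its elements. The plain C sketch below models that, under stated assumptions: the function names and the `UNWRITTEN` sentinel are hypothetical (on real hardware an unwritten element is simply undefined), `shift` is assumed to be at most 360, and the direct-index variant is one possible workaround, not a method confirmed in this thread.

```c
#include <assert.h>
#include <string.h>

#define N 360
#define UNWRITTEN 0xFF  /* sentinel standing in for an undefined element */

/* Models barrelShifter_nd: one loop iteration plays one work-item,
 * and din is re-created per iteration because it is private. */
void nd_private_model(const unsigned char *data_in, unsigned shift,
                      unsigned char *data_out)
{
    assert(shift <= N);
    for (unsigned gid = 0; gid < N; gid++) {
        unsigned char din[2 * N];
        memset(din, UNWRITTEN, sizeof din);

        /* This work-item fills only its own two slots. */
        din[gid]     = data_in[gid] & 0x1;
        din[N + gid] = data_in[gid] & 0x1;

        /* Any other slot was never written by this work-item. */
        data_out[gid] = din[shift + gid];
    }
}

/* Hypothetical alternative: each work-item computes its source index
 * directly, so no array shared between work-items is needed. */
void nd_direct_model(const unsigned char *data_in, unsigned shift,
                     unsigned char *data_out)
{
    assert(shift <= N);
    for (unsigned gid = 0; gid < N; gid++) {
        data_out[gid] = data_in[(gid + shift) % N] & 0x1;
    }
}
```

In this model the private-array version returns written data only when `shift` is 0 or 360, while the direct-index version produces the same rotation as the task kernel for any `shift` up to 360.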