OpenCL* for CPU

Persistent thread on the CPU

Polar01
Beginner
Hi,
I'm trying to implement "persistent threads" on the CPU to batch a set of tasks, but I'm getting some strange results.
I have put some printf calls in the following code. What is strange is that I see "BEFORE" printed before "START"!
There is a local barrier before the loop, so I should see "START" first!
{
    const size_t lid = get_local_id(0);
    __local volatile int localPoolNextRay[1];
    __local volatile int localPoolRayCount[1];
    if (lid < 1) {
        localPoolNextRay[0] = localPoolRayCount[0] = 0;
        printf("START %d %d : %d\n", get_global_id(0), lid, localPoolNextRay[0]);
    }
    barrier(CLK_LOCAL_MEM_FENCE);
    while (true)
    {
        // Local pool is empty
        if (localPoolRayCount[0] < 1 && lid < 1)
        {
            localPoolNextRay[0] = atom_add(globalPoolNextRay, LOAD_BALANCER_BATCH_SIZE);
            localPoolRayCount[0] = LOAD_BALANCER_BATCH_SIZE;
        }
        mem_fence(CLK_LOCAL_MEM_FENCE);
        //barrier(CLK_LOCAL_MEM_FENCE);
        printf("BEFORE %d %d : %d\n", get_global_id(0), lid, localPoolNextRay[0]);
        // Get rays from local pool
        int myRayIndex = localPoolNextRay[0] + lid;
        if (myRayIndex >= globalPoolRayCount)
            return;
        printf("AFTER %d\n", myRayIndex);
        mem_fence(CLK_LOCAL_MEM_FENCE);
        if (lid < 1)
        {
            localPoolNextRay[0] += 32;
            localPoolRayCount[0] -= 32;
            //mem_fence(CLK_LOCAL_MEM_FENCE);
        }
        mem_fence(CLK_LOCAL_MEM_FENCE);
        // Execute
        trace(myRayIndex, tasks);
    }
}
Evgeny_F_Intel
Employee
Hi,

How many work-groups do you have?

Can you post a small reproducer for this issue?

Thanks,
Evgeny
Polar01
Beginner
Hi,
Global work = 704
Local work = 32
I'm able to run it correctly, but to do so I have to put barriers everywhere! In fact, because there is no "natural" SIMT behavior on the CPU, I'm not sure that persistent threads will help on the CPU (lots of barriers or atomic functions)!
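As a rough host-side illustration of where those barriers fall, the plain C sketch below steps several work-groups through the loop and marks in comments the points where the kernel would need barrier(CLK_LOCAL_MEM_FENCE). Everything in it is an assumption made for illustration: BATCH_SIZE stands in for LOAD_BALANCER_BATCH_SIZE (its value is not given in this thread), traced[] stands in for trace(), and the round-robin order in simulate() is just one possible interleaving of work-groups, not how any particular runtime schedules them.

```c
#include <assert.h>
#include <string.h>

#define LOCAL_SIZE 32    /* work-items per work-group, as in this thread      */
#define BATCH_SIZE 64    /* assumed value of LOAD_BALANCER_BATCH_SIZE         */
#define MAX_GROUPS 64    /* illustrative upper bounds for the simulation      */
#define MAX_RAYS   8192

typedef struct {
    int next_ray;   /* plays the role of localPoolNextRay[0]       */
    int ray_count;  /* plays the role of localPoolRayCount[0]      */
    int done;       /* set once the whole group has left the loop  */
} LocalPool;

static int global_next_ray;     /* plays the role of *globalPoolNextRay */
static int traced[MAX_RAYS];    /* how many times each ray was traced   */

/* One pass through the while(true) body for a single work-group. */
static void step_group(LocalPool *pool, int total_rays)
{
    /* Region 1: only lid 0 refills the local pool (atom_add in the kernel). */
    if (pool->ray_count < 1) {
        pool->next_ray = global_next_ray;
        global_next_ray += BATCH_SIZE;
        pool->ray_count = BATCH_SIZE;
    }
    /* Kernel: barrier(CLK_LOCAL_MEM_FENCE) so every lid sees the refill. */

    /* Region 2: every work-item reads the pool and traces its own ray. */
    int active = 0;
    for (int lid = 0; lid < LOCAL_SIZE; ++lid) {
        int ray = pool->next_ray + lid;
        if (ray < total_rays) {
            traced[ray]++;          /* hypothetical stand-in for trace() */
            active++;
        }
    }
    /* Kernel: barrier(CLK_LOCAL_MEM_FENCE) so all reads finish first. */

    /* Region 3: only lid 0 advances the local pool. */
    pool->next_ray  += LOCAL_SIZE;
    pool->ray_count -= LOCAL_SIZE;

    /* The group leaves the loop as a whole once no work-item had a ray. */
    if (active == 0)
        pool->done = 1;
}

/* Interleaves the groups round-robin, one loop pass at a time.
 * Returns 1 if every ray in [0, total_rays) was traced exactly once. */
static int simulate(int total_rays, int num_groups)
{
    assert(num_groups >= 1 && num_groups <= MAX_GROUPS);
    assert(total_rays >= 0 && total_rays <= MAX_RAYS);

    LocalPool pools[MAX_GROUPS];
    memset(pools, 0, sizeof pools);
    memset(traced, 0, sizeof traced);
    global_next_ray = 0;

    int remaining = num_groups;
    while (remaining > 0) {
        for (int g = 0; g < num_groups; ++g) {
            if (pools[g].done)
                continue;
            step_group(&pools[g], total_rays);
            if (pools[g].done)
                remaining--;
        }
    }

    for (int r = 0; r < total_rays; ++r)
        if (traced[r] != 1)
            return 0;
    return 1;
}
```

With the sizes from this thread (704 / 32 = 22 work-groups) and a hypothetical pool of 1000 rays, simulate(1000, 22) returns 1, meaning each ray is handed out exactly once. The sketch lets a whole group leave the loop together instead of having individual work-items return, because OpenCL requires a barrier to be reached by all work-items of a work-group or by none of them.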
Eli_Bendersky__Intel
Hi Polar01,

Since you have several work-groups executing simultaneously, the 'BEFORE' printout could come from another work-group (from a work-item that has lid >= 1 and therefore doesn't print 'START').
Evgeny_F_Intel
Employee

Good to hear it's running well.
