Weird behaviour of atomic functions

mheimel · 05-10-2012

Hi,

I tried implementing a simple lock/mutex using OpenCL atomic functions. The following kernel demonstrates the general idea:

__kernel void lock() {
    volatile __local int mutex;

    // Every thread waits until it gets the lock.
    int done = 0;
    while (!done) {
        // Try to get the lock: atomic_cmpxchg returns the old value,
        // so a result of 0 means this work-item changed the mutex from 0 to 1.
        int lock = !atomic_cmpxchg(&(mutex), 0, 1);
        if (lock) {
            done = 1;
            // Release the lock
            atomic_xchg(&(mutex), 0);
        }
    }
    return;
}

On my Nvidia GPU, the kernel works fine. However, with the latest Intel OCL SDK (version 2.0.31360.31426 on x64 Linux, running on a Xeon CPU), the kernel seems to deadlock. Am I missing something here?

Max

Evgeny_F_Intel · 05-10-2012

Your kernel assumes that the work-items of a work-group execute in parallel. However, that is not always true.
The appropriate way to synchronize inside a work-group is the barrier() built-in.
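
As a rough illustration of barrier()-based synchronization, here is a minimal sketch. The kernel name serialized_section, the counter argument, and the one-dimensional work-group are assumptions made for this example, not code from this thread. A __local variable declared inside a kernel cannot carry an initializer, so one work-item sets it and a barrier makes that value visible before anyone else touches it. The work-items then take turns in the critical section, with a barrier after each turn instead of a spin lock.

```c
// Sketch only: serialized_section and counter are hypothetical names,
// and a 1-D work-group is assumed.
__kernel void serialized_section(__global int *counter) {
    __local int shared;                      // one copy per work-group
    const size_t lid = get_local_id(0);
    const size_t lsz = get_local_size(0);

    // __local variables cannot be initialized at declaration,
    // so a single work-item writes the starting value.
    if (lid == 0) {
        shared = 0;
    }
    barrier(CLK_LOCAL_MEM_FENCE);            // everyone now sees shared == 0

    // Work-items take turns. Every work-item executes every barrier() call,
    // which barrier() requires, and no work-item ever spins on another.
    for (size_t turn = 0; turn < lsz; ++turn) {
        if (lid == turn) {
            shared += 1;                     // critical section
        }
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    // After the loop, shared equals the work-group size.
    if (lid == 0) {
        counter[get_group_id(0)] = shared;
    }
}
```

This trades the lock for a fixed schedule, so it does not depend on how the implementation maps work-items to hardware threads. The cost is one barrier per work-item in the group, which is only reasonable for small critical sections.
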
Thanks,
Evgeny