Compiler Warning: Threads may reach barrier out of order

Altera_Forum · 08-16-2017

Hi all,

I am writing a matrix-vector multiplication kernel that uses two barrier(CLK_LOCAL_MEM_FENCE) calls. The code works on FPGAs.

However, every time I compile the kernel I get this warning: Compiler Warning: Threads may reach barrier out of order - allowing at most 2 concurrent workgroups.

I found that if I delete one for loop from my kernel, the warning disappears. And if I add a similar for loop to other kernels, such as a vector reduction, the warning appears there too. The for loop looks like this:

for (uint x = get_local_id(0); x < width; x += get_local_size(0)) {
    running_sum += row[x] * vector[x];
}

The loop body doesn't matter: the warning still appears even when the loop is empty.

So how can I solve this problem? :)

Altera_Forum · 08-17-2017

The header of your loop depends on the local ID, so each work-item in the work-group may run the loop for a different number of iterations. As a result, the compiler cannot determine at compile time the order in which the work-items exit the loop and reach the barrier after it, and hence it gives you that warning. You don't have to fix this warning, but if you want to, write your code so that every work-item does the same number of operations before reaching the next barrier.
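
One way to do that for the loop in question is to give every work-item the same trip count, ceil(width / local size), and guard only the loop body. Below is a minimal sketch in plain C, so it can be compiled and checked on a host; it is not the original kernel. The function names uniform_trips and partial_sum are made up for illustration, and the lid and lsize parameters stand in for get_local_id(0) and get_local_size(0). I have not checked whether this exact form silences the warning in the Altera compiler.

```c
/* Hypothetical helper: number of iterations EVERY work-item performs.
   Depends only on values that are the same across the work-group.
   Assumes width + lsize does not overflow unsigned. */
static unsigned uniform_trips(unsigned width, unsigned lsize)
{
    return (width + lsize - 1u) / lsize;   /* ceil(width / lsize) */
}

/* Hypothetical stand-in for the per-work-item part of the kernel.
   lid   ~ get_local_id(0)
   lsize ~ get_local_size(0) */
static float partial_sum(const float *row, const float *vec,
                         unsigned width, unsigned lid, unsigned lsize)
{
    float running_sum = 0.0f;
    unsigned trips = uniform_trips(width, lsize);

    /* The loop header no longer mentions lid, so all work-items
       execute exactly `trips` iterations. */
    for (unsigned i = 0; i < trips; ++i) {
        unsigned x = i * lsize + lid;      /* same elements as the strided loop */
        if (x < width) {                   /* only the body is guarded */
            running_sum += row[x] * vec[x];
        }
    }
    return running_sum;
}
```

In the kernel itself, the same loop would use get_local_id(0) and get_local_size(0) directly, with the barrier placed after the loop as before. Each work-item still touches the same elements as in the original strided loop; only the iteration count becomes uniform.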