Altera_Forum
Honored Contributor I
1,066 Views

nested loop error in OpenCL hardware run

Hello, 

 

My OpenCL project produced wrong output when run on hardware, and after several days of debugging I finally found the reason. The index in one of my nested loops gets mixed up, and I really don't know why this would ever happen. I have a nested loop structure like this: 

 

****----------------------------------------------------------------------**** 

 

__kernel void some_kernel() { 
    ........ 
    ........ 
    for (uint m = 0; m < m_bound; m++) { 
        for (uint n = 0; n < n_bound; n++) { 
            ........ 
            printf("m = %u, n = %u", m, n);  /* %u because m and n are uint */ 
            ........ 
            ........ 

****-------------------------------------------------------------------------**** 

 

Of course the actual code is much more complicated than this simplified version. At the beginning of this nested loop structure, m is incremented before n reaches its loop bound. The output looks like this: 

 

****-------------------------------------------------------------------------**** 

 

m = 0, n = 0 
m = 1, n = 0 
m = 0, n = 1 
m = 1, n = 1 
m = 0, n = 2 
m = 1, n = 2 
............. 
............. 

 

****-----------------------------------------------------------------------**** 

 

This only happens in the first few iterations (only when m is 0 or 1). It is extremely weird, since the kernel worked perfectly in simulation. 

 

Any advice would be greatly appreciated!! 

 

Best regards, 

Lancer
3 Replies
Altera_Forum
Honored Contributor I
66 Views

Assuming this is a single work-item kernel, the loop pipelining performed by aoc will launch the next iteration of the outer loop, and subsequently of its inner loop, as soon as it can when there are no dependencies. That seems to be what is happening here.
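
If iterations of the two loops overlap as described above, the order of the printf lines may only reflect when each iteration happened to be issued, not a wrong value of m or n. That is an assumption about this particular kernel, not something the thread confirms. A minimal sketch of one way to check it: have each iteration store its (m, n) pair in a slot computed from the indices themselves, then verify the buffer afterwards, so the check does not depend on print order. The code below is plain C so it can be tried on a host; `record_iterations` and `trace_is_consistent` are hypothetical names, and in the real kernel `trace` would be a `__global` buffer read back by the host.

```c
/* Hypothetical helper: same loop nest as in the question, but each
   iteration stores its indices instead of printing them. The slot is
   derived from (m, n), so the result is independent of execution order. */
static void record_iterations(unsigned int *trace,
                              unsigned int m_bound,
                              unsigned int n_bound) {
    for (unsigned int m = 0; m < m_bound; m++) {
        for (unsigned int n = 0; n < n_bound; n++) {
            unsigned int slot = m * n_bound + n;
            trace[2 * slot]     = m;  /* outer index seen by this iteration */
            trace[2 * slot + 1] = n;  /* inner index seen by this iteration */
        }
    }
}

/* Hypothetical host-side check: returns 1 if every slot holds the
   (m, n) pair that belongs to it, 0 otherwise. */
static int trace_is_consistent(const unsigned int *trace,
                               unsigned int m_bound,
                               unsigned int n_bound) {
    for (unsigned int m = 0; m < m_bound; m++) {
        for (unsigned int n = 0; n < n_bound; n++) {
            unsigned int slot = m * n_bound + n;
            if (trace[2 * slot] != m || trace[2 * slot + 1] != n) {
                return 0;
            }
        }
    }
    return 1;
}
```

The buffer needs room for 2 * m_bound * n_bound entries. If the check passes on hardware while the printf lines are still interleaved, the indices themselves are probably fine and only the print order differs; if it fails, the problem is in the computed values.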

Altera_Forum
Honored Contributor I
66 Views

Thanks for your reply! Do you have any idea how I can disable this pipelining?

Altera_Forum
Honored Contributor I
66 Views

I also believe I do have a loop dependency: in the innermost loop (controlled by n), I assign m to a variable.
