Intel® Quartus® Prime Software

Hardware not working as intended

Altera_Forum

Hi all, 

 

I have been stuck on a problem for many days and I am unable to debug it.

Any help is appreciated. 

 

I have an OpenCL FPGA kernel which is part of a larger program.

It has something like

if (tid < numVertices) {
    // do some floating-point ops
}

 

In the host code, I want to call the kernel in a loop after setting its arguments:

for (int i = 0; i < 100; i++) {
    // some operations

    err = clEnqueueNDRangeKernel(command_queue, kernel[3], 1, NULL,
                                 workSize_4, localWorkSize_4, 0, NULL, &clContext.events[1]);
    clFinish(command_queue);
    ExitError(checkErr(err, "Failed to execute kernel4!"));

    // some more operations
}

 

--------------- 

Everything works fine (as intended) in emulation and also on CPU.  

The problem appears when I execute it on an Altera FPGA (both Intel HARP and Nallatech) with the respective generated .aocx files.

When numVertices is even (a multiple of 2), the code exits normally and the output is as intended, but when numVertices is odd, the process enters a sleep state and stays there forever. No error is shown.

When I try to debug using GDB, SIG INT 44 is shown. I tried to use pstack but was not able to find the exact problem. 

 

 

Thank you. 

 

Regards, 

Sharat 

 

 

 

The kernel (not yet optimized) which fails to execute is shown below:

----------------------------

#define KNOB_COMPUTE_UNITS_1 1
#define KNOB_SIMD_1 1
#define KNOB_NUM_WORK_ITEMS_1 256

/* kernel 4 */
#ifdef ALTERA_CL
__attribute__ ((reqd_work_group_size(KNOB_NUM_WORK_ITEMS_1,1,1)))
__attribute__ ((num_simd_work_items(KNOB_SIMD_1)))
__attribute__ ((num_compute_units(KNOB_COMPUTE_UNITS_1)))
#endif
__kernel void kernel4(__global double* restrict newpr,
                      __global double* restrict w,
                      unsigned int numVertices  // private variable
                      ) {
    __private size_t tid = get_global_id(0);

    // Work-items beyond numVertices do nothing.
    if (tid < numVertices) {
        newpr[tid] += w[0] * (1/(double)numVertices);
    }
}
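The guard on tid suggests that the global work size is padded beyond numVertices. The post does not show how workSize_4 is computed, so the following is only a sketch under that assumption. Because the kernel requires work-groups of exactly 256 work-items, OpenCL 1.x expects the global size passed to clEnqueueNDRangeKernel to be a multiple of 256. The helper name round_up_global_size is hypothetical and is not part of the original host code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the original host code): returns the
 * smallest multiple of local_size that is greater than or equal to n.
 */
static size_t round_up_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0); /* avoid a modulo by zero */

    size_t remainder = n % local_size;
    if (remainder == 0) {
        return n; /* already a whole number of work-groups */
    }
    return n + (local_size - remainder);
}
```

With a local size of 256, an odd vertex count such as 1001 would be padded to 1024 work-items, and the guard in the kernel would leave the last 23 idle. It may be worth checking that the host computes workSize_4 this way for both even and odd values of numVertices.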
Altera_Forum

Comparing "size_t" with "unsigned int" could cause problems, since the size of "size_t" is implementation-dependent. If that were the cause, though, I would expect it to show up in emulation as well. Either way, you can try declaring both variables with the same type, or casting one to the other, and see if that fixes your problem. Still, I have never seen a standard kernel with no channels deadlock. You can also try different versions of Altera's compiler to see if you get different behavior. If you can still reproduce the problem with the latest version of the compiler, I recommend reporting it to Altera; it might well be a compiler bug.
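As a rough illustration of the same-type suggestion, here is a plain-C stand-in for the kernel body, with one loop iteration per work-item and both sides of the guard declared as unsigned int. This is only a host-side sketch for checking the expected output with an odd numVertices; it is not the FPGA kernel. The function name kernel4_reference and the global_size parameter are assumptions introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical host-side reference for kernel4 (name and global_size
 * parameter are assumptions). Each loop iteration plays the role of one
 * work-item; global_size may be larger than numVertices.
 */
static void kernel4_reference(double *newpr, const double *w,
                              unsigned int numVertices,
                              unsigned int global_size)
{
    assert(newpr != NULL && w != NULL);

    for (unsigned int tid = 0; tid < global_size; tid++) {
        /* tid and numVertices have the same type, so no conversion occurs. */
        if (tid < numVertices) {
            newpr[tid] += w[0] * (1.0 / (double)numVertices);
        }
    }
}
```

In the kernel itself, the equivalent change would be to declare tid as unsigned int and cast the result of get_global_id(0), which returns size_t.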

Altera_Forum

Thanks HRZ, 

 

I changed the size_t data type, but the behavior didn't change. I am not sure where the problem lies. I have not tried other versions of the Altera compiler yet, but I will.

 

Thanks