Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Kernel Execution time significantly higher than Kernel profile time

Honored Contributor II

I am seeing unexpected behavior when using clWaitForEvents or clFinish. The following is the structure of my host code during kernel launch:


for (i = 0; i < 10; i++) {
    // set kernel args

    clEnqueueNDRangeKernel(queue1, kernel1, ....., &event1);
    clEnqueueTask(queue2, kernel2, ............, &event2);

    start = time();
    clFinish(queue3);  // or clWaitForEvents(..., &event3);
    end = time();

    run_time = end - start;

    event1_time += getKernelStartEndTime(event1);
    event2_time += getKernelStartEndTime(event2);
    event3_time += getKernelStartEndTime(event3);

    // release events...
}


In every iteration, the measured run time (8 ms) is about twice event3_time (4 ms). I have also tried clFinish() as recommended, but I get the same results. This prevents me from launching the kernels of the next iteration immediately after the previous one finishes; there is a large time gap between the kernel launches of consecutive iterations. Is such high overhead common?



Am I missing anything while launching kernels?
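
For reference, the comparison I am making boils down to the arithmetic sketched below. This is only a minimal sketch, not my actual host code: the helper names event_duration_ms and host_overhead_ms are hypothetical, and it assumes the two timestamps are the nanosecond values returned by clGetEventProfilingInfo for CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END.

```c
#include <assert.h>
#include <stdint.h>

/* Device-side duration of one command in milliseconds.
 * Assumption: start_ns and end_ns are the nanosecond timestamps an
 * OpenCL profiling query reports for the command's start and end. */
double event_duration_ms(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);  /* end must not precede start */
    return (double)(end_ns - start_ns) / 1.0e6;
}

/* Host-side wall-clock time that the event itself does not account for.
 * wall_ms is the time measured around clFinish / clWaitForEvents. */
double host_overhead_ms(double wall_ms, double event_ms)
{
    return wall_ms - event_ms;
}
```

With the numbers above, a wall-clock time of 8 ms against an event time of 4 ms leaves 4 ms per iteration that is spent outside the kernel itself.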
1 Reply
Honored Contributor II

My mistake: I had not removed the -profile option when compiling.