
Intel opencl kernel execution

ADua0
Beginner

My OpenCL design has multiple kernels that execute one after the other. Even though the second kernel uses the results of the first and waits for it to finish, the time between the first kernel ending and the second kernel starting is very significant. Can anyone tell me what could cause this? I am using clWaitForEvents so that the second kernel starts only after the first kernel ends.

I have attached an example.

HRZ
Valued Contributor III

It depends on how long the gap is. There is certainly some kernel launch overhead. Moreover, when profiling, the profile results are dumped to disk between kernel executions, which further increases the gap. If you are saving the profiling results to network-attached storage, the gap will get even larger.

ADua0
Beginner

I am definitely profiling. By network-attached storage, do you mean the location where the profile.mon file is stored? Also, if I do not profile and instead call clGetEventProfilingInfo(event1, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL); and clGetEventProfilingInfo(event1, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL); to get the time (time_end - time_start) for each of the two kernels, should adding the two results, t1 and t2, give the total run time?

HRZ
Valued Contributor III

>So by network-attached storage, does it mean the storing of data to the profile.mon file?

Yes, if you are saving that file to network-attached storage or a slow hard disk, it will increase the gap between kernel executions. This issue is mentioned in Intel's documentation.

 

And yes, extracting the start and end time of each kernel from its associated event and summing the run times will give you the total run time of the kernel executions, but that will not include the gap between them. You can also use the gettimeofday or clock_gettime functions on Linux to measure the total run time, including the gap, from the host. Something like this:

 

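struct timeval start, end;      /* needs <sys/time.h> */
gettimeofday(&start, NULL);
enqueue_kernel_1;
clFinish(queue);                /* queue: your command queue; waits for kernel 1 */
enqueue_kernel_2;
clFinish(queue);                /* waits for kernel 2 */
gettimeofday(&end, NULL);

/* total host-side time in seconds, including the gap */
total_time_with_gap = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;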

 

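You can subtract the run time of each kernel execution measured with clGetEventProfilingInfo from the above value to get the length of the gap between kernel executions. Make sure both values are in the same unit before subtracting: clGetEventProfilingInfo reports timestamps in nanoseconds.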
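
The subtraction can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the helper names elapsed_ns and gap_ns are made up for this example, the host time is assumed to come from two clock_gettime(CLOCK_MONOTONIC, ...) readings taken around the two enqueue/clFinish pairs, and the kernel timestamps are assumed to be the nanosecond values returned by clGetEventProfilingInfo for CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds between two clock_gettime() readings. */
static int64_t elapsed_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Hypothetical helper: host-measured total minus the two kernel run times.
 * All arguments are in nanoseconds. The k*_start / k*_end values are assumed
 * to be the CL_PROFILING_COMMAND_START / CL_PROFILING_COMMAND_END timestamps
 * of the two kernel events. */
static int64_t gap_ns(int64_t host_total_ns,
                      uint64_t k1_start, uint64_t k1_end,
                      uint64_t k2_start, uint64_t k2_end)
{
    int64_t k1_run = (int64_t)(k1_end - k1_start);
    int64_t k2_run = (int64_t)(k2_end - k2_start);
    return host_total_ns - k1_run - k2_run;
}
```

Note that the value computed this way is everything the host saw that was not kernel run time, so it likely includes launch overhead and the time clFinish takes to return, not only the idle time between the two kernels.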

ADua0
Beginner

Okay, got it, thanks. Do you know how much time I should expect a kernel launch to take?

HRZ
Valued Contributor III

I have never measured that personally but it should be below 1 ms.
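
One possible way to check this on a particular board is to compare the four timestamps OpenCL records per event (CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END), which requires a command queue created with CL_QUEUE_PROFILING_ENABLE. The sketch below only shows the arithmetic. The struct event_times and the three function names are hypothetical, and it assumes the fields were filled in with clGetEventProfilingInfo and that both events ran on the same device, so their timestamps are comparable.

```c
#include <stdint.h>

/* Hypothetical container for the four profiling timestamps of one event,
 * in nanoseconds, as returned by clGetEventProfilingInfo. */
struct event_times {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
};

/* Time from the host enqueueing the command until it starts executing. */
static uint64_t launch_latency_ns(const struct event_times *e)
{
    return e->start - e->queued;
}

/* Time the kernel itself spent executing. */
static uint64_t run_time_ns(const struct event_times *e)
{
    return e->end - e->start;
}

/* Time between the first kernel ending and the second one starting. */
static uint64_t inter_kernel_gap_ns(const struct event_times *first,
                                    const struct event_times *second)
{
    return second->start - first->end;
}
```

If the second kernel is enqueued only after the host has waited for the first, the inter-kernel gap will contain the second kernel's launch latency, which is one way to see how much of the gap is launch overhead and how much is something else, such as profile data being written out.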
