ADua0
Beginner
509 Views

Intel OpenCL kernel execution

My OpenCL design has multiple kernels that execute one after the other. Even though the second kernel uses the results of the first and waits for it to finish, the time between the first kernel ending and the second kernel starting is very significant. Can anyone tell me what could be the reason? I am using clWaitForEvents so that the second kernel starts only after the first kernel ends.

I have attached an example.

5 Replies
HRZ
Valued Contributor II
101 Views

It depends on how long the gap is. There is certainly a kernel launch overhead. Moreover, when profiling, the profile results are dumped to disk between kernel executions, which further increases the gap. If you are saving the profiling results to network-attached storage, the gap will get even larger.

ADua0
Beginner
101 Views

I am definitely profiling, so by network-attached storage do you mean the location where the profile.mon file is being written? Also, if I do not profile and instead use

clGetEventProfilingInfo(event1, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL);
clGetEventProfilingInfo(event1, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL);

to get the time (time_end - time_start) for each of the two kernels, should adding the two times t1 and t2 give the total run time?

HRZ
Valued Contributor II
101 Views

>so by network attach storage does it mean that it storing of data to profile.mon file

Yes, if you are saving that file to a network-attached storage or a slow hard disk, it will increase the gap between kernel executions. This issue is mentioned in Intel's documents.

 

And yes, extracting start and end time of each kernel from its associated event and summing the run time of each will give you total run time of kernel executions, but that will not include the gap between the kernel executions. You can also use gettimeofday or clock_gettime functions on Linux to measure total run time including the gap from the host. Something like this:

 

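/* start and end are struct timeval; queue is the command queue in use */
gettimeofday(&start, NULL);

enqueue_kernel_1;
clFinish(queue);

enqueue_kernel_2;
clFinish(queue);

gettimeofday(&end, NULL);

/* elapsed host time in seconds */
total_time_with_gap = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;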

 

You can subtract the run time of each kernel execution measured with clGetEventProfilingInfo from the above value to get the length of the gap between kernel executions.
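The subtraction described above can be sketched in plain C. This is only a minimal illustration, not code from the thread: the helper names host_now_ns, kernel_time_ns and gap_ns are hypothetical, and the start/end arguments stand in for the cl_ulong nanosecond timestamps that clGetEventProfilingInfo writes for CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END. It assumes a POSIX system where clock_gettime with CLOCK_MONOTONIC is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Host wall-clock time in nanoseconds, read from a monotonic clock.
   Call it once before enqueueing kernel 1 and once after the final
   clFinish, then take the difference to get the host-side total. */
static uint64_t host_now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Device-side run time of one kernel. start_ns and end_ns are assumed
   to be the profiling timestamps of that kernel's event, in ns. */
static uint64_t kernel_time_ns(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    return end_ns - start_ns;
}

/* Time inside the host-measured region that was not spent executing
   either kernel. Clamped to 0 in case the two clocks disagree. */
static uint64_t gap_ns(uint64_t host_total_ns, uint64_t k1_ns, uint64_t k2_ns)
{
    uint64_t busy_ns = k1_ns + k2_ns;
    return host_total_ns > busy_ns ? host_total_ns - busy_ns : 0;
}
```

Note that the value computed this way covers everything in the timed region that is not kernel execution, so it includes the launch overhead before the first kernel and the completion wait after the second one, not only the pause between the two.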

ADua0
Beginner
101 Views

Okay, got it, thanks. Do you know how much time I should expect a kernel launch to take?

HRZ
Valued Contributor II
101 Views

I have never measured that personally but it should be below 1 ms.