Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Kernel execution time

Vishvas
Novice

In my program, the measured execution time of the kernel is longer than the measured execution time of the entire program.

In the output screen (attached screenshot), 'Execution time of kernel' is measured using 'clGetEventProfilingInfo',

and 'Execution time' is measured using clock_t (clock()), similar to this link; it measures the total time taken by the main function.

This issue occurs only when I run my code on DevCloud. If I run it on my PC, then Execution time > Kernel Execution time.

Why is this happening?

 

 

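int main()
{
	.
	.
	.
	start = clock();	// start of the 'Execution time' measurement (clock_t)
	.
	.
	err = clEnqueueNDRangeKernel(queue, multiply_ker, 1, NULL, &global, &local, 0, NULL, &event);
	clWaitForEvents(1, &event);	// block until the kernel has finished
	clFinish(queue);
	.
	.
	.

	end = clock();	// end of the 'Execution time' measurement

}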

 

ChithraJ_Intel
Moderator

Hi Vishvas,


Thanks for reaching out to us.

Could you please let us know which node you are running the application on? Is it one of the FPGA nodes of DevCloud?


Regards,

Chithra


Vishvas
Novice

The node is s001-n139.

The device is the Arria 10 platform, as shown in the screenshot.
I compiled the OpenCL code into RTL and ran it on the FPGA.

The issue I mentioned only occurs on the FPGA. If I simulate it on the CPU, the timings are as expected.

ChithraJ_Intel
Moderator

Hi Vishvas,


Thanks for the information. Since your issue is related to FPGA, we are moving this query to the FPGA forum for a faster response.


Regards,

Chithra


HRZ
Valued Contributor II

It is possible that the OS is reading the CPU clock incorrectly and setting the wrong value for "CLOCKS_PER_SEC". You can try with the high-precision "clock_gettime" function to see if it makes a difference. You can find the function information here:

https://linux.die.net/man/3/clock_gettime

 

And an example implementation here:

https://github.com/zohourih/FPGAMemBench/blob/master/common/timer.h
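
As a rough illustration of the suggestion above, here is a minimal sketch of a wall-clock timer built on clock_gettime, next to a clock()-based one for comparison. The helper names (wall_seconds, cpu_seconds, blocking_wait_ms) are hypothetical and are not taken from the linked timer.h. One related point that may be worth checking: on Linux, clock() reports processor time used by the process rather than elapsed time, so time the host spends blocked (for example inside clWaitForEvents) may not be fully counted. The sleep below is only a stand-in for such a blocking wait, not an OpenCL call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime and nanosleep */
#endif
#include <time.h>

/* Elapsed wall-clock time in seconds, read from a monotonic clock. */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Processor time in seconds used by this process, as reported by clock(). */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Hypothetical stand-in for a blocking call such as clWaitForEvents:
 * the process sleeps and consumes almost no processor time. */
static void blocking_wait_ms(long ms)
{
    struct timespec req;
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&req, NULL);
}
```

In a program like the one in the question, wall_seconds() would be called where start = clock() and end = clock() appear, and the two values subtracted. Timing blocking_wait_ms(200) with both helpers should show a wall-clock difference of about 0.2 s and a much smaller cpu_seconds() difference.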

AnilErinch_A_Intel

Hi,

Please let us know whether the issue is resolved by using clock_gettime. If not, we can look into further possibilities.

Thanks and Regards

Anil


Vishvas
Novice

I had some issues implementing it, but I have been able to get it working now.

Thanks a lot for your help!
