Hi,

I have some basic questions about the profiling metrics reported after kernel execution. I have already looked at the Intel FPGA documentation on profiling metrics, but a few specific points are still unclear to me.

1) What exactly is the stall percentage, and which factors can cause stalls in the pipeline?
2) Which factors affect occupancy? In cases with 0% stall and occupancy below 100%, can we conclude that there is overhead from scheduling threads on the compute units?
3) The profiler reports a kernel execution time that is completely different from both the OpenCL event timing and the wall-clock execution time. What is the reason for that?

Thanks,
Saman
Please check "Intel FPGA SDK for OpenCL Best Practices Guide, Section 4, Profiling Your Kernel to Identify Performance Bottlenecks". The answers to all of your questions, and more, are covered in that section.