NSriv2
Novice

How to determine the critical path determining Fmax in aoc 17.1?

Hi,

 

I looked through the Intel OpenCL for FPGA documentation, but it doesn't say which critical paths in a kernel determine the clock frequency. Is there a way to find this information?

 

Thanks,

Nitish

2 Replies
HRZ
Valued Contributor II

There is no straightforward way to find or optimize the critical path from the OpenCL kernel. You can recompile the project generated by the OpenCL compiler directly using Quartus and check the timing report, but mapping the HDL signals back to the OpenCL kernel will be next to impossible. Another thing you can do is to use the -fmax switch to increase the target operating frequency until the II of your loop goes up. In this case, you can find the path that is resulting in the increase in II in the report, which is basically the critical path, but this information is not necessarily accurate. Newer versions of the compiler (18+) give more accurate information in this case.

 

Based on my personal experience, the critical path of the kernel is usually in only a few places:

 

NDRange: The critical path for NDRange kernels nearly always falls on the 2x clock used for Block RAM double-pumping. Since the kernel clock runs at half the 2x clock, and the maximum operating frequency of the Block RAMs is 500-550 MHz depending on speed grade, this effectively limits your operating frequency to 200-260 MHz on Arria 10 depending on area utilization. If your kernel doesn’t use double-pumping, you can probably reach 350+ MHz, and Fmax will be limited either by placement and routing restrictions or by the OpenCL BSP.

 

Single work-item: If you have loop-carried dependencies, i.e. you are reading some data every loop iteration that was updated in the previous loop iteration, the feedback path will limit your operating frequency to something between 150 and 220 MHz on Arria 10. The operating frequency cannot go higher in this case unless you sacrifice the loop II.
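To make the term concrete, here is a minimal sketch of a loop-carried dependency, written in plain C rather than as an OpenCL kernel. The function name `running_sum` and the arithmetic are made up for illustration; the point is only that each iteration reads `acc`, which the previous iteration wrote, so in hardware the update has to complete in a single cycle to keep an II of one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: a decaying running sum.
 * `acc` is read and written in every iteration, so iteration i
 * cannot start its update until iteration i-1 has produced `acc`.
 * That read-after-write chain is the feedback path. */
float running_sum(const float *in, size_t n)
{
    assert(in != NULL || n == 0);

    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* acc(i) depends on acc(i-1): loop-carried dependency */
        acc = acc * 0.5f + in[i];
    }
    return acc;
}
```

For the input `{1, 2, 4}` this returns `5.25`. The more work sits between reading and writing `acc`, the longer the feedback path becomes.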

 

If there are no loop-carried dependencies in the kernel, the critical path will be the chain of updates and comparisons on the loop variables for the deepest loop nest that has an II of one. Manual loop flattening, together with what I call “loop exit condition optimization”, can help achieve over 300 MHz on Arria 10 in such cases, even with high area utilization. Still, even in this case, the deeper the original loop nest was, the lower your Fmax will be. Take a look at Sections 3.2.4.3 and 3.2.4.4 in the following document for more info on this optimization:

 

https://arxiv.org/ftp/arxiv/papers/1810/1810.09773.pdf

 

And associated example code can be found here:

 

https://github.com/fpga-opencl-benchmarks/rodinia_fpga/blob/master/opencl/hotspot/hotspot_kernel_v5....
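As a rough illustration of the flattening and exit condition idea described above, here is a sketch in plain C. It is not taken from the linked kernel or the paper; the function names `fill_nested` and `fill_flattened` and the loop body are hypothetical, and it reflects one reading of the technique: collapse the nest into a single loop, update `x` and `y` by hand, and test the exit condition on a separate counter so that the exit check does not sit at the end of the `x` to `y` update chain.

```c
#include <assert.h>

/* Reference version: the original two-level loop nest. */
void fill_nested(const int *in, int *out, int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            out[y * cols + x] = in[y * cols + x] + x + y;
        }
    }
}

/* Flattened version: one loop, manually maintained x and y.
 * The exit test uses only `index`, which is independent of the
 * x/y updates, instead of a comparison that depends on y. */
void fill_flattened(const int *in, int *out, int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    const long total = (long)rows * cols; /* computed once, before the loop */
    int x = 0;
    int y = 0;

    for (long index = 0; index != total; index++) {
        out[y * cols + x] = in[y * cols + x] + x + y;

        /* Manual loop variable update that replaces the nest. */
        x++;
        if (x == cols) {
            x = 0;
            y++;
        }
    }
}
```

Both functions write the same output for the same input. Whether the flattened form actually improves Fmax depends on the compiler version and the rest of the kernel, so treat it as a starting point to check against the report.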

 

DongWang-BJTU
New Contributor I

You need to use Quartus to open the generated project and report the timing for the critical path.
