My design has multiple kernels and I want to see which kernel uses the most resources. Is there any way I can do that after the kernel is compiled? I need this because the initial report and the final utilization show different resource usage.
Using Quartus v18.1 and later (and maybe also v18.0 but I didn't check), when placement and routing is done, the summary page of the HTML report is updated with post-fit area utilization per kernel; you can refer to those numbers. However, even though the initial area estimation is inaccurate, the order of area usage by different kernels will likely not change after placement and routing.
I am using 18.0 and the reports do not get updated. I see a difference in my RAM blocks (increased by 20%) and logic utilization (increased by 10%), which I checked from Acl_quartus_report.txt, so I was wondering which kernel is causing the increase in utilization. Any idea about that?
That doesn't mean a specific kernel among your kernels is causing the difference. The area estimation has inaccuracies associated with it as a whole, and the actual area utilization of every kernel in your design will always differ from the estimation. The area estimation is only supposed to give the user some idea of whether their kernel will fit on the target device, and what resources each part of the design is expected to use. There is little point in trying to find the reason for the differences between the area estimation and the actual post-place-and-route utilization, and the actual area utilization in your case being higher than the estimation does not mean there is something wrong with your design that needs fixing.
Okay, I understand that after placement and routing as a whole, logic utilization might be higher, but the RAM block utilization is what confuses me. I thought RAM blocks are used when you have a local memory buffer or DRAM ports (I might be wrong here), so why is there such a difference in RAM block utilization? Can you clarify that?
I believe Block RAMs can also be used to implement FIFOs throughout the pipeline, for feedback paths and for pipeline stages that only buffer data. Moreover, the estimation might assume certain buffers will be implemented using registers, while the mapper or placer might implement them with Block RAMs instead, due to routing or placement constraints or to improve timing. I tend to encounter Block RAM underestimation by the report more frequently with large designs that use a lot of FPGA area, which points to extra Block RAM usage caused by placement/routing constraints.
The line-by-line resource estimation is already in the HTML report; that should be enough to understand which operations the compiler is trying to implement with Block RAMs. If you want post-place-and-route data, you can check the "top.fit.place.rpt" file, where the resource utilization of every component of the design is listed; however, unless you have complete knowledge of the OpenCL compiler and the way the system is created, along with the thousands of modules and IP cores involved in it, it would be next to impossible to map this data back to the OpenCL kernel.
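If you just want to quantify the gap between the estimate and the final numbers, you could script a quick comparison yourself. The sketch below is only an illustration: it assumes Acl_quartus_report.txt contains lines of the form "Resource: used / total ( pct % )" (the exact wording and layout may differ between SDK versions, so adjust the pattern to whatever your file actually contains).

```python
import re

# Assumed line format, e.g. "Logic utilization: 123,456 / 427,200 ( 29 % )".
# Adjust this pattern if your Acl_quartus_report.txt looks different.
LINE_RE = re.compile(
    r"^\s*(?P<name>[\w .-]+?)\s*:\s*(?P<used>[\d,]+)\s*/\s*(?P<total>[\d,]+)"
)

def parse_report(text):
    """Return {resource_name: (used, total)} for every line matching LINE_RE."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            used = int(m.group("used").replace(",", ""))
            total = int(m.group("total").replace(",", ""))
            result[m.group("name")] = (used, total)
    return result

# Hypothetical sample in the assumed format; in practice you would read
# the initial and final report files and diff the two dictionaries.
sample = """\
Logic utilization: 123,456 / 427,200 ( 29 % )
RAM blocks: 1,200 / 2,713 ( 44 % )
"""

for name, (used, total) in parse_report(sample).items():
    print(f"{name}: {used}/{total} = {100 * used / total:.1f}%")
```

Running the same parser over the estimated and the post-fit report and subtracting the numbers gives you the delta per resource type, which is about as far as you can take this without mapping Quartus hierarchy names back to kernels.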
You can view the resource utilization from the report.html file.
Please check the document below: