My FPGA design currently runs at about 240 MHz, and we want to improve that.
We added ‘-max-fanout=1024 --fmax 300’ to the aoc command, but it had no effect on the frequency. The paper ‘Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network’ notes: ‘To achieve a higher working frequency, we use register duplication to limit the maximum fan-out to 100. We found that the paths with the highest fan-out are the control signals, which are generated by the dispatcher and connected to all of the PEs.’ So, how can we use register duplication to limit the maximum fan-out with OpenCL, and how can we find the paths with the highest fan-out? Thanks a lot for your help!
2 Replies
240 MHz is relatively good for an OpenCL design with mid to high area usage. The paper you are referring to is NOT an OpenCL design. They did the design in SystemVerilog and then wrapped it in an OpenCL kernel as an HDL library. They set the maximum fan-out and other settings in Quartus rather than passing them to the OpenCL compiler. I don't think the OpenCL compiler even supports overriding the default value for maximum fan-out.
For single work-item kernels, assuming that applies to your case, you can use loop collapse and exit condition optimization to raise the operating frequency above 300 MHz, as outlined in this paper: https://dl.acm.org/citation.cfm?id=3174248
For NDRange kernels there is pretty much no user control over the critical path, and with mid to high area usage it is next to impossible to get above 260 MHz. You might be able to gain another 10-20 MHz using seed and fmax sweeping, but you have to compile many variations (10-20).
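To make the loop collapse and exit condition optimization mentioned above a bit more concrete, here is a rough sketch in plain C rather than an actual OpenCL kernel. It is not code from the linked paper; the function names `sum_nested` and `sum_collapsed` and the summation workload are made up for illustration. The idea, as I understand it, is to replace a loop nest with a single loop whose only exit condition is one counter compared against a trip count computed up front, while the original indices are updated by hand inside the body.

```c
#include <stddef.h>

/* Baseline: a doubly nested loop. Each loop level carries its own
 * index comparison as an exit condition. */
long sum_nested(const int *data, int rows, int cols) {
    long acc = 0;
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            acc += data[(size_t)y * cols + x];
        }
    }
    return acc;
}

/* Collapsed: one loop, one exit condition on a single counter.
 * x and y are no longer part of the loop exit test; they are
 * updated manually at the end of every iteration. */
long sum_collapsed(const int *data, int rows, int cols) {
    long acc = 0;
    long total = (long)rows * cols;   /* trip count, computed once */
    int x = 0, y = 0;
    for (long i = 0; i < total; i++) {
        acc += data[(size_t)y * cols + x];
        x++;
        if (x == cols) {              /* wrap to the next row */
            x = 0;
            y++;
        }
    }
    return acc;
}
```

Both functions return the same result; the second form is what you would port into a single work-item kernel. Whether it actually raises fmax for a given design depends on where the critical path is, so treat it as something to try, not a guarantee.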
Thank you very much for your help!
