Application Acceleration With FPGAs
Programmable Acceleration Cards (PACs), DCP, FPGA AI Suite, Software Stack, and Reference Designs

Intel OpenCL kernel compile time

ADua0
Beginner

I am trying to compile one of my kernels, but it has been stuck at routing for more than 10 hours, even though I have not utilized even 50% of the FPGA. I have attached my resource utilization report image, and here is the message I got after 10 hours. Can anyone tell me what the possible reason could be?

7 Replies
Kenny_Tan
Moderator
This is usually caused by high utilization in a particular area of the chip, which you can check in the Chip Planner. A design will face this if its frequency requirement is high. Can you try following the optimizations described in this document: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/opencl-sdk/archives/aocl-best-practices-guide-15.0.pdf
ADua0
Beginner

Hi, thanks for the reply. Can you tell me which page of the above document you are referring to?

Kenny_Tan
Moderator

I would suggest you read all of it. Also, try letting the compilation run longer to see if the design fits.

HRZ
Valued Contributor III

Judging from the report, you have significant routing congestion in your design, which is lengthening the routing process, and there is no guarantee that routing will ever succeed. Make sure you are using the latest BSP and compiler that support your board, and try to simplify your design. Specifically, try to avoid barrel shifters and large register-based buffers. Also, this is the latest version of the Best Practices Guide:

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/opencl-sdk/aocl-best-practices-guide.pdf

P.S. Remember that the area estimation is not necessarily accurate and actual utilization could be a lot higher (or lower) than the estimation.

ADua0
Beginner

I am using BSP version 18.0. I have a lot of registers, and I also have some register structures such as "float __attribute__((register)) sum[4][2][16]". I do not understand how this 3D register array is possible; do you know what it means? Could it also be a reason for the long compile time?

HRZ
Valued Contributor III

That buffer is not too large. However, if you have on-chip buffers that the compiler would implement using Block RAMs by default because of their size, and you force a register implementation by adding the "__attribute__((register))" attribute, then that could indeed cause routing congestion problems. Large register-based buffers with a lot of reads/writes result in high fan-in/fan-out and routing congestion in the area where the buffer is implemented.
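
To put a number on the size of the buffer quoted above, here is a minimal sketch in plain C (not OpenCL kernel code). It only does the arithmetic for the sum[4][2][16] declaration and shows that a 3D array is just a flat set of elements addressed by three indices. The function names are hypothetical, and the "one flip-flop per bit" figure is a simplifying assumption; what the compiler actually builds may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions copied from the declaration quoted in the thread:
   float __attribute__((register)) sum[4][2][16];              */
enum { DIM0 = 4, DIM1 = 2, DIM2 = 16 };

/* An OpenCL float is 32 bits wide. */
enum { FLOAT_BITS = 32 };

/* Total number of float elements in the 3D array. */
size_t sum_elements(void)
{
    return (size_t)DIM0 * DIM1 * DIM2;
}

/* Storage bits needed if every element lives in registers
   (assumption: one flip-flop per bit, ignoring any duplication
   or optimization the compiler may apply).                    */
size_t sum_register_bits(void)
{
    return sum_elements() * FLOAT_BITS;
}

/* Row-major position of sum[i][j][k] in the equivalent flat array.
   The three indices only select one element out of the flat set.  */
size_t flat_index(size_t i, size_t j, size_t k)
{
    assert(i < DIM0 && j < DIM1 && k < DIM2);
    return (i * DIM1 + j) * DIM2 + k;
}
```

Here sum_elements() gives 128 floats and sum_register_bits() gives 4096 bits, which is small next to the register count of a typical FPGA. This fits the point above that the buffer's size alone is unlikely to be the issue; the number of read/write ports on it matters more for congestion.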

Kenny_Tan
Moderator
Are you still working on this?