Application Acceleration With FPGAs
Programmable Acceleration Cards (PACs), DCP, DLA, Software Stack, and Reference Designs
Register implementation in OpenCL

ADua0

I am using on-chip FPGA registers in my implementation via the register attribute, as shown below:

float __attribute__((register)) sum1[48];

With this attribute the compiler does infer registers for sum1, but it infers a single register with depth 1 and width 1536 bits (48 × 32) rather than the 48 separate registers I expected. Is a single register the correct behavior? If not, how can I tell the compiler to generate 48 registers instead of one?

HRZ

This is normal behavior: you can access all of those registers as sum1[i] in a loop, so they essentially share one address space. If you want 48 separate 32-bit registers, you have to define 48 variables with different names, but then you lose the ability to index them with a loop variable. In the end this makes little difference, because the mapper decides how to map the buffer(s) to FPGA resources; depending on how you access the sum1 buffer, it might still be implemented as 48 separate registers.
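To make the trade-off concrete, here is a minimal sketch in plain C (so it compiles on a host) that contrasts the two declaration styles. It is only an illustration, not what the OpenCL compiler generates: `REG_ATTR`, `N_ACC`, `sum_with_array`, and `sum_with_scalars` are hypothetical names, and the accumulator count is reduced from 48 to 4 for readability. In an Intel FPGA OpenCL kernel, `REG_ATTR` would stand for the `__attribute__((register))` from the question.

```c
#include <stddef.h>

/* Hypothetical macro: expands to nothing on a host C compiler.
   In the OpenCL kernel from the question it would be
   __attribute__((register)). */
#define REG_ATTR

/* 48 in the question; kept small here so the scalar form stays readable. */
#define N_ACC 4

/* Form 1: one array. Elements can be selected with a loop variable,
   which is why the elements share a single address space. */
float sum_with_array(const float *in, size_t n)
{
    float REG_ATTR sum1[N_ACC];
    for (int i = 0; i < N_ACC; i++)
        sum1[i] = 0.0f;

    for (size_t k = 0; k < n; k++)
        sum1[k % N_ACC] += in[k];      /* index computed from the loop variable */

    float total = 0.0f;
    for (int i = 0; i < N_ACC; i++)
        total += sum1[i];
    return total;
}

/* Form 2: separately named variables. Each one is its own object,
   but they can no longer be indexed, so the selection is spelled out. */
float sum_with_scalars(const float *in, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;

    for (size_t k = 0; k < n; k++) {
        switch (k % N_ACC) {
        case 0:  s0 += in[k]; break;
        case 1:  s1 += in[k]; break;
        case 2:  s2 += in[k]; break;
        default: s3 += in[k]; break;
        }
    }
    return ((s0 + s1) + s2) + s3;
}
```

Both functions compute the same result; the difference is only in how the storage is declared and addressed. As noted above, how either form ends up in hardware is decided by the mapper and depends on the access pattern.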

