Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

initialize large constant array as lookup table

Altera_Forum
Honored Contributor II

I need a lookup table to use as a multiplier.

So I created an array and filled it with the integer products.

I tried to implement it like this. However, it takes a very long time to compile,

so I wonder whether this is the wrong way to achieve what I want. Is it because 256*256 is too large?

Is there a better way to get the same effect?

 

 

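__kernel void myKernel(__global const int *b, __global const int *c, int idx) {
    // b and c are assumed to be input arrays holding values in [-128, 127]
    int lut[256][256];
    for (int i = -128; i < 128; i++) {
        for (int j = -128; j < 128; j++) {
            lut[i + 128][j + 128] = i * j;
        }
    }

    int a = 0;
    #pragma unroll 512
    for (int i = 0; i < idx; i++) {
        // a += b[i] * c[i];
        a += lut[b[i] + 128][c[i] + 128];
    }
}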

 

 

--------------------------------------------- 

 

Then I tried to implement it like this instead; however, the compiler keeps telling me "constant data must be initialized."

Maybe that is because a constant array can't be initialized by a for-loop. Is there any way to initialize it other than writing out all 256*256 values by hand?

 

 

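constant int lut[256][256];   // compiler error: constant data must be initialized
// these loops sit at file scope, outside any function
for (int i = -128; i < 128; i++) {
    for (int j = -128; j < 128; j++) {
        lut[i + 128][j + 128] = i * j;
    }
}

__kernel void myKernel(__global const int *b, __global const int *c, int idx) {
    int a = 0;
    #pragma unroll 512
    for (int i = 0; i < idx; i++) {
        // a += b[i] * c[i];
        a += lut[b[i] + 128][c[i] + 128];
    }
}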
1 Reply
Altera_Forum
Honored Contributor II

A 256x256 integer buffer requires 256 KB of on-chip memory, which is quite big. Furthermore, your buffer needs to be replicated multiple times to allow parallel accesses (the replication factor will be reported in the area report), and hence you might end up overutilizing the Block RAMs. High-end Stratix V and Arria 10 devices have ~6 MB of on-chip memory. Long compilation time is normal for large designs, or for designs with large on-chip buffers. You can consider parameterizing the size of your buffer and experimenting with it to find the best configuration. Also, you are unrolling your compute loop 512 times; apart from further increasing the replication factor for your on-chip buffer, this also means your design is going to run out of DSPs if you are targeting Stratix V.
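To make the memory arithmetic above concrete, here is a minimal host-side C sketch. The helper names `lut_bytes` and `lut_total_bytes` are hypothetical, and the replication factor is just an input you plug in; the real value has to be read from the area report.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for ONE copy of a rows x cols table with elem_size-byte entries. */
size_t lut_bytes(size_t rows, size_t cols, size_t elem_size) {
    return rows * cols * elem_size;
}

/* Total on-chip bytes if `replication` copies of the table are kept.
   `replication` is an assumed input; take the actual factor from the area report. */
size_t lut_total_bytes(size_t rows, size_t cols, size_t elem_size,
                       size_t replication) {
    return lut_bytes(rows, cols, elem_size) * replication;
}
```

For example, `lut_bytes(256, 256, sizeof(int32_t))` gives 262144 bytes (256 KB), and a replication factor of 8 (a made-up number for illustration) would bring that to 2 MB of the ~6 MB available. Since the product of two values in [-128, 127] always lies in [-16256, 16384], the entries would also fit in a 16-bit type, which halves the footprint to 128 KB per copy.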

 

A constant value/array needs to be initialized with a value that is known at compile-time. You cannot write to a constant buffer at runtime; hence, your second code snippet is not going to work.
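One possible way to get a compile-time initializer without typing 256*256 values by hand is to generate the text with a small host-side program. The following is only a sketch under assumptions: `lut_entry` and `write_lut_source` are hypothetical helper names, and the generated fragment is assumed to be pulled into the kernel source with an ordinary `#include`.

```c
#include <stdio.h>

/* Value stored at lut[row][col], using the same +128 index offset as the post. */
int lut_entry(int row, int col) {
    return (row - 128) * (col - 128);
}

/* Writes an OpenCL source fragment that defines the fully initialized table.
   Returns 0 on success, -1 if the output stream reports an error. */
int write_lut_source(FILE *out) {
    fprintf(out, "__constant int lut[256][256] = {\n");
    for (int row = 0; row < 256; row++) {
        fprintf(out, "  {");
        for (int col = 0; col < 256; col++) {
            /* comma after every entry except the last one in the row */
            fprintf(out, "%d%s", lut_entry(row, col), col < 255 ? "," : "");
        }
        /* comma after every row except the last one */
        fprintf(out, "}%s\n", row < 255 ? "," : "");
    }
    fprintf(out, "};\n");
    return ferror(out) ? -1 : 0;
}
```

You would call `write_lut_source` on a file opened with `fopen` (for instance a header named `lut.h`, a name chosen here only for illustration) and include that file at the top of the kernel source. This only addresses the "constant data must be initialized" error; the table is still 256 KB, so the memory and replication concerns above still apply.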