Intel® Quartus® Prime Software

M20K RAM block usage question

Honored Contributor II

Suppose there are 8 work-groups, each containing 8 work-items.

I declare local memory in the kernel function:

__local float A[1000];

If I copy data into it from global memory, will that increase M20K RAM block usage?

for (...)
    A[] = data from global memory

Isn't the total M20K usage "local memory size * number of work-groups"?

A[1000] * 8
Honored Contributor II

Not all work-groups run fully in parallel on the FPGA; the compiler decides how many can run in parallel. M20K utilization depends on three things: the number of accesses to the buffer per work-group (which depends on the code and can also be affected by the SIMD size), the number of work-groups running in parallel per compute unit (decided by the compiler), and the number of compute units (set by the user). The compiler report states explicitly why and how many times each local buffer is replicated, and what the total size will be.
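
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in plain C (host-side, not kernel code). It assumes an M20K block holds 20,480 bits and that the total is simply the product of the three factors described above. The function names and the multiplicative model are my own illustration, not something the compiler exposes; the replication factors have to be read from the compiler report, and the report remains the authoritative number.

```c
#include <assert.h>
#include <stddef.h>

/* Capacity of one M20K block: 20 Kbit = 20,480 bits. */
#define M20K_BITS 20480u

/* Capacity-only lower bound on the M20K blocks needed for ONE copy of a
 * buffer. Ignores width/depth configuration and port limits, so the real
 * figure reported by the compiler may be higher. */
static unsigned m20k_blocks_per_copy(size_t elements, size_t bits_per_element)
{
    size_t total_bits = elements * bits_per_element;
    return (unsigned)((total_bits + M20K_BITS - 1) / M20K_BITS); /* ceiling */
}

/* Hypothetical model: total = blocks per copy
 *                             * replication for concurrent accesses
 *                             * work-groups in flight per compute unit
 *                             * number of compute units.
 * All three factors are assumptions supplied by the caller, taken from
 * the compiler report; this code cannot derive them. */
static unsigned m20k_blocks_total(unsigned blocks_per_copy,
                                  unsigned access_replication,
                                  unsigned workgroups_in_flight,
                                  unsigned compute_units)
{
    assert(access_replication >= 1);
    assert(workgroups_in_flight >= 1);
    assert(compute_units >= 1);
    return blocks_per_copy * access_replication
         * workgroups_in_flight * compute_units;
}
```

For the buffer in the question, m20k_blocks_per_copy(1000, 32) gives 2, because 32,000 bits do not fit in a single 20,480-bit block. The "A[1000] * 8" guess corresponds to m20k_blocks_total(2, 1, 8, 1) = 16, which would only hold if the compiler really kept 8 work-groups in flight with no extra replication for accesses; the actual factors can be smaller or larger.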