Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Weird memory usage on fpga

murrtank
Novice
675 Views

I am trying to build the matrix multiplication example from the oneAPI samples for FPGA. When I compile the report, it says I am using more RAM and DSP resources than the board has, but by my calculations I should not be using that much memory. If I build it with only the parallel_for, the report shows nothing unusual, but when I build it with the single_task, the memory usage is much higher. What am I doing wrong?

10 Replies
hareesh
Employee
497 Views

Hi,

Could you please share a screenshot of the error message?


Thank you,


murrtank
Novice
492 Views

I just get a warning that it is using more than the maximum capacity of RAM. I get an error if I remove the -Xsdont-error-if-large-area-est flag. Here is a picture of the area estimation from the report. I have also done some calculations on it, and with the 8 GB of RAM it should be able to hold all three matrices at their largest sizes: 4-byte floats x 4096^2 elements x 3 matrices = 201326592 bytes (about 192 MiB) < 8 GB, so I don't understand why the report says I'm using 264 GB of RAM.

DDIAKITE
New Contributor I
432 Views

Hi,

The 8 GB you're talking about is the global memory size, and that doesn't seem to be the problem here. The problem is that you are exceeding the on-chip BRAM blocks available on your FPGA, which do not amount to 8 GB. Assuming that you are using an Arria 10 or Stratix 10 device, the BRAM is on the order of megabytes, far less than 8 GB. In addition, the report shows excessive DSP usage, not just memory. When I looked at your code, I saw excessive loop unrolling, which pushes the design past the hardware resources available. Your loops over "P" or "M", like this one (for (int jway = 0; jway < P; jway++)), cannot be fully unrolled on your FPGA device. Fully unrolling this loop means you need 4096 copies of the loop body in parallel, each replicating everything in it, including memory and DSP. For instance, a single floating-point multiplication within this loop will require 4096 DSP slices for this one unrolled loop alone, and you have several fully unrolled loops.
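As a rough sketch of that arithmetic, the helper below multiplies the unroll factor by the number of multiplies per iteration and compares the result with a DSP budget. The function names are made up for illustration, and the budget of 1518 blocks is only an assumed figure (roughly what a large Arria 10 part offers); check your own device's datasheet and the compiler's area report for real numbers. It also assumes one DSP block per single-precision multiply, as in the estimate above.

```cpp
#include <cassert>
#include <cstdint>

// Assumed DSP budget for illustration only; look up the real figure
// for your device.
constexpr std::int64_t kAssumedDspBlocks = 1518;

// Estimated DSP blocks for one unrolled loop, assuming one block per
// single-precision multiply (a simplification).
inline std::int64_t EstimateDspBlocks(std::int64_t unroll_factor,
                                      std::int64_t multiplies_per_iteration) {
  assert(unroll_factor > 0 && multiplies_per_iteration > 0);
  return unroll_factor * multiplies_per_iteration;
}

// Largest power-of-two unroll factor (at most trip_count) whose
// estimate still fits within the given budget.
inline std::int64_t LargestFittingUnroll(std::int64_t trip_count,
                                         std::int64_t multiplies_per_iteration,
                                         std::int64_t budget) {
  std::int64_t factor = 1;
  while (factor * 2 <= trip_count &&
         EstimateDspBlocks(factor * 2, multiplies_per_iteration) <= budget) {
    factor *= 2;
  }
  return factor;
}
```

With a trip count of 4096 and one multiply per iteration, the full unroll needs an estimated 4096 blocks, well over the assumed budget. The helper only bounds a single loop; the rest of the design needs DSP blocks too, so a practical factor will be smaller still.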

Also, the compiler may implement a cache for your memory accesses using BRAM blocks, and unrolling replicates that logic too, so it can blow up the memory usage as well.

You need to specify your unrolling factor and tune it so that the design can fit into the FPGA device. You can remove all the unrolling first to see how it fits, and then optimize your design by tuning the unrolling factor.
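A minimal sketch of a bounded unroll factor is shown below. Your original kernel is not reproduced in this thread, so the function name, the row-major layout, and the factor of 8 are all assumptions for illustration. It is written as a plain C++ function so it can be compiled on the host; in a oneAPI kernel the same pragma would sit on the corresponding loop inside the single_task. The Intel compiler honors `#pragma unroll N`; other compilers may simply ignore it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes C (m x p) = A (m x n) * B (n x p), all stored row-major.
void MatMulPartialUnroll(const std::vector<float>& a,
                         const std::vector<float>& b,
                         std::vector<float>& c,
                         std::size_t m, std::size_t n, std::size_t p) {
  assert(a.size() == m * n && b.size() == n * p && c.size() == m * p);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < p; ++j) {
      float sum = 0.0f;
      // Partial unroll: ask for 8 copies of the loop body instead of
      // one copy per iteration. The factor 8 is a starting guess to
      // tune against the area report.
      #pragma unroll 8
      for (std::size_t k = 0; k < n; ++k) {
        sum += a[i * n + k] * b[k * p + j];
      }
      c[i * p + j] = sum;
    }
  }
}
```

Start with the pragma removed (or a factor of 1), check that the area estimate fits, then raise the factor step by step until the report approaches the device limits.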

Best regards,
Daouda

hareesh
Employee
460 Views

Hi,

I need some details about your project: are you working on your own design or on one of the examples?

If possible, please share your project files. That will help identify the issue.


Thanks,


hareesh
Employee
379 Views

Hi,

I think you have your solution. If you have no further queries, I'll close the case. Please confirm.


murrtank
Novice
363 Views
hareesh
Employee
357 Views

Hi,


Thanks for the confirmation. I am now closing the case. If you need to reopen it, please follow the link below.

Please log in to https://supporttickets.intel.com

