Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Weird memory usage on fpga

murrtank
Novice
3,095 Views

I am trying to build the matrix multiplication example from the oneAPI samples for FPGA. But when I compile the report, it says I am using more RAM and DSPs than there are on the board. I did some calculations and should not be using that much memory. If I run it with only the parallel_for, the report shows nothing unusual, but when I try it with the single_task I get much higher memory usage. What am I doing wrong?

0 Kudos
1 Solution
DDIAKITE
New Contributor I
2,851 Views

Hi,

The 8 GB you're talking about is the global memory size, and that doesn't seem to be the problem here. The problem is that you are exceeding the on-chip BRAM blocks available on your FPGA, which is not 8 GB. Assuming that you are using an Arria 10 or Stratix 10 device, the BRAM is on the order of megabytes, far less than 8 GB. In addition, you are over budget on DSPs, not just memory. Looking at your code, the excessive loop unrolling is what pushes the design past the available hardware. Your loops over "P" or "M", like this one (for (int jway = 0; jway < P; jway++)), cannot be fully unrolled for your FPGA device. Fully unrolling this loop means you need 4096 copies of the loop body in parallel, replicating everything in it, including memory and DSPs. For instance, a floating-point multiplication within this loop will require 4096 DSP slices for this single unrolling, and you have several fully unrolled loops.

Also, the compiler may implement a cache for your memory accesses using BRAM blocks. Your unrolling might explode the memory usage as well.

You need to specify your unrolling factor and tune it so that the design can fit into the FPGA device. You can remove all the unrolling first to see how it fits, and then optimize your design by tuning the unrolling factor.
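To make the tuning advice concrete, here is a minimal sketch of the arithmetic behind it, assuming (as in the example above) that each unrolled copy of the loop body needs its own multiplier. In a oneAPI FPGA kernel, a partial unroll is requested by placing `#pragma unroll N` before the loop; the helper names below and any DSP budget you pass in are hypothetical and should be replaced with the figures from your own area report.

```cpp
#include <cstdint>

// Rough number of hardware multipliers implied by an unroll factor:
// every unrolled copy of the loop body gets its own multiplier for
// each multiply in the body. (Hypothetical helper, not a compiler API.)
std::uint64_t multipliers_needed(std::uint64_t unroll_factor,
                                 std::uint64_t multiplies_per_iteration) {
    return unroll_factor * multiplies_per_iteration;
}

// Largest power-of-two unroll factor, capped at the loop trip count,
// whose multiplier estimate stays within dsp_budget. The budget is an
// assumption supplied by the caller, not a value read from any device.
std::uint64_t largest_fitting_unroll(std::uint64_t dsp_budget,
                                     std::uint64_t multiplies_per_iteration,
                                     std::uint64_t trip_count) {
    std::uint64_t factor = 1;
    while (factor * 2 <= trip_count &&
           multipliers_needed(factor * 2, multiplies_per_iteration) <= dsp_budget) {
        factor *= 2;
    }
    return factor;
}
```

With a placeholder budget of 1500 DSPs, one multiply per iteration, and a trip count of 4096, `multipliers_needed(4096, 1)` gives 4096 (over budget for a full unroll), while `largest_fitting_unroll(1500, 1, 4096)` gives 1024. This is only a starting point: several unrolled loops share the same budget, so the area report remains the final check.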

Best regards,
Daouda


0 Kudos
10 Replies
hareesh
Employee
2,917 Views

Hi,

Can you please share a screenshot of the error message?


Thank you,


0 Kudos
murrtank
Novice
2,912 Views

I just get a warning that it is using more than the maximum capacity of RAM. I get an error if I remove the -Xsdont-error-if-large-area-est flag. Here is a picture of the area estimation from the report. I have also done some calculations, and with 8 GB of RAM it should be able to hold all three matrices at their largest sizes: 4-byte floats x 4096^2 elements x 3 matrices = 201326592 bytes < 8 GB, so I don't get why the report says I'm using 264 GB of RAM.
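A quick check of the footprint arithmetic above, as a minimal sketch: it computes the bytes needed for three 4096 x 4096 matrices of 4-byte floats and compares the result against a given capacity. The function names are hypothetical, and the capacities are whatever the caller passes in; nothing here queries a real device.

```cpp
#include <cstdint>

// Bytes needed to hold `count` square n x n matrices whose elements
// are `elem_bytes` bytes each.
std::uint64_t matrices_bytes(std::uint64_t n, std::uint64_t count,
                             std::uint64_t elem_bytes) {
    return n * n * count * elem_bytes;
}

// True if a footprint fits within a memory of capacity_bytes.
bool fits(std::uint64_t footprint_bytes, std::uint64_t capacity_bytes) {
    return footprint_bytes <= capacity_bytes;
}
```

`matrices_bytes(4096, 3, 4)` evaluates to 201326592 bytes (192 MiB), which fits comfortably in 8 GiB. The same footprint does not fit in a memory of only a few megabytes, for example a stand-in capacity of 8 MiB, so the comparison depends entirely on which memory the report is measuring.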

0 Kudos
murrtank
Novice
2,803 Views
0 Kudos
hareesh
Employee
2,880 Views

Hi,

I need your project details: are you working on your own design or on one of the examples?

If possible, please share your project files. That will help identify the issue.


Thanks,


0 Kudos
hareesh
Employee
2,799 Views

Hi,

I think you have your solution. If you don't have any further queries, I'll close the case. Please confirm.


0 Kudos
murrtank
Novice
2,783 Views
0 Kudos
hareesh
Employee
2,777 Views

Hi,


Thanks for the confirmation. I am now closing the case. If you need to reopen it, please follow the link below:

Please log in to https://supporttickets.intel.com


0 Kudos