We're developing a product based on the Cyclone V 5CSEMA5F31C6 FPGA, in which we want to use almost all of the RAM resources.
According to the Cyclone V device overview, this FPGA has 397 M10K blocks, which gives a total of exactly 10240 * 397 = 4'065'280 bits of memory.
Out of that, we want to use 3'704'995 bits.
According to the table "Supported Embedded Memory Block Configurations for Cyclone V Devices" in the PDF linked above, this should be possible with the configuration that has 2048 addresses of 5 bits each (no parity bits for ECC).
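As a sanity check of the numbers above, here is a minimal Python sketch of the arithmetic. It assumes that every M10K block is used in the 2048 x 5 configuration and is filled completely, which is an idealisation and not necessarily what Quartus does. The constant names are made up for illustration.

```python
# Capacity check for the figures quoted above (idealised packing).
M10K_BLOCKS = 397          # blocks on the 5CSEMA5F31C6, per the device overview
BITS_PER_M10K = 10240      # bits per M10K block
REQUESTED_BITS = 3_704_995 # memory we want to use
WORD_WIDTH = 5             # data width of the RAM
CFG_DEPTH = 2048           # addresses per block in the 2048 x 5 configuration


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


total_bits = M10K_BLOCKS * BITS_PER_M10K
words = ceil_div(REQUESTED_BITS, WORD_WIDTH)
blocks_needed = ceil_div(words, CFG_DEPTH)

print(f"total bits available : {total_bits}")
print(f"words of {WORD_WIDTH} bits      : {words}")
print(f"blocks needed (ideal): {blocks_needed} of {M10K_BLOCKS}")
print(f"utilisation          : {REQUESTED_BITS / total_bits:.1%}")
```

Under these assumptions the memory needs fewer blocks than the device provides, which is why we expected the design to fit.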
However, we're unable to successfully compile our VHDL code in Quartus Prime. The Fitter terminates with an error stating that more than 397 M10K blocks are needed, although the design uses fewer memory bits than the device provides.
Looking at the resource usage summary, one can see that the design wants to use 453 M10K blocks for some reason. I attached an image with all errors.
Here is the VHDL code for the RAM:

    LIBRARY IEEE;
    USE IEEE.std_logic_1164.ALL;
    USE IEEE.numeric_std.ALL;

    ENTITY SimpleDualPortRAM_generic IS
      GENERIC( AddrWidth : integer := 1;
               DataWidth : integer := 1
             );
      PORT( clk       : IN  std_logic;
            enb_1_5_0 : IN  std_logic;
            wr_din    : IN  std_logic_vector(DataWidth - 1 DOWNTO 0);
            wr_addr   : IN  std_logic_vector(AddrWidth - 1 DOWNTO 0);
            wr_en     : IN  std_logic;
            rd_addr   : IN  std_logic_vector(AddrWidth - 1 DOWNTO 0);
            rd_dout   : OUT std_logic_vector(DataWidth - 1 DOWNTO 0)
          );
    END SimpleDualPortRAM_generic;

    ARCHITECTURE rtl OF SimpleDualPortRAM_generic IS

      -- Local Type Definitions
      TYPE ram_type IS ARRAY (740991 DOWNTO 0) of std_logic_vector(DataWidth - 1 DOWNTO 0);

      -- Signals
      SIGNAL ram              : ram_type := (OTHERS => (OTHERS => '0'));
      SIGNAL data_int         : std_logic_vector(DataWidth - 1 DOWNTO 0) := (OTHERS => '0');
      SIGNAL wr_addr_unsigned : unsigned(AddrWidth - 1 DOWNTO 0);
      SIGNAL rd_addr_unsigned : unsigned(AddrWidth - 1 DOWNTO 0);

    BEGIN
      wr_addr_unsigned <= unsigned(wr_addr);
      rd_addr_unsigned <= unsigned(rd_addr);

      SimpleDualPortRAM_generic_process: PROCESS (clk)
      BEGIN
        IF rising_edge(clk) THEN
          IF enb_1_5_0 = '1' THEN
            IF wr_en = '1' THEN
              ram(to_integer(wr_addr_unsigned)) <= wr_din;
            END IF;
            data_int <= ram(to_integer(rd_addr_unsigned));
          END IF;
        END IF;
      END PROCESS SimpleDualPortRAM_generic_process;

      rd_dout <= data_int;

    END rtl;

And its instantiation:

    u_dual_port_RAM3 : SimpleDualPortRAM_generic
      GENERIC MAP( AddrWidth => 20,
                   DataWidth => 5
                 )
      PORT MAP( clk => clk,
                ...
                ...
              );

Inspecting the result in the Technology Map Viewer, we see that Quartus produces many RAM blocks with 8192 addresses each and a 5-bit word width (also shown in the attached image).
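To see what such a mapping would cost, here is a rough Python sketch that counts M10K blocks for different slicings of the RAM array declared in the VHDL above (740992 words of 5 bits). It assumes that an 8192-deep, 5-bit-wide slice is built from five blocks in the 8192 x 1 configuration, and that the 4096 x 2 configuration is also available. Both are my reading of the same configuration table, and the mixed decomposition at the end is only a guess at what Quartus might do, not something taken from the reports.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def blocks_for(depth: int, width: int, cfg_depth: int, cfg_width: int) -> int:
    """M10K blocks needed for a depth x width RAM using one block configuration."""
    return ceil_div(depth, cfg_depth) * ceil_div(width, cfg_width)


RAM_DEPTH = 740_992  # ARRAY (740991 DOWNTO 0) in the VHDL code
RAM_WIDTH = 5        # DataWidth generic

# Uniform slicings: every block uses the same configuration.
uniform_2k = blocks_for(RAM_DEPTH, RAM_WIDTH, 2048, 5)
uniform_8k = blocks_for(RAM_DEPTH, RAM_WIDTH, 8192, 1)

# Hypothetical mixed slicing: full 8192-deep slices in 8192 x 1 blocks,
# with the leftover words placed in 4096 x 2 blocks.
full_slices, remainder = divmod(RAM_DEPTH, 8192)
mixed = (full_slices * blocks_for(8192, RAM_WIDTH, 8192, 1)
         + blocks_for(remainder, RAM_WIDTH, 4096, 2))

print(f"2048 x 5 everywhere      : {uniform_2k} blocks")
print(f"8192 x 1 everywhere      : {uniform_8k} blocks")
print(f"8192 x 1 plus 4096 x 2   : {mixed} blocks")
```

If these assumptions hold, the deep, narrow slicing wastes bits in every block because only 1 of the 5 data bits (or 2 in the leftover blocks) is stored per address, so the block count can exceed 397 even though the raw bit count fits.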
What is the problem here? Why are we unable to use all of the memory resources?
By the way, if we reduce the number of RAM bits to approximately 1.17 Mbit, Quartus runs without errors.
I got a tip from a colleague that BRAM blocks might share some resources with DSP blocks. As a consequence, one cannot fully use a BRAM block if a DSP block is used, or at least not in every configuration.
Looking at the synthesis results, one can see that Quartus wants to use 453 of 397 M10K blocks, i.e. 56 too many. 56 is exactly the number of DSP blocks used in this design. This looks like strong evidence that Cyclone V FPGAs have this limitation too.
However, I was unable to find anything related to this in the documentation. So the problem remains unsolved.