Inferring RAM, registered on inputs only

Altera_Forum · ‎01-17-2013

I need a dual-port RAM with registered inputs and unregistered outputs. I can easily create this using the megafunction wizard or by instantiating an altsyncram directly. Unfortunately, I also need to be able to initialize its contents using information supplied by a generic parameter, thus ruling out simply creating a static initialization file.

Inferred RAMs can quite easily be initialized like this:

type ram_t is array(2**g_addr_bits-1 downto 0) of std_logic_vector(g_data_bits-1 downto 0);

signal ram : ram_t := f_some_function(g_parameter);

... so I was thinking I would just use an inferred memory instead of the altsyncram.
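As an illustration of that initialization pattern, here is a minimal sketch. The entity name `init_ram_sketch`, the generic `g_seed`, and the fill rule inside `f_init` are all hypothetical stand-ins for `f_some_function(g_parameter)`; the thread never shows the real function.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity init_ram_sketch is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8;
    g_seed      : natural := 3);  -- hypothetical parameter driving the contents
  port(
    clk_i  : in  std_logic;
    addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    data_o : out std_logic_vector(g_data_bits-1 downto 0));
end init_ram_sketch;

architecture rtl of init_ram_sketch is
  type ram_t is array(2**g_addr_bits-1 downto 0) of
    std_logic_vector(g_data_bits-1 downto 0);

  -- Hypothetical initializer: word i holds (i + seed) mod 2**g_data_bits.
  -- Evaluated at elaboration time, so it costs no logic.
  -- (2**g_data_bits overflows integer for very wide words; fine for a sketch.)
  function f_init(seed : natural) return ram_t is
    variable result : ram_t;
  begin
    for i in result'range loop
      result(i) := std_logic_vector(
        to_unsigned((i + seed) mod 2**g_data_bits, g_data_bits));
    end loop;
    return result;
  end f_init;

  signal ram : ram_t := f_init(g_seed);
begin
  -- Read-only port, just enough to give the initialized array a use.
  process(clk_i)
  begin
    if rising_edge(clk_i) then
      data_o <= ram(to_integer(unsigned(addr_i)));
    end if;
  end process;
end rtl;
```

Because the function only depends on generics, the same entity can be instantiated with different sizes and contents without any external file.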

However, I have been completely unable to coax Quartus into inferring the required register placement. I've read the section "Inferring Memory Functions from HDL code" of the document "Recommended HDL coding styles". Unfortunately, that document only proposes two alternatives: 1) a reads-old-data example with registers on the write inputs and the read output, i.e. the register is on the wrong side of the read port. 2) a reads-new-data example with registers on the inputs... but with additional bypass logic I don't want.

Does anyone have any idea how I could get a solution here? So far the best I can think of is to use an altsyncram directly and add some fancy reset-triggered initialization process. That will put another MUX on my critical path, however.

This is frustrating, because I know the hardware can do what I want, but I can't communicate my intentions to quartus.

Altera_Forum · ‎01-17-2013

Well for a start, "reset triggered initialisation process" is not possible on any ram inside any device. Try using a reset to get a value in a ram, and you'll only get logic.

How about posting the code you've got and stating which device you're targeting? If I'm reading what you want correctly, it should be possible and inferable.

Altera_Forum · ‎01-17-2013

--- Quote Start ---

Well for a start, "reset triggered initialisation process" is not possible on any ram inside any device.

--- Quote End ---

I think you misunderstood me. The memory in an Altera FPGA doesn't have its own reset logic, but nothing prevents you from implementing your own. What I mean is: make a process which, upon seeing the reset signal, switches the inputs of the RAM over to a counter that walks the RAM cell by cell, clearing it. After a number of cycles equal to the size of the memory, the MUX is switched back to the input/output ports.
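A rough sketch of the counter-plus-MUX idea described above, under some assumptions: the entity name `ram_clear_mux`, the `rst_i` and `busy_o` ports, and the choice to clear to all zeros are hypothetical and not taken from the thread. It only covers the write side; the outputs would feed the write port of whatever RAM is used.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_clear_mux is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8);
  port(
    clk_i    : in  std_logic;
    rst_i    : in  std_logic;  -- assumed synchronous, active high
    -- user-side write port
    w_en_i   : in  std_logic;
    w_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i : in  std_logic_vector(g_data_bits-1 downto 0);
    -- RAM-side write port (after the MUX)
    w_en_o   : out std_logic;
    w_addr_o : out std_logic_vector(g_addr_bits-1 downto 0);
    w_data_o : out std_logic_vector(g_data_bits-1 downto 0);
    busy_o   : out std_logic); -- high while the walk is in progress
end ram_clear_mux;

architecture rtl of ram_clear_mux is
  signal clearing : std_logic := '0';
  signal count    : unsigned(g_addr_bits-1 downto 0) := (others => '0');
begin
  -- Walk every address once after reset, then hand the port back.
  process(clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        clearing <= '1';
        count    <= (others => '0');
      elsif clearing = '1' then
        if count = 2**g_addr_bits - 1 then
          clearing <= '0';      -- last address is being written this cycle
        end if;
        count <= count + 1;
      end if;
    end if;
  end process;

  -- This MUX is the extra logic on the write path mentioned in the thread.
  w_en_o   <= '1'                     when clearing = '1' else w_en_i;
  w_addr_o <= std_logic_vector(count) when clearing = '1' else w_addr_i;
  w_data_o <= (others => '0')         when clearing = '1' else w_data_i;
  busy_o   <= clearing;
end rtl;
```

The walk takes 2**g_addr_bits cycles, and user writes issued while `busy_o` is high are dropped, so the surrounding design would have to wait for it.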

Altera_Forum · ‎01-17-2013

Here's a sanitized version of code that can be inferred. Both options are incorrect for my purposes: #1 is registered on the wrong side, and #2 has unintended read-new-data bypass logic.


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- needed for unsigned and to_integer
entity my_memory is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8);
  port(   
    clk_i      : in  std_logic;
    r_en_i     : in  std_logic; 
    r_addr_i   : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o   : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i     : in  std_logic;
    w_addr_i   : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i   : in  std_logic_vector(g_data_bits-1 downto 0));
end my_memory;
architecture rtl of my_memory is
  type ram_t is array(2**g_addr_bits-1 downto 0) of
    std_logic_vector(g_data_bits-1 downto 0);
    
  signal ram : ram_t := ...; -- Magic initial value goes here
  signal r_addr : std_logic_vector(g_addr_bits-1 downto 0);
begin
  
  main : process(clk_i)
    begin
      if rising_edge(clk_i) then
        if w_en_i = '1' then
          ram(to_integer(unsigned(w_addr_i))) <= w_data_i;
        end if;
        -- Option #1: reads old data
        -- Registered on output (r_data_o), not input (r_addr_i)
        -- (keep only one of the two r_data_o assignments in real code)
        r_data_o <= ram(to_integer(unsigned(r_addr_i)));
        -- Option #2: registered on input
        r_addr <= r_addr_i;
      end if;
    end process;
  
  -- Option #2: no register on output, but implies reads new data
  -- (reading new data requires costly bypass logic from Quartus)
  r_data_o <= ram(to_integer(unsigned(r_addr)));
end rtl;

Altera_Forum · ‎01-17-2013

Here's a version that has the correct register behaviour (registered inputs and unregistered outputs) and the correct read-write behaviour (reads old data). Unfortunately, it is not possible to initialize it using a generic parameter.


library ieee;
use ieee.std_logic_1164.all;
library altera_mf;
use altera_mf.altera_mf_components.all;
entity my_memory is
  generic(
    g_addr_bits : natural := 8; 
    g_data_bits : natural := 8);
  port(
    clk_i      : in  std_logic;
    r_en_i     : in  std_logic;
    r_addr_i   : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o   : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i     : in  std_logic;
    w_addr_i   : in  std_logic_vector(g_addr_bits-1 downto 0); 
    w_data_i   : in  std_logic_vector(g_data_bits-1 downto 0));
end my_memory;
architecture rtl of my_memory is
begin
  altsyncram_component : altsyncram
    generic map (
      --intended_device_family => "Arria II GX",   
      address_aclr_b                     => "NONE",  
      address_reg_b                      => "CLOCK0",
      clock_enable_input_a               => "BYPASS",
      clock_enable_input_b               => "BYPASS",
      clock_enable_output_b              => "BYPASS",
      lpm_type                           => "altsyncram",  
      numwords_a                         => 2**g_addr_bits,
      numwords_b                         => 2**g_addr_bits,
      operation_mode                     => "DUAL_PORT",
      outdata_aclr_b                     => "NONE",
      outdata_reg_b                      => "UNREGISTERED",
      power_up_uninitialized             => "FALSE", 
      rdcontrol_reg_b                    => "CLOCK0",  
      read_during_write_mode_mixed_ports => "OLD_DATA",    
      widthad_a                          => g_addr_bits,
      widthad_b                          => g_addr_bits,
      width_a                            => g_data_bits,   
      width_b                            => g_data_bits,
      width_byteena_a                    => 1)
    port map (
      clock0    => clk_i, 
      wren_a    => w_en_i,  
      address_a => w_addr_i,
      data_a    => w_data_i,
      rden_b    => r_en_i,  
      address_b => r_addr_i, 
      q_b       => r_data_o);
end rtl;

Altera_Forum · ‎01-17-2013

The device families I am targeting are Arria II and Arria V. These have M9K and M10K blocks, respectively, which support the behaviour I desire: read-old-data, registered inputs, unregistered outputs.

Altera_Forum · ‎01-17-2013

For the altsyncram method, you can specify an initialisation file (.mif or .hex) via the generics.
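A sketch of what that could look like, reusing the altsyncram instance posted earlier in the thread and adding the megafunction's `init_file` parameter. The entity name `my_memory_mif` and the `g_init_file` generic are hypothetical, and the file itself still has to exist before synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library altera_mf;
use altera_mf.altera_mf_components.all;

entity my_memory_mif is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8;
    g_init_file : string  := "UNUSED");  -- hypothetical; e.g. a .mif or .hex path
  port(
    clk_i    : in  std_logic;
    r_en_i   : in  std_logic;
    r_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i   : in  std_logic;
    w_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i : in  std_logic_vector(g_data_bits-1 downto 0));
end my_memory_mif;

architecture rtl of my_memory_mif is
begin
  altsyncram_component : altsyncram
    generic map (
      init_file                          => g_init_file,  -- the only addition
      address_reg_b                      => "CLOCK0",
      lpm_type                           => "altsyncram",
      numwords_a                         => 2**g_addr_bits,
      numwords_b                         => 2**g_addr_bits,
      operation_mode                     => "DUAL_PORT",
      outdata_reg_b                      => "UNREGISTERED",
      power_up_uninitialized             => "FALSE",
      rdcontrol_reg_b                    => "CLOCK0",
      read_during_write_mode_mixed_ports => "OLD_DATA",
      widthad_a                          => g_addr_bits,
      widthad_b                          => g_addr_bits,
      width_a                            => g_data_bits,
      width_b                            => g_data_bits,
      width_byteena_a                    => 1)
    port map (
      clock0    => clk_i,
      wren_a    => w_en_i,
      address_a => w_addr_i,
      data_a    => w_data_i,
      rden_b    => r_en_i,
      address_b => r_addr_i,
      q_b       => r_data_o);
end rtl;
```

This makes the file name a parameter, but not the contents: one file per size and per initial value is still needed.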

But for the inference, I think you may be stuck. Xilinx allows you to infer read-first or write-first behaviour by using a shared variable or a signal for the RAM storage, but Altera doesn't, and you're stuck with write-before-read for inference.

I might be completely wrong, because the "ramstyle" synthesis attribute has the following to say:

--- Quote Start ---

In addition to specifying the type of memory block for the RAM implementation, by setting the value to "no_rw_check", you can use the ramstyle attribute to indicate that you do not care about the output of the inferred RAM when there are simultaneous reads and writes to the same address. By default, the Quartus II software tries to create an inferred RAM with the same read-during-write behavior as your HDL source. In some cases, a RAM must be mapped into logic because it has a read-during-write behavior that is not supported by the memory blocks in your target device. In other cases, the Quartus II software must insert extra logic to mimic your read-during-write behavior, which can increase the resource requirements or reduce the performance of your design. Setting the "no_rw_check" value directs the Quartus II Compiler that the read-during-write behavior of the HDL source does not need to be preserved.

--- Quote End ---
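For reference, a minimal sketch of how the attribute quoted above is attached to an inferred RAM in VHDL. The entity name `ram_no_rw_check` is hypothetical; the attribute name and the "no_rw_check" value come from the quoted documentation. Whether it changes the register placement is exactly what is in question here, so treat this as something to experiment with rather than a fix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_no_rw_check is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8);
  port(
    clk_i    : in  std_logic;
    r_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i   : in  std_logic;
    w_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i : in  std_logic_vector(g_data_bits-1 downto 0));
end ram_no_rw_check;

architecture rtl of ram_no_rw_check is
  type ram_t is array(2**g_addr_bits-1 downto 0) of
    std_logic_vector(g_data_bits-1 downto 0);
  signal ram : ram_t;

  -- Tell Quartus that the result of a simultaneous read and write
  -- to the same address does not need to match the HDL.
  attribute ramstyle : string;
  attribute ramstyle of ram : signal is "no_rw_check";
begin
  process(clk_i)
  begin
    if rising_edge(clk_i) then
      if w_en_i = '1' then
        ram(to_integer(unsigned(w_addr_i))) <= w_data_i;
      end if;
      r_data_o <= ram(to_integer(unsigned(r_addr_i)));
    end if;
  end process;
end rtl;
```

Note that with this attribute the read-during-write result is a don't-care, so it is only appropriate when the design never depends on it.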

If you can't get it to work, a support case with mysupport might be called for, with a potential enhancement request.

Altera_Forum · ‎01-17-2013

--- Quote Start ---

For the alt sync ram method, you can specify an initialisation file of a .mif or .hex via the generics.

--- Quote End ---

Yeah. Unfortunately, the entity is used deep inside a parameterized piece of code. The size of the RAM needs to be flexible and the initial value needs to fill the entire size. Unless you know a trick to generate the mif/hex during synthesis, based on parameters, I don't think this will work for me.

--- Quote Start ---

Xilinx allow you infer the read/write first by using a shared variable or a signal for the ram storage

--- Quote End ---

Could you paste an example of that? We also work with Xilinx here.

--- Quote Start ---

the "ramstyle" synthesis attribute...

--- Quote End ---

I'm investigating this direction now. If that doesn't pan out, I guess I will just go with a custom initialization process + mux and eat the space+speed hit.

Thanks for replying.

Altera_Forum · ‎01-17-2013

--- Quote Start ---

Could you paste an example of that? We also work with Xilinx here.

--- Quote End ---

just make ram a shared variable rather than a signal:

shared variable ram : ram_t;
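Spelled out as a complete entity, the shared-variable style might look like the sketch below. The entity name `sv_ram` is hypothetical. Because a variable assignment takes effect immediately, a read placed after the write in the same process sees the new data (write-first). Plain shared variables like this are VHDL-93 style; VHDL-2002 and later expect a protected type, so the tool may need to be set to VHDL-93 or may only warn.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sv_ram is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8);
  port(
    clk_i    : in  std_logic;
    r_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i   : in  std_logic;
    w_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i : in  std_logic_vector(g_data_bits-1 downto 0));
end sv_ram;

architecture rtl of sv_ram is
  type ram_t is array(2**g_addr_bits-1 downto 0) of
    std_logic_vector(g_data_bits-1 downto 0);
  shared variable ram : ram_t;
begin
  process(clk_i)
  begin
    if rising_edge(clk_i) then
      -- Write first: the variable is updated immediately...
      if w_en_i = '1' then
        ram(to_integer(unsigned(w_addr_i))) := w_data_i;
      end if;
      -- ...so a read of the same address in this cycle returns the new data.
      r_data_o <= ram(to_integer(unsigned(r_addr_i)));
    end if;
  end process;
end rtl;
```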

Altera_Forum · ‎01-18-2013

I've solved this problem. For anyone who finds this via google:

Even though the VHDL for the "old-data" inferred memory *looks* like it has registers on the output, it does not. You can confirm this by checking the parameters of the altsyncram in the hierarchy viewer: outdata_reg_a = UNREGISTERED. So, RAM inference in Quartus respects the semantics of what you wrote, but not the timing behaviour. Which, for me, is a good thing!

Part of the reason this gets confusing is that Quartus displays RAM blocks differently in the RTL viewer and in the megafunction wizard. In the RTL viewer, the RAM you see is a hypothetical *asynchronous* RAM that probably cannot be realized in hardware as shown. By comparison, the megafunction wizard shows you the reality of the inputs/outputs for a synchronous RAM.

For example, consider a simple dual ported memory inferred from VHDL. In the RTL view, the "old-data" pattern looks like it has registers on the w_addr_i and w_data_i which feed into the (asynchronous!) RAM. The r_addr_i feeds in directly and has a register on the output of the asynchronous RAM (r_data_o). Do not be deceived! This is what your VHDL looks like AND what the RTL view shows. However, when implemented in hardware, the registers are all before the *synchronous* memory and the output is actually unregistered.

If you produce the same simple dual ported memory in the megafunction wizard, it will show all the inputs registered before the synchronous memory, just like it will be implemented in the FPGA. You can easily confirm the parameters to the altsyncram end up the same (especially outdata_reg_a)... indeed, the altsyncram MF always registers its inputs.

Conversely, the "new data" memory cannot be created by the megafunction wizard. That makes sense because to get this behaviour requires bypass logic to feed the w_data_i to the r_data_o. So the megawizard will never show you a diagram for this situation. If it did, you would probably see a MUX at the end of the syncram.

When you look at "new data" dual ported memory inferred from VHDL, it will appear to have registers on r_addr_i, w_addr_i, and w_data_i. The r_data_o will appear to be wired directly out of the asynchronous RAM. Again, do not be deceived! There is no asynchronous RAM once this is implemented in the FPGA. Just like in the picture, all the inputs end up registered (since synchronous memory always registers its inputs), BUT the output is not wired as shown. There is a MUX feeding w_data_i to r_data_o.

So long story short: I was worried about nothing. The straightforward VHDL inference does in fact turn out the way I wanted: "old-data", registered inputs, unregistered outputs.

Really, this is an important thing to know! The VHDL looks like the output is a register, but it is in fact not; r_data_o is wired straight to the output of the synchronous RAM, so the RAM's access delay lands on whatever path follows it. You should plan on registering it again if you need significant logic after the RAM.
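A sketch of that extra register, written as a wrapper around the inferred `my_memory` entity from earlier in the thread (assumed to be compiled into library `work` with only the old-data option kept). The wrapper name `my_memory_piped` is hypothetical. It adds one cycle of read latency, and the fitter may choose to pack this register into the RAM block's own output register.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity my_memory_piped is
  generic(
    g_addr_bits : natural := 8;
    g_data_bits : natural := 8);
  port(
    clk_i    : in  std_logic;
    r_en_i   : in  std_logic;
    r_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    r_data_o : out std_logic_vector(g_data_bits-1 downto 0);
    w_en_i   : in  std_logic;
    w_addr_i : in  std_logic_vector(g_addr_bits-1 downto 0);
    w_data_i : in  std_logic_vector(g_data_bits-1 downto 0));
end my_memory_piped;

architecture rtl of my_memory_piped is
  -- Raw RAM output; in hardware this comes straight out of the memory block.
  signal r_data : std_logic_vector(g_data_bits-1 downto 0);
begin
  mem : entity work.my_memory
    generic map(
      g_addr_bits => g_addr_bits,
      g_data_bits => g_data_bits)
    port map(
      clk_i    => clk_i,
      r_en_i   => r_en_i,
      r_addr_i => r_addr_i,
      r_data_o => r_data,
      w_en_i   => w_en_i,
      w_addr_i => w_addr_i,
      w_data_i => w_data_i);

  -- Extra pipeline stage so downstream logic starts from a register.
  process(clk_i)
  begin
    if rising_edge(clk_i) then
      r_data_o <= r_data;
    end if;
  end process;
end rtl;
```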

Altera_Forum · ‎01-18-2013

Have you tried with a shared variable instead of a signal, to see how it affects the read-before-write behaviour? It never used to make a difference; it would be nice to see if Altera fixed it.

Altera_Forum · ‎01-18-2013

I have not tried the shared variable. However, I think it would just be a matter of the order in which you assigned to the variable?

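-- New data: write the variable first, then read it
ram(to_integer(unsigned(w_addr_i))) := w_data_i;
r_data_o <= ram(to_integer(unsigned(r_addr_i)));

-- Old data: read first, then write
r_data_o <= ram(to_integer(unsigned(r_addr_i)));
ram(to_integer(unsigned(w_addr_i))) := w_data_i;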

... in any case, why would you ever want a "new data" RAM? It requires bypass logic and can always be avoided with a well designed pipeline. The "old data" approach works with a signal, so I will just stick with that.