Re: Verilog - how to write parameterized RAM that infers a RAM block?

Altera_Forum · ‎02-20-2012

Does anyone know a way to write a RAM in Verilog with parameterized width and byte enables that causes Quartus to infer a RAM block?

Since Altera seems to be strongly discouraging direct use of altsyncram by removing all documentation, I am trying to find a way to directly code RAM. I do not need special features like initialization from a file, but I do need basic parameterization.

I have written a simple test which instantiates two RAM blocks. The first is inferred correctly but is not parameterized. The second is properly parameterized but is not inferred. For the second, Quartus outputs the following:

Info (276007): RAM logic "ram2:m2|ram" is uninferred due to asynchronous read logic

Any solutions or ideas to try would be greatly appreciated.

** edit for clarification

Quartus does infer an altsyncram megafunction from the second RAM instantiation, but uses lcell registers instead of a RAM block.

** /edit


// top level, just instantiate two test cases
module ram_test#(
    parameter data_width = 32,
    parameter addr_width = 6,
    parameter bena_width = data_width / 8 
) (
    input clk,
    input [addr_width-1:0] addr,
    input wena1,
    input wena2,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output [data_width-1:0] q1,
    output [data_width-1:0] q2
);
    ram1# (.data_width(data_width), .addr_width(addr_width)
    ) m1 (.clk(clk), .addr(addr), .wena(wena1), .wdata(wdata), .bena(bena),
            .q(q1)
    );
    ram2# (.data_width(data_width), .addr_width(addr_width)
    ) m2 (.clk(clk), .addr(addr), .wena(wena2), .wdata(wdata), .bena(bena),
            .q(q2)
    );
endmodule
//---------------------------------------------------------------------------
// Hard coded width, infers correctly
//---------------------------------------------------------------------------
module ram1# (
    parameter data_width = 32,  // works only for 32 bit
    parameter addr_width = 10,
    parameter bena_width = data_width / 8 
) (
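    input clk,
    input [addr_width-1:0] addr,
    input wena,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;
    reg [bena_width-1:0][7:0] ram [0:numwords-1];
    always_ff@(posedge clk) begin
        if(wena) begin
            if(bena[0]) ram[addr][0] <= wdata[7:0];
            if(bena[1]) ram[addr][1] <= wdata[15:8];
            if(bena[2]) ram[addr][2] <= wdata[23:16];
            if(bena[3]) ram[addr][3] <= wdata[31:24];
        end
        q <= ram[addr];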
    end
endmodule
//---------------------------------------------------------------------------
// Parameterized width, does not infer a RAM block
//---------------------------------------------------------------------------
module ram2# (
    parameter data_width = 32,  // can be any multiple of 8
    parameter addr_width = 10,
    parameter bena_width = data_width / 8
) (
    input clk,
    input [addr_width-1:0] addr,
    input wena,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;
    reg [data_width-1:0] ram [0:numwords-1];
    // create full write enable bit mask
    wire [data_width-1:0] wmask;
    genvar bytelane;
    generate
        for(bytelane=0; bytelane < bena_width; bytelane++) begin : lpbl
            assign wmask[bytelane*8 +: 8] = wena ? {8{bena[bytelane]}} : 8'b0;
        end
    endgenerate
    // RAM
    always_ff@(posedge clk) begin
        ram[addr] <= (wmask & wdata) | (~wmask & ram[addr]);
        q <= ram[addr];
    end
endmodule

Altera_Forum · ‎02-20-2012

You can generate an altsyncRAM module to a specific size, and parameterize it by hand, and then use that. This has worked for me on several designs.

/j

Altera_Forum · ‎02-21-2012

Open a Verilog file in Quartus II and go to Edit -> Insert Template -> Verilog. There are a number of Verilog RAM inference templates there.

Altera_Forum · ‎02-21-2012

--- Quote Start ---

Open a verilog file in Quartus II and go to Edit -> Insert Template -> Verilog. There are a number of Verilog RAM inference files there.

--- Quote End ---

The first of my two test instantiations, the one that works but is not properly parameterized, is based directly on those templates. I was hoping to find a way to fix the parameterization.

Altera_Forum · ‎02-21-2012

Your second design implements a variable bit enable mask (bena) and uses it in the line below:

ram[addr] <= (wmask & wdata) | (~wmask & ram[addr]);

A per-bit write enable is clearly beyond the RAM block's features and thus can't be inferred. I'm also not sure whether Quartus could understand the construct even without bit enables, but I won't rule it out.

Altera_Forum · ‎02-21-2012

--- Quote Start ---

Your second design implements a variable bit enable mask (bena) and uses it in the line below:

ram[addr] <= (wmask & wdata) | (~wmask & ram[addr]);

A per-bit write enable is clearly beyond the RAM block's features and thus can't be inferred. I'm also not sure whether Quartus could understand the construct even without bit enables, but I won't rule it out.

--- Quote End ---

True, although it would be possible for a synthesizer to recognize that each group of 8 bit enables is just a fanned out byte enable. I would be surprised if Quartus were that smart.

However, what the synthesizer actually complains about is the appearance of ram[addr] in a combinatorial expression, since the ram block does not support an unregistered read. It does not recognize that the entire expression is just describing the behavior of the RAM, and actually tries to map this as an external path.
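A minimal sketch of that point, with the read that the masked write implies pulled out into its own signal. The module name and the names old_word and new_word are made up for illustration, and wmask is taken as an input here to keep the sketch short. This only spells out the structure of the expression; it is not expected to infer a RAM block either.

```systemverilog
// Sketch only: ram2_expanded, old_word and new_word are hypothetical names.
module ram2_expanded #(
    parameter data_width = 32,
    parameter addr_width = 10
) (
    input                       clk,
    input      [addr_width-1:0] addr,
    input      [data_width-1:0] wmask,  // per-bit write mask
    input      [data_width-1:0] wdata,
    output reg [data_width-1:0] q
);
    reg [data_width-1:0] ram [0:2**addr_width-1];

    // Unregistered read of the stored word: this is the part that a
    // synchronous RAM block cannot provide.
    wire [data_width-1:0] old_word = ram[addr];

    // The merge of old and new bits happens in logic outside the memory.
    wire [data_width-1:0] new_word = (wmask & wdata) | (~wmask & old_word);

    always_ff @(posedge clk) begin
        ram[addr] <= new_word;   // full-word write on every clock
        q         <= ram[addr];  // registered read
    end
endmodule
```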

So, the question is, is there a different way to express this which Quartus can understand and that also has working parameters? I recognize the answer may simply be "no", but I'm asking anyway just in case anyone has some ideas.

Altera_Forum · ‎02-22-2012

Could you infer several memory blocks connected in parallel, each one of them handling an 8-bit data bus? Then you use the byte enables to generate write commands only to the relevant blocks.

Altera_Forum · ‎02-22-2012

I wrote an answer on another post talking about the recommended hdl styles (www.altera.com/literature/hb/qts/qts_qii51007.pdf) to infer proper memory blocks and saw they have an example of a memory with byte enables, on page 11-31. Maybe this will work?

Altera_Forum · ‎02-22-2012

--- Quote Start ---

Could you infer several memory blocks connected in parallel, each one of them handling an 8-bit data bus? Then you use the byte enables to generate write commands only to the relevant blocks.

--- Quote End ---

Yes, this works fine and is easily implemented by putting the always_ff block inside the generate for loop. Unfortunately this uses a minimum of bena_width RAM blocks regardless of the total size of the RAM because Quartus will not combine them.
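A rough sketch of what that per-lane version could look like, assuming the same port names as the test modules in the first post. The module name ram_per_lane and the names lane_ram and lane_q are made up for illustration, and this has not been run through Quartus here.

```systemverilog
// Sketch only: one independent 8-bit memory per byte lane.
// ram_per_lane, lane_ram and lane_q are hypothetical names.
module ram_per_lane #(
    parameter data_width = 32,             // any multiple of 8
    parameter addr_width = 10,
    parameter bena_width = data_width / 8
) (
    input                   clk,
    input  [addr_width-1:0] addr,
    input                   wena,
    input  [data_width-1:0] wdata,
    input  [bena_width-1:0] bena,
    output [data_width-1:0] q
);
    localparam numwords = 2**addr_width;

    genvar bytelane;
    generate
        for (bytelane = 0; bytelane < bena_width; bytelane++) begin : lpbl
            // Each lane has its own memory array and its own read register.
            reg [7:0] lane_ram [0:numwords-1];
            reg [7:0] lane_q;

            always_ff @(posedge clk) begin
                if (wena && bena[bytelane])
                    lane_ram[addr] <= wdata[bytelane*8 +: 8];
                lane_q <= lane_ram[addr];
            end

            // Reassemble the output word from the per-lane registers.
            assign q[bytelane*8 +: 8] = lane_q;
        end
    endgenerate
endmodule
```

Because every lane declares a separate array, the synthesizer sees bena_width unrelated memories, which is consistent with the block count described above.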

Altera_Forum · ‎02-22-2012

--- Quote Start ---

I wrote an answer on another post talking about the recommended hdl styles (http://www.altera.com/literature/hb/qts/qts_qii51007.pdf) to infer proper memory blocks and saw they have an example of a memory with byte enables, on page 11-31. Maybe this will work?

--- Quote End ---

It "works", but the parameterization is broken in that the module body must be changed to support a different data width. It is the same as the SystemVerilog template that is accessible through the Quartus text editor.

Altera_Forum · ‎02-23-2012

--- Quote Start ---

However, what the synthesizer actually complains about is the appearance of ram[addr] in a combinatorial expression, since the ram block does not support an unregistered read. It does not recognize that the entire expression is just describing the behavior of the RAM, and actually tries to map this as an external path.

--- Quote End ---

It's possibly asking too much for the synthesis tool to accept arbitrary equivalent behavioral descriptions for inference. In the present case, I'm also not sure that it's strictly equivalent.

But I understand that your point is implementing byte enables in a behavioral description.

--- Quote Start ---

Unfortunately this uses a minimum of bena_width RAM blocks regardless of the total size of the RAM because Quartus will not combine them.

--- Quote End ---

Can you please give an example in which the recommended HDL style uses more RAM than the hardware properties require?

Altera_Forum · ‎02-23-2012

Maybe I'm missing something obvious here, and I have no experience in Verilog, but what is preventing you from putting ifs in the for loop instead of this combinatorial assignment?

As for the size when using separate 8-bit blocks, the only limitation I see is that Quartus will use a minimum of bena_width blocks to infer the memory. Yes it can be a problem if you are using large memory blocks and only need a small buffer.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

Can you please give an example in which the recommended HDL style uses more RAM than the hardware properties require?

--- Quote End ---

The recommended HDL style does not use more RAM than required. The suggestion by Daixiwen to make Quartus infer a separate 8-bit wide RAM block for each byte enable uses more RAM than required when the RAM size needed is less than the (block size) * (number of byte enables).

Altera_Forum · ‎02-24-2012

--- Quote Start ---

Maybe I'm missing something obvious here, and I have no experience in Verilog, but what is preventing you from putting ifs in the for loop instead of this combinatorial assignment?

As for the size when using separate 8-bit blocks, the only limitation I see is that Quartus will use a minimum of bena_width blocks to infer the memory. Yes it can be a problem if you are using large memory blocks and only need a small buffer.

--- Quote End ---

I think the for loop cannot be inside of the always_ff block. I will try it tomorrow and report back. If it cannot, then the only way to get the ifs in the for loop is to put the always_ff block in the for loop. This results in your original suggestion to create a separate memory for each byte enable.
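For reference, SystemVerilog itself does allow a procedural for loop inside an always_ff block. A minimal sketch of the single-array version with the ifs in the loop is below; the module name ram_be_loop is made up, the ports follow the test modules in the first post, and whether a given Quartus version maps this onto a single RAM block is not something this sketch settles.

```systemverilog
// Sketch only: ram_be_loop is a hypothetical name.
module ram_be_loop #(
    parameter data_width = 32,             // any multiple of 8
    parameter addr_width = 10,
    parameter bena_width = data_width / 8
) (
    input                       clk,
    input      [addr_width-1:0] addr,
    input                       wena,
    input      [data_width-1:0] wdata,
    input      [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;

    // One word is a packed array of bena_width bytes.
    reg [bena_width-1:0][7:0] ram [0:numwords-1];

    always_ff @(posedge clk) begin
        if (wena) begin
            // Procedural loop: one conditional byte write per lane.
            for (int i = 0; i < bena_width; i++) begin
                if (bena[i])
                    ram[addr][i] <= wdata[i*8 +: 8];
            end
        end
        q <= ram[addr];  // registered read of the whole word
    end
endmodule
```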

The issue of RAM block waste makes the separate block implementation suitable only "sometimes". For example, if I need 1kB on a 64-bit bus and my device has M10K blocks, then it uses 8 RAM blocks when it should use 1. The reason I want a RAM implementation with working parameters is so that I can write a generic module for a library that uses RAM as one of many components without having to use different code depending on the parameter values.

I think the best workaround for now may be to write a set of specific width RAM modules and use generate/if to select the module to use based on the width requested.
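A sketch of how that workaround might be laid out, under a few assumptions: only 16-bit and 32-bit widths are covered, the hard-coded bodies are inlined in the generate branches instead of living in separate modules (to keep the example self-contained), and the module name ram_be_select is made up. The $error call in the last branch is an elaboration-time task from IEEE 1800-2009; tool support for it is assumed, not verified.

```systemverilog
// Sketch only: ram_be_select and its supported widths are hypothetical.
module ram_be_select #(
    parameter data_width = 32,             // 16 or 32 in this sketch
    parameter addr_width = 10,
    parameter bena_width = data_width / 8
) (
    input                       clk,
    input      [addr_width-1:0] addr,
    input                       wena,
    input      [data_width-1:0] wdata,
    input      [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;

    // One word is a packed array of bena_width bytes.
    reg [bena_width-1:0][7:0] ram [0:numwords-1];

    generate
        if (bena_width == 2) begin : w16
            // Hard-coded body for a 16-bit word.
            always_ff @(posedge clk) begin
                if (wena) begin
                    if (bena[0]) ram[addr][0] <= wdata[7:0];
                    if (bena[1]) ram[addr][1] <= wdata[15:8];
                end
                q <= ram[addr];
            end
        end else if (bena_width == 4) begin : w32
            // Hard-coded body for a 32-bit word.
            always_ff @(posedge clk) begin
                if (wena) begin
                    if (bena[0]) ram[addr][0] <= wdata[7:0];
                    if (bena[1]) ram[addr][1] <= wdata[15:8];
                    if (bena[2]) ram[addr][2] <= wdata[23:16];
                    if (bena[3]) ram[addr][3] <= wdata[31:24];
                end
                q <= ram[addr];
            end
        end else begin : unsupported
            // Elaboration-time check; tool support is an assumption.
            $error("ram_be_select: unsupported data_width %0d", data_width);
        end
    endgenerate
endmodule
```

Each additional width needs another branch, which is the scaling cost of this approach.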

P.S. Judging from the templates, I think VHDL has the same issue. The Altera VHDL templates for RAM with byte enables also require the body to be changed if the data width is changed.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

The recommended HDL style does not use more RAM than required. The suggestion by Daixiwen to make Quartus infer a separate 8-bit wide RAM block for each byte enable uses more RAM than required when the RAM size needed is less than the (block size) * (number of byte enables).

--- Quote End ---

Daixiwen mentioned the byte enable example in the "Recommended HDL Style" document. You say that the recommended HDL style does not use more RAM than required. So what's the problem?

Did you try the method suggested in the document?

P.S.:

--- Quote Start ---

It "works", but the parameterization is broken in that the module body must be changed to support a different data width.

--- Quote End ---

What do you mean by module body? The memory module or the calling module? Accessing the RAM in bytes as required by the recommended coding style involves a change in the memory module. But it can be completely hidden inside the module.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

Judging from the templates, I think VHDL has the same issue. The Altera VHDL templates for RAM with byte enables also require the body to be changed if the data width is changed.

--- Quote End ---

No, I'm pretty sure that in VHDL I could make a generic memory block with byte enables. Something like this:

process(clk)
begin
  if(rising_edge(clk)) then
    if(we = '1') then
      for block_num in 0 to bena_width-1 loop
        if(be(block_num) = '1') then
          ram(waddr)(block_num) <= wdata((block_num*8)+7 downto (block_num*8));
        end if;
      end loop;
    end if;
    q_local <= ram(raddr);
  end if;
end process;

I haven't tested it, but it should be recognized properly by the synthesizer.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

Daixiwen mentioned the byte enable example in the "Recommended HDL Style" document. You say that the recommended HDL style does not use more RAM than required. So what's the problem?

Did you try the method suggested in the document?

--- Quote End ---

Yes, the "ram1" module in the code in my original post is the method suggested in the document. The problem is that the "data_width" parameter doesn't work.

--- Quote Start ---

What do you mean by module body? The memory module or the calling module? Accessing the RAM in bytes as required by the recommended coding style involves a change in the memory module. But it can be completely hidden inside the module.

--- Quote End ---

I mean the memory module, and yes, it can be hidden inside the module with a set of hard coded memories selected by generate blocks. That seems to be what I need to do. I usually try to avoid that approach if possible because it does not scale well if more than one parameter needs to be handled this way, but in this case it seems to be necessary.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

No, I'm pretty sure that in VHDL I could make a generic memory block with byte enables. Something like this:

I haven't tested it, but it should be recognized properly by the synthesizer.

--- Quote End ---

Interesting. I don't know VHDL, but that looks like it would do what I need. Because of differences in language structure, the analogous construct in SystemVerilog causes the synthesizer to generate a RAM block per byte lane.

Altera_Forum · ‎02-24-2012

--- Quote Start ---

Interesting. I don't know VHDL, but that looks like it would do what I need. Because of differences in language structure, the analogous construct in SystemVerilog causes the synthesizer to generate a RAM block per byte lane.

--- Quote End ---

That really sounds like a bug. I didn't expect it. Thanks for clarifying.

Altera_Forum · ‎06-29-2012

How do I write my outputs to a file in Verilog? Can you please explain with an example? I also want to read from a file and have my module respond to it.

Altera_Forum · ‎06-30-2012

Use something like this instead.

Beware: I'm not sure I got the index order correct.

    generate
        for(bytelane=0; bytelane < bena_width; bytelane++) begin : lpbl
            always_ff @ (posedge clk) begin
                if (wena && bena[bytelane])
                    ram[addr][bytelane] <= wdata[bytelane*8 +: 8];
                q[bytelane*8 +: 8] <= ram[addr][bytelane];
            end
        end
    endgenerate