Intel® Quartus® Prime Software

RAM uninferred due to asynchronous read logic

Altera_Forum
Honored Contributor II
6,710 Views

Hello, 

 

I have a cache module which uses an inferred synchronous RAM block. When I synthesize the module by itself (as the project's top module) the synthesis works as intended and the RAM is inferred correctly. 

When I instantiate the module from another module, the RAM inference fails with this error message: 

 

--start of Quartus II error message 

Info: RAM logic "mips_cache:cache|code_line_table" is uninferred due to asynchronous read logic 

-- end of error message 

 

 

This is the process that should infer the RAM but fails: 

 

    ...
    code_line_memory:
    process(clk)
    begin
        if clk'event and clk='1' then
            if ps=code_refill_bram_1 or ps=code_refill_sram8_3 or ps=code_refill_sram_1 then
                code_line_table(conv_integer(code_word_addr_wr)) <= code_refill_data;
            end if;
            code_cache_rd <= code_line_table(conv_integer(code_word_addr));
        end if;
    end process code_line_memory;
    ...

Please note that the RAM output is registered. 

It is identical to another process in the same module that does work, and it's identical to the VHDL RAM templates I've been using for years (as far as I can tell). 

 

When I re-register the output signal (that is, load code_cache_rd on a register before feeding it to the rest of the circuit), then the error disappears but at the cost of an extra delay cycle that should not be necessary (since the RAM output is already registered). 

 

I have tried a number of random changes, with an increasing level of desperation, and can't figure out what's wrong. 

 

I'm using Quartus-II 9.0 build 235. 

 

Any help will be appreciated. 

 

Thanks!
19 Replies
Altera_Forum
Honored Contributor II
3,889 Views

Is code_word_addr the output of some other set of logic? It may be complaining because it can't place the read address in a register internal to the RAM. The code you posted shows a registered output, rather than a registered read address. All the code templates that Altera provides ask for a registered read address.  

 

The read data is then asynchronously read out from the registered read address with optional output registers.
Altera_Forum
Honored Contributor II
3,889 Views

Just to show you what I mean, here are two pieces of code, one that should infer a RAM and another that might not. They are similar but subtly different. 

 

    signal read_addr : integer;
    signal mem : some_ram_t;

    --------------------------------
    --Should infer a ram
    --------------------------------
    process(clk)
    begin
        if rising_edge(clk) then
            read_addr <= some_logic;
        end if;
    end process;

    output <= mem(read_addr);

    -------------------------------------
    --Async read - cant infer a ram
    -------------------------------------
    read_addr <= some_logic;

    process(clk)
    begin
        if rising_edge(clk) then
            output <= mem(read_addr);
        end if;
    end process;
Altera_Forum
Honored Contributor II
3,889 Views

Thanks for your reply! 

 

I have tried registering the address (and then reading from the array outside of the synchronous process) and it makes no difference. Even if I do this it still fails: 

 

    ...
    code_line_memory:
    process(clk)
    begin
        if clk'event and clk='1' then
            if ps=code_refill_bram_1 or ps=code_refill_sram8_3 or ps=code_refill_sram_1 then
                code_line_table(conv_integer(code_word_addr_wr)) <= code_refill_data;
            end if;
            -- register address
            code_word_addr <= code_rd_addr(11 downto 2);
            code_cache_rd <= code_line_table(conv_integer(code_word_addr));
        end if;
    end process code_line_memory;
    ...

I have tried registering the address in a separate process, just in case I was confusing the synthesizer. Still didn't work. 

 

The only thing that seems to work so far is registering code_cache_rd (that is, registering it again, by loading it into a second register). But I can't accommodate an extra delay cycle in my design. 

 

The templates I talk about are those included in Quartus-2 and those shown in the manual (chap. 6 'recommended hdl coding styles'). Both are analogous to the code snippet I posted, i.e. unregistered address and registered output (though the actual hardware implementation may be different, I realize). I mean, my code is almost verbatim from the Quartus-2 manual template: 

 

    ...
    PROCESS (clock)
    BEGIN
        IF (clock'event AND clock = '1') THEN
            IF (we = '1') THEN
                ram_block(write_address) <= data;
            END IF;
            q <= ram_block(read_address);
            -- VHDL semantics imply that q doesn't get data
            -- in this clock cycle
        END IF;
        ...
    END PROCESS;

Note that I don't care about the read-during-write behavior in this design so reading old data is fine.
Altera_Forum
Honored Contributor II
3,889 Views

Have you tried it in a newer version of Quartus? 

 

Otherwise I can only think that there is something wrong elsewhere in the code. Have you tried raising a MySupport request?
Altera_Forum
Honored Contributor II
3,889 Views

I too guess there is something wrong with the code but I'm now reduced to changing it at random hoping to get some clue... 

 

I have tried the same code with ISE and the synthesis looks OK (the code is vendor-agnostic). I will search the warnings and will try using Synplify, to see if I can find what's wrong (though that'll have to wait until Monday). 

 

It will take me days to try another version of quartus; it's a 3GB download and a full reinstall. I hope to catch the error before the download is finished :) Note there are no service packs for the web edition of Quartus that I'm using, it's full download or nothing. 

 

As for the MySupport, this is a private project not related to any company (i.e. I can't use a company email as required by MySupport) so I can't open a case. The web application won't even let me search existing cases (?). I will search the forum more carefully... 

 

Thanks for your help. Any hints or ideas are welcome! 

 

I'll keep track of my progress in this thread.
Altera_Forum
Honored Contributor II
3,889 Views

Try putting your inferred RAM into a partition.

Altera_Forum
Honored Contributor II
3,889 Views

I am trying to install Quartus 2 10.1 on Kubuntu... a nightmare in its own right. I wonder if the guys at Altera have ever heard of apt-get or Ubuntu... When and if I get the install done I will try my code on it. 

 

As for the partition, I first have to figure out what a 'partition' is in this context :) thanks for the tip. 

The manual warns you that RAM inference can fail if the RAM output goes straight to a hierarchy boundary (which it does). But I have used this exact same construct in the past with no trouble, and as I said, when you compile the cache module separately (i.e. not connected to a parent module) it works fine.  

I will try moving the RAM to a separate module and see if it helps...
Altera_Forum
Honored Contributor II
3,889 Views

I think I've run into this before. 

 

Are code_word_addr and code_word_addr_wr registered or combinational? Try registering them.
Altera_Forum
Honored Contributor II
3,889 Views

Hello, 

 

Signals code_word_addr and code_word_addr_wr are slices of an input signal that itself is the output of an adder. They are combinational. 

 

I have tried registering the read address (code_word_addr) with no result. I haven't tried registering the write address, though. I will try.  

 

Anyway, registering the addresses may be useful for diagnosis but I can't accommodate the extra latency cycle in the design.  

Neither can I 'move' the module boundaries so that the RAM output does not go straight to the module output (the manual suggested that might be the problem).  

 

You see, this BRAM is the line store of a direct-mapped code cache for a MIPS core. The core already works with a 'stub' 1-word cache (which uses no RAM). The trouble started when I began implementing the real cache. The cache is meant to be wired directly to the CPU ports. In this particular case, the output of the code line store goes to the CPU module, where it is registered. BUT, before registering, a slice of the unregistered CPU input (i.e. the cache RAM output) is used as the address input for the register bank (which itself is another RAM block). So I just have to use the RAM block output. I hope that made sense... 

 

The point is I have used this construct several times in this very design (register bank, startup ROM, etc.) and in many other designs with this and previous versions of Quartus and this is the first time I see this problem... 

 

I have to find a way to put a block ram there. I will try to put the block in a separate module (which will complicate the code unnecessarily, but oh well...). Eventually I may have to surrender and just instantiate a Cyclone-2 block (so far the code is vendor agnostic and works with no changes on Xilinx tools).  

 

 

 

On an unrelated note, I have been unable to install and run Quartus on my Kubuntu machine. Installation errors, missing instructions, program crashes, etc. etc. etc. In my opinion, the Linux version of Quartus-2 is so desperately buggy that Altera should just honestly admit that Linux is not supported -- supporting a small percentage of Linux users on an essentially random basis is not enough. Compare this to Eclipse, Kdevelop, CodeBlocks, OpenOffice, etc.  

 

 

 

Thank you for your tips!
Altera_Forum
Honored Contributor II
3,889 Views

It's quite stable on supported distros ;) 

 

Even on some unsupported ones...
Altera_Forum
Honored Contributor II
3,889 Views

Regarding the RAM inference problem: as we haven't seen the real code, we can only guess. The requirements for RAM inference are basically simple but, depending on the coding style, it may be harder to see whether they are met. In case of doubt, I would use the Altera RAM templates as a starting point. 

 

I can't exclude, of course, that there may be a Quartus problem that prevents RAM inference in cases that are expected to work. But I'm not aware of any yet.
Altera_Forum
Honored Contributor II
3,889 Views

I have tried moving the RAM block to a separate module and I get the same result. Synthesizing the module standalone gives the expected result, synthesizing it in my design gives the same error as before. 

 

The separate module in its entirety is this: 

 

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_arith.all;
    use ieee.std_logic_unsigned.all;
    use work.mips_pkg.all;

    entity bram_2p is
        generic (
            BRAM_SIZE : integer := 1024;
            BRAM_WIDTH : integer := 16
        );
        port(
            clk : in std_logic;
            reset : in std_logic;
            rd_addr : in std_logic_vector(9 downto 0);--in std_logic_vector(log2(BRAM_SIZE)-1 downto 0);
            rd_data : out std_logic_vector(BRAM_WIDTH-1 downto 0);
            wr_addr : in std_logic_vector(9 downto 0);
            wr_data : in std_logic_vector(BRAM_WIDTH-1 downto 0);
            we : in std_logic
        );
    end entity bram_2p;

    architecture inferred of bram_2p is

    type t_ram is array(BRAM_SIZE-1 downto 0) of std_logic_vector(BRAM_WIDTH-1 downto 0);
    signal ram : t_ram;

    begin

    memory:
    process(clk)
    begin
        if clk'event and clk='1' then
            if we='1' then
                ram(conv_integer(wr_addr)) <= wr_data;
            end if;
            rd_data <= ram(conv_integer(rd_addr));
        end if;
    end process memory;

    end architecture inferred;

I have tried it with and without generics; the code is a bit dirty as the result of a lot of aimless changes. 

 

The code is nearly identical to all the Altera templates I have been able to find, and identical to the code that I have been using with no problem in other projects. 

 

There definitely is something in my code that is confusing the synthesizer but I can't find what it is. 

 

 

I have a question for any Altera engineers that may be reading this. Is there an app note or document that explains exactly how inference works, so that I can guess under which circumstances it will fail despite using the Altera template? 

I have tried enabling all the error message levels that I have been able to find in the IDE but the only error message is this: 

 

    Info: Found 1 instances of uninferred RAM logic
    Info: RAM logic "mips_cache:cache|bram_2p:code_line_memory|ram" is uninferred due to asynchronous read logic

 

There is no error message involving any of the ram interface signals. 

 

At this point I think we can call this behavior a bug, in fairness. All I want is to find some way to work around it that does not involve instantiating an architecture-dependent module.
Altera_Forum
Honored Contributor II
3,889 Views

I have discovered a fix: 

 

If I pass the RAM output through a layer of logic (I have tried an AND gate and a MUX, both implemented as a single LUT I think) then the synthesis works as expected. I guess I should have tried this a lot earlier... 
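
For anyone hitting the same problem, the workaround might look roughly like the sketch below. It is only an illustration: the entity name ram_out_gate and the rd_valid signal are hypothetical (the gate actually used is not shown in this thread), and rd_valid is assumed to be a real, non-constant signal, because a gate driven by a constant would simply be optimized away.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: passes the RAM read data through one layer of
-- combinational logic before it reaches the next block.
entity ram_out_gate is
    generic ( WIDTH : integer := 16 );
    port(
        ram_q    : in  std_logic_vector(WIDTH-1 downto 0); -- registered RAM output
        rd_valid : in  std_logic;                          -- assumed enable signal
        gated_q  : out std_logic_vector(WIDTH-1 downto 0)  -- goes to the CPU
    );
end entity ram_out_gate;

architecture rtl of ram_out_gate is
begin
    -- Equivalent to ANDing every data bit with rd_valid
    gated_q <= ram_q when rd_valid = '1' else (others => '0');
end architecture rtl;
```

The gate adds a LUT delay to the path but no clock cycle, which is why it is acceptable where an extra register is not.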

 

This is a fix I can happily live with so I guess the problem is solved. Yet, I would like to know about that documentation I mentioned in my previous post; knowing how the synthesis works would help me prevent this kind of trouble in the future. 

 

Thank you very much for your help. If you think of some other thing that you'd like me to try, tell me and I will. Or if any of you Altera guys reading this wants the project code to test this issue I can give it to you (with some mild embarrassment about its quality).
Altera_Forum
Honored Contributor II
3,889 Views

Did you notice that the Quartus template uses a different topology than your design? It registers the read address rather than the output. This structure corresponds to the actual hardware behaviour. Normally, Quartus is able to convert the output register you have defined in your code into an input register, but apparently this doesn't work in your full design. I guess that the compiler pushes the register into the design part driven by the RAM before it starts to process the RAM part.
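
As a sketch of the registered-read-address topology, here is the bram_2p module from earlier in the thread rearranged that way. The entity name bram_2p_addr_reg, the ADDR_WIDTH/DATA_WIDTH generics and the switch to ieee.numeric_std are my own choices for illustration, not taken from the Altera template, so treat it as an assumption-laden example rather than the official code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bram_2p_addr_reg is
    generic (
        ADDR_WIDTH : integer := 10;
        DATA_WIDTH : integer := 16
    );
    port(
        clk     : in  std_logic;
        rd_addr : in  std_logic_vector(ADDR_WIDTH-1 downto 0);
        rd_data : out std_logic_vector(DATA_WIDTH-1 downto 0);
        wr_addr : in  std_logic_vector(ADDR_WIDTH-1 downto 0);
        wr_data : in  std_logic_vector(DATA_WIDTH-1 downto 0);
        we      : in  std_logic
    );
end entity bram_2p_addr_reg;

architecture inferred of bram_2p_addr_reg is
    type t_ram is array(0 to 2**ADDR_WIDTH-1)
        of std_logic_vector(DATA_WIDTH-1 downto 0);
    signal ram         : t_ram;
    signal rd_addr_reg : std_logic_vector(ADDR_WIDTH-1 downto 0);
begin
    memory: process(clk)
    begin
        if rising_edge(clk) then
            if we = '1' then
                ram(to_integer(unsigned(wr_addr))) <= wr_data;
            end if;
            -- Register the read address instead of the read data
            rd_addr_reg <= rd_addr;
        end if;
    end process memory;

    -- Combinational read from the registered address
    rd_data <= ram(to_integer(unsigned(rd_addr_reg)));
end architecture inferred;
```

The read latency is still one clock cycle. The difference shows up in read-during-write: with this form a read of the address written in the previous cycle returns the new data, whereas the registered-output form returns the old data.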

Altera_Forum
Honored Contributor II
3,889 Views

Hello, 

 

As you suggested, I have just tried registering the read address instead of the output, but it does not help.  

I had seen that construct you mention suggested in the manual (example 10-4 in the handbook, section on recommended hdl coding styles), so this is one of the things that I tried previously. Yet, the template in Quartus-2 'insert template' menu (and in some other Altera manuals that I don't have on hand right now) matches the construct I have used. I will for the sake of consistency refactor the RAMs to use the recommended construct (it is compatible with ISE too) but as I said it does not help in this case. 

 

I think you are right in that some of the logic connected to the RAM output somehow gets mixed in with the RAM construct and upsets the 'pattern' recognized by the synthesis tool. Putting the extra layer of logic I mentioned in my previous post prevents that 'mixing'. 

If there is a synthesis option to prevent optimization across hierarchy boundaries, it might help. I haven't found it in the IDE or the manual (yet). 

 

Thank you very much for your help!
Altera_Forum
Honored Contributor II
3,889 Views

Hello again, 

 

In my original code (the code that failed) the uninferred RAM data output was connected straight to the address input of a different RAM block. Can this be the source of the trouble? The problem disappeared when I inserted a layer of logic between data output and address input. 

 

This is the only difference I can think of between this and many other designs in which I used this inferred RAM construct with Quartus-2. I don't remember seeing this case mentioned in the manual but I'll have to check again more carefully... 

 

I will conduct an experiment and post the result ASAP.
Altera_Forum
Honored Contributor II
3,889 Views

 

--- Quote Start ---  

In my original code (the code that failed) the uninferred RAM data output was connected straight to the address input of a different RAM block. Can this be the source of the trouble? 

--- Quote End ---  

 

Yes, that sounds plausible. I guess it's a somewhat unusual connection, so it most likely hasn't been considered by Altera. 

 

The reported behaviour of not inferring a RAM although the conditions are apparently met seems like a bug to me. You should also report it to Altera support.
Altera_Forum
Honored Contributor II
3,889 Views

In Quartus, from the Assignments menu, select "Design Partitions Window". Then from the Project Navigator window find all instances of your RAM and either right click to create a partition around it or click and drag it to the Design Partitions Window. If the code infers into memory when compiled at the top level, then I am pretty sure this will do the trick.

Altera_Forum
Honored Contributor II
3,889 Views

Yes, I have seen this exact circumstance - a register for the output of one RAM is the address register for another. QII synthesis makes an early decision of which register to pull in to each RAM to make it synchronous. Unfortunately, this is done independently for each RAM, so there's nothing to prevent the tool from choosing the same register for each RAM which then results in one RAM or the other not being inferred correctly. Does the second RAM have output registers in the RTL? I believe QII prefers to use the output register if available to make the RAM synchronous, so if you add the output register, it may stop the tool from trying to choose the read address register (which is the same as the output register already being chosen by the first RAM).
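
To illustrate the suggestion, here is a sketch of a second RAM (the register bank in this thread) that has its own output register in the RTL. All names and sizes (reg_bank_sketch, 32 entries of 32 bits) are assumptions made for the example, since the actual register bank code was not posted, and I can't promise this changes which register the tool picks in every case.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical register bank whose read address is driven by the
-- data output of another inferred RAM.
entity reg_bank_sketch is
    port(
        clk     : in  std_logic;
        rd_addr : in  std_logic_vector(4 downto 0);  -- from the first RAM's output
        rd_data : out std_logic_vector(31 downto 0);
        wr_addr : in  std_logic_vector(4 downto 0);
        wr_data : in  std_logic_vector(31 downto 0);
        we      : in  std_logic
    );
end entity reg_bank_sketch;

architecture inferred of reg_bank_sketch is
    type t_bank is array(0 to 31) of std_logic_vector(31 downto 0);
    signal bank : t_bank;
begin
    bank_memory: process(clk)
    begin
        if rising_edge(clk) then
            if we = '1' then
                bank(to_integer(unsigned(wr_addr))) <= wr_data;
            end if;
            -- Output register described explicitly in the RTL, so the
            -- tool has a register of its own to absorb into this RAM.
            rd_data <= bank(to_integer(unsigned(rd_addr)));
        end if;
    end process bank_memory;
end architecture inferred;
```

If the second RAM was previously read combinationally, adding this register costs a cycle of latency, so whether it fits depends on the pipeline around it.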
