Altera_Forum
Honored Contributor I
1,215 Views

Same VHDL block RAM inferring file, different results

Hi, 

 

I have the following VHDL file to infer a true dual port, single clock block RAM. I believe that it is according to Altera's guidelines. 

I have Quartus 16.02 Build 222 07/20/2016. 

When I use this file in isolation, I get a block RAM (what I want). When I use it in my design, it generates heaps of ALMs instead and screws up timing as well. 

I tried turning it into a RAM with two clocks and two enables, tying both clock pins to the same clock and both enable pins to the same enable, but the behavior is the same. 

I also tried making the number of locations a power of 2, with the same crappy result. 

I have other similar files causing identical problems. 

Any clues ? 

 

Thank you very much in advance. 

 

library IEEE;
use IEEE.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity ram_160x32_dp is
    port(
        -- Inputs
        clk          : in  std_logic;                      -- Clock
        en           : in  std_logic;                      -- Sync enable
        we0, we1     : in  std_logic;                      -- Write enables
        addr0, addr1 : in  std_logic_vector(7 downto 0);   -- Addresses
        din0, din1   : in  std_logic_vector(31 downto 0);  -- Data input
        dout0, dout1 : out std_logic_vector(31 downto 0)   -- Data output
    );
end ram_160x32_dp;

architecture beh of ram_160x32_dp is
    type mram is array (159 downto 0) of std_logic_vector(31 downto 0);
    signal mem : mram;  -- Memory array 160x32
begin

    -- Dual port RAM model
    process(clk)
        variable n : integer;
    begin
        if rising_edge(clk) then
            -- Memory is written and read synchronously
            -- when enabled
            if en = '1' then
                n := conv_integer(addr0);
                -- Write port 0
                if we0 = '1' then
                    mem(n) <= din0;
                    dout0  <= din0;
                else
                    -- Read port 0
                    dout0 <= mem(n);
                end if;
            end if;
        end if;
    end process;

    process(clk)
        variable n : integer;
    begin
        if rising_edge(clk) then
            -- Memory is written and read synchronously
            -- when enabled
            if en = '1' then
                n := conv_integer(addr1);
                -- Write port 1
                if we1 = '1' then
                    mem(n) <= din1;
                    dout1  <= din1;
                else
                    -- Read port 1
                    dout1 <= mem(n);
                end if;
            end if;
        end if;
    end process;

end beh;
10 Replies
Altera_Forum
Honored Contributor I
52 Views

It is probably because the write enable blocks the read side of the RAM. Stop making doutN depend on the write enable. 

Also, RAMs do not have a global enable, so remove it and apply the enable outside the RAM. 

 

Edit: actually, it's the passthrough you have designed. There is no passthrough on the RAMs. If you want write-before-read behaviour, then use a shared variable for the RAM rather than a signal.
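
To make the suggestion concrete, here is a minimal sketch of the shared-variable style, assuming the same port widths as the posted entity. The entity name `ram_256x32_tdp`, the 256-word depth and the removal of `en` are illustrative choices, not something from the original design, and whether a given Quartus version maps this to a block RAM still has to be checked in the fitter report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; depth rounded up to 2**8 to match the 8-bit address.
entity ram_256x32_tdp is
    port (
        clk          : in  std_logic;
        we0, we1     : in  std_logic;
        addr0, addr1 : in  std_logic_vector(7 downto 0);
        din0, din1   : in  std_logic_vector(31 downto 0);
        dout0, dout1 : out std_logic_vector(31 downto 0)
    );
end entity ram_256x32_tdp;

architecture rtl of ram_256x32_tdp is
    type mram is array (0 to 255) of std_logic_vector(31 downto 0);
    -- Unprotected shared variable: legal in VHDL-93; VHDL-2002 and later
    -- expect a protected type, so this assumes the tool is in VHDL-93 mode
    -- or tolerates the older form.
    shared variable mem : mram;
begin

    port0 : process (clk)
    begin
        if rising_edge(clk) then
            if we0 = '1' then
                mem(to_integer(unsigned(addr0))) := din0;
            end if;
            -- The variable is already updated here, so a write returns the
            -- new data on the same port without an explicit bypass mux.
            dout0 <= mem(to_integer(unsigned(addr0)));
        end if;
    end process port0;

    port1 : process (clk)
    begin
        if rising_edge(clk) then
            if we1 = '1' then
                mem(to_integer(unsigned(addr1))) := din1;
            end if;
            dout1 <= mem(to_integer(unsigned(addr1)));
        end if;
    end process port1;

end architecture rtl;
```

The read is no longer inside the `else` branch of the write enable, and the enable is expected to gate `we0`/`we1` outside the RAM. Simultaneous writes from both ports to the same address are left undefined in this sketch.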
Altera_Forum
Honored Contributor I
52 Views

Thanks for the hint, I will try to act on your suggestions.

Altera_Forum
Honored Contributor I
52 Views

In "Recommended HDL Coding Styles" from Nov 2013, Example 13-18 (pg 13-27), the code is almost exactly the same as above, and it says that it "maps directly into Altera synchronous memory". 

It also contains an error: it declares the memory array as a shared variable instead of a signal, but assigns to it with the signal assignment operator '<='. That generates a compiler error and, more importantly, makes things even more confusing. 

I would like the RAM behavior that is the fastest and closest to Altera's silicon, without the need for additional logic.
Altera_Forum
Honored Contributor I
52 Views

This does appear to be a documentation issue: the assignment should use := 

 

Yes, I note your example is basically the same, other than your use of a signal and their use of a shared variable. What version of Quartus are you using, and have you tried a newer version? That is the handbook for Q13, which is 4 years old. 

If that still doesnt work then  

 

When you say "fastest", what exactly do you mean? This solution will give you the lowest latency but also the lowest Fmax. For the highest Fmax you need to register the output data as well.
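
As a rough illustration of the latency versus Fmax trade-off, the sketch below adds a second register on the read path. It is a simple dual-port RAM (one write port, one read port) for brevity rather than the true dual-port RAM from the original post, and the entity name, generics and port names are made up for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; depth is derived from the address width so the
-- two can never disagree.
entity ram_sdp_regout is
    generic (
        ADDR_BITS : natural := 8;
        WIDTH     : natural := 32
    );
    port (
        clk   : in  std_logic;
        we    : in  std_logic;
        waddr : in  natural range 0 to 2**ADDR_BITS - 1;
        raddr : in  natural range 0 to 2**ADDR_BITS - 1;
        din   : in  std_logic_vector(WIDTH - 1 downto 0);
        dout  : out std_logic_vector(WIDTH - 1 downto 0)
    );
end entity ram_sdp_regout;

architecture rtl of ram_sdp_regout is
    type mram is array (0 to 2**ADDR_BITS - 1)
        of std_logic_vector(WIDTH - 1 downto 0);
    signal mem   : mram;
    signal q_raw : std_logic_vector(WIDTH - 1 downto 0);
begin

    process (clk)
    begin
        if rising_edge(clk) then
            if we = '1' then
                mem(waddr) <= din;
            end if;
            -- Stage 1: synchronous read. Because mem is a signal, a read of
            -- the address being written returns the old data.
            q_raw <= mem(raddr);
            -- Stage 2: extra output register, giving two cycles of read
            -- latency in exchange for a shorter clock-to-output path.
            dout <= q_raw;
        end if;
    end process;

end architecture rtl;
```

Dropping the second stage (driving `dout` directly from the read) gives the one-cycle behaviour the original design expects. Whether the extra register is absorbed into the RAM block's own output register is a decision the synthesis tool makes.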
Altera_Forum
Honored Contributor I
52 Views

I use Quartus 16.02 Build 222 07/20/2016. 

The most annoying thing is that it produces different results in isolation and within a design. 

This is a very old design that worked in Cyclone III before. I'm trying to revive it in Cyclone V. I am almost certain that I did not need to read back the new data being written; otherwise I would have built the bypass myself. Also, the design logic does not expect the output of the RAM to be registered. 

 

By "fastest" I mean the one that is closest to Altera's actual block RAM hardware, without any additional fabric logic (i.e. for bypass).
Altera_Forum
Honored Contributor I
52 Views

I don't know if this is what you want to do, but to get the most optimal results in hardware without having to futz with code, you could implement the RAM using the block RAM IP core.

Altera_Forum
Honored Contributor I
52 Views

One possible issue is that 160 is not a power of 2 and does not match your address bit width. The VHDL template in Quartus 16.1 seems to require that and the tool may get confused otherwise. But you say it worked before. The template also uses a natural with the proper range (0 to 255 in your case) instead of an integer.  

 

Also, under "Settings => Compiler Settings => Advanced Settings", you might want to make sure "Allow Any RAM Size for Recognition" is ON.  

 

In the case of ROM, you have to make sure "Assignments -> Device ... -> Device and Pin Options ... -> Configuration Mode" is set to: 

 

"Single Uncompressed Image with Memory Initialization ..." 

 

I doubt if that would affect RAM though. 

 

As sstrell mentions, you could use the Megawizard. The problem with wizards is they produce unmaintainable code. If you ever change something, you have to generate a wizard in parallel. Unfortunately, the Microsoft "everything should be a wizard" philosophy has infected the entire software world.  

 

But there is a compromise solution that I tend to use. Generate the IP once using the wizard and then create a package with generics for the items of interest from the VHDL code generated by the wizard. For example, I have a dual-port RAM in my own code that looks like below. The entity is created by the wizard and I replace the hardwired numbers of the original with generics. It's not good for ROM, but for everything else it is MUCH easier than trying to deal with persnickety inference tools.  

 

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

package timeslot_table_comp is

    component timeslot_table is
        generic (
            ADDR_BITS_A : natural := 10;
            ADDR_BITS_B : natural := 10;
            WIDTH_A     : natural := 32;
            WIDTH_B     : natural := 32
        );
        PORT (
            address_a : IN  unsigned (ADDR_BITS_A - 1 DOWNTO 0);
            address_b : IN  unsigned (ADDR_BITS_B - 1 DOWNTO 0);
            clock     : IN  STD_LOGIC := '1';
            data_a    : IN  STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
            data_b    : IN  STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);
            rden_a    : IN  STD_LOGIC := '1';
            rden_b    : IN  STD_LOGIC := '1';
            wren_a    : IN  STD_LOGIC := '0';
            wren_b    : IN  STD_LOGIC := '0';
            q_a       : OUT STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
            q_b       : OUT unsigned (WIDTH_B - 1 DOWNTO 0)
        );
    end component;

end timeslot_table_comp;


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use WORK.TIMESLOT_TABLE_COMP.ALL;

LIBRARY altera_mf;
USE altera_mf.altera_mf_components.all;

ENTITY timeslot_table IS
    generic (
        ADDR_BITS_A : natural := 10;
        ADDR_BITS_B : natural := 10;
        WIDTH_A     : natural := 32;
        WIDTH_B     : natural := 32
    );
    PORT (
        address_a : IN  unsigned (ADDR_BITS_A - 1 DOWNTO 0);
        address_b : IN  unsigned (ADDR_BITS_B - 1 DOWNTO 0);
        clock     : IN  STD_LOGIC := '1';
        data_a    : IN  STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
        data_b    : IN  STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);
        rden_a    : IN  STD_LOGIC := '1';
        rden_b    : IN  STD_LOGIC := '1';
        wren_a    : IN  STD_LOGIC := '0';
        wren_b    : IN  STD_LOGIC := '0';
        q_a       : OUT STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
        q_b       : OUT unsigned (WIDTH_B - 1 DOWNTO 0)
    );
END timeslot_table;

ARCHITECTURE SYN OF timeslot_table IS

    -- Sized from the generics (the wizard output hardwired 31 DOWNTO 0),
    -- so the q_a/q_b port maps still match when the widths are not 32.
    SIGNAL sub_wire0 : STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
    SIGNAL sub_wire1 : STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);

BEGIN

    q_a <= sub_wire0;
    q_b <= unsigned(sub_wire1);

    altsyncram_component : altsyncram
    GENERIC MAP (
        address_reg_b                      => "CLOCK0",
        clock_enable_input_a               => "BYPASS",
        clock_enable_input_b               => "BYPASS",
        clock_enable_output_a              => "BYPASS",
        clock_enable_output_b              => "BYPASS",
        indata_reg_b                       => "CLOCK0",
        intended_device_family             => "MAX 10",
        lpm_type                           => "altsyncram",
        numwords_a                         => 2**ADDR_BITS_A,
        numwords_b                         => 2**ADDR_BITS_B,
        operation_mode                     => "BIDIR_DUAL_PORT",
        outdata_aclr_a                     => "NONE",
        outdata_aclr_b                     => "NONE",
        outdata_reg_a                      => "UNREGISTERED",
        outdata_reg_b                      => "UNREGISTERED",
        power_up_uninitialized             => "FALSE",
        read_during_write_mode_mixed_ports => "DONT_CARE",
        read_during_write_mode_port_a      => "NEW_DATA_WITH_NBE_READ",
        read_during_write_mode_port_b      => "NEW_DATA_WITH_NBE_READ",
        widthad_a                          => ADDR_BITS_A,
        widthad_b                          => ADDR_BITS_B,
        width_a                            => WIDTH_A,
        width_b                            => WIDTH_B,
        width_byteena_a                    => 1,
        width_byteena_b                    => 1,
        wrcontrol_wraddress_reg_b          => "CLOCK0"
    )
    PORT MAP (
        address_a => std_logic_vector(address_a),
        address_b => std_logic_vector(address_b),
        clock0    => clock,
        data_a    => data_a,
        data_b    => data_b,
        rden_a    => rden_a,
        rden_b    => rden_b,
        wren_a    => wren_a,
        wren_b    => wren_b,
        q_a       => sub_wire0,
        q_b       => sub_wire1
    );

END SYN;
Altera_Forum
Honored Contributor I
52 Views

 

--- Quote Start ---  

 

But there is a compromise solution that I tend to use. Generate the IP once using the wizard and then create a package with generics for the items of interest from the VHDL code generated by the wizard. For example, I have a dual-port RAM in my own code that looks like below. The entity is created by the wizard and I replace the hardwired numbers of the original with generics. It's not good for ROM, but for everything else it is MUCH easier than trying to deal with persnickety inference tools.  

 

 

--- Quote End ---  

 

 

This is a bit of a messy way to do things. Why not just instantiate the altsyncram directly in your code? You can get all the generics and ports from here: 

https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/ug/ug_ram.pdf
Altera_Forum
Honored Contributor I
52 Views

Tricky, 

 

Beauty is in the eye of the beholder, I suppose. In some sense that is what I'm doing. The defaults for, say, altsyncram from altera_mf_components.vhd are: 

 

component altsyncram
    generic (
        address_aclr_a                     : string  := "UNUSED";
        address_aclr_b                     : string  := "NONE";
        address_reg_b                      : string  := "CLOCK1";
        byte_size                          : natural := 8;
        byteena_aclr_a                     : string  := "UNUSED";
        byteena_aclr_b                     : string  := "NONE";
        byteena_reg_b                      : string  := "CLOCK1";
        clock_enable_core_a                : string  := "USE_INPUT_CLKEN";
        clock_enable_core_b                : string  := "USE_INPUT_CLKEN";
        clock_enable_input_a               : string  := "NORMAL";
        clock_enable_input_b               : string  := "NORMAL";
        clock_enable_output_a              : string  := "NORMAL";
        clock_enable_output_b              : string  := "NORMAL";
        intended_device_family             : string  := "unused";
        ecc_pipeline_stage_enabled         : string  := "FALSE";
        enable_ecc                         : string  := "FALSE";
        implement_in_les                   : string  := "OFF";
        indata_aclr_a                      : string  := "UNUSED";
        indata_aclr_b                      : string  := "NONE";
        indata_reg_b                       : string  := "CLOCK1";
        init_file                          : string  := "UNUSED";
        init_file_layout                   : string  := "PORT_A";
        maximum_depth                      : natural := 0;
        numwords_a                         : natural := 0;
        numwords_b                         : natural := 0;
        operation_mode                     : string  := "BIDIR_DUAL_PORT";
        outdata_aclr_a                     : string  := "NONE";
        outdata_aclr_b                     : string  := "NONE";
        outdata_reg_a                      : string  := "UNREGISTERED";
        outdata_reg_b                      : string  := "UNREGISTERED";
        power_up_uninitialized             : string  := "FALSE";
        ram_block_type                     : string  := "AUTO";
        rdcontrol_aclr_b                   : string  := "NONE";
        rdcontrol_reg_b                    : string  := "CLOCK1";
        read_during_write_mode_mixed_ports : string  := "DONT_CARE";
        read_during_write_mode_port_a      : string  := "NEW_DATA_NO_NBE_READ";
        read_during_write_mode_port_b      : string  := "NEW_DATA_NO_NBE_READ";
        stratixiv_m144k_allow_dual_clocks  : string  := "ON";
        width_a                            : natural;
        width_b                            : natural := 1;
        width_byteena_a                    : natural := 1;
        width_byteena_b                    : natural := 1;
        width_eccstatus                    : natural := 3;
        widthad_a                          : natural;
        widthad_b                          : natural := 1;
        wrcontrol_aclr_a                   : string  := "UNUSED";
        wrcontrol_aclr_b                   : string  := "NONE";
        wrcontrol_wraddress_reg_b          : string  := "CLOCK1";
        lpm_hint                           : string  := "UNUSED";
        lpm_type                           : string  := "altsyncram"
    );
    port (
        aclr0          : in  std_logic := '0';
        aclr1          : in  std_logic := '0';
        address_a      : in  std_logic_vector(widthad_a-1 downto 0);
        address_b      : in  std_logic_vector(widthad_b-1 downto 0) := (others => '1');
        addressstall_a : in  std_logic := '0';
        addressstall_b : in  std_logic := '0';
        byteena_a      : in  std_logic_vector(width_byteena_a-1 downto 0) := (others => '1');
        byteena_b      : in  std_logic_vector(width_byteena_b-1 downto 0) := (others => '1');
        clock0         : in  std_logic := '1';
        clock1         : in  std_logic := '1';
        clocken0       : in  std_logic := '1';
        clocken1       : in  std_logic := '1';
        clocken2       : in  std_logic := '1';
        clocken3       : in  std_logic := '1';
        data_a         : in  std_logic_vector(width_a-1 downto 0) := (others => '1');
        data_b         : in  std_logic_vector(width_b-1 downto 0) := (others => '1');
        eccstatus      : out std_logic_vector(width_eccstatus-1 downto 0);
        q_a            : out std_logic_vector(width_a-1 downto 0);
        q_b            : out std_logic_vector(width_b-1 downto 0);
        rden_a         : in  std_logic := '1';
        rden_b         : in  std_logic := '1';
        wren_a         : in  std_logic := '0';
        wren_b         : in  std_logic := '0'
    );
end component;

 

The code generated by the Wizard sets generics that differ from the default plus a few others. In my case: 

 

altsyncram_component : altsyncram
GENERIC MAP (
    address_reg_b                      => "CLOCK0",
    clock_enable_input_a               => "BYPASS",
    clock_enable_input_b               => "BYPASS",
    clock_enable_output_a              => "BYPASS",
    clock_enable_output_b              => "BYPASS",
    indata_reg_b                       => "CLOCK0",
    intended_device_family             => "MAX 10",
    lpm_type                           => "altsyncram",
    numwords_a                         => 512,
    numwords_b                         => 512,
    operation_mode                     => "BIDIR_DUAL_PORT",
    outdata_aclr_a                     => "NONE",
    outdata_aclr_b                     => "NONE",
    outdata_reg_a                      => "UNREGISTERED",
    outdata_reg_b                      => "UNREGISTERED",
    power_up_uninitialized             => "FALSE",
    read_during_write_mode_mixed_ports => "DONT_CARE",
    read_during_write_mode_port_a      => "NEW_DATA_WITH_NBE_READ",
    read_during_write_mode_port_b      => "NEW_DATA_WITH_NBE_READ",
    widthad_a                          => 9,
    widthad_b                          => 9,
    width_a                            => 32,
    width_b                            => 32,
    width_byteena_a                    => 1,
    width_byteena_b                    => 1,
    wrcontrol_wraddress_reg_b          => "CLOCK0"
)
PORT MAP (
    address_a => std_logic_vector(address_a),
    address_b => std_logic_vector(address_b),
    clock0    => clock,
    data_a    => data_a,
    data_b    => data_b,
    rden_a    => rden_a,
    rden_b    => rden_b,
    wren_a    => wren_a,
    wren_b    => wren_b,
    q_a       => sub_wire0,
    q_b       => sub_wire1
);

 

I would find directly instantiating either one in my code (especially multiple times) to be quite ugly compared to: 

 

timeslot_table1 : timeslot_table
generic map (
    ADDR_BITS_A => TS_ADDR_BITS,
    ADDR_BITS_B => TS_ADDR_BITS,
    WIDTH_A     => 32,
    WIDTH_B     => 32
)
PORT MAP (
    address_a => indexa,
    address_b => addrb,
    clock     => CLK,
    data_a    => dina,
    data_b    => (others => '0'),
    rden_a    => RE_a,
    rden_b    => RE_b,
    wren_a    => WE_a,
    wren_b    => '0',
    q_a       => douta,
    q_b       => doutb
);

 

It also lets me map ports to my preferred types (e.g. addresses are unsigned instead of std_logic_vector in this case).  

 

Your point is well taken though. To each his own.
Altera_Forum
Honored Contributor I
52 Views

Thank you for your suggestions: I have reached a similar conclusion, which is to instantiate altsyncram directly. A bit painful, but at least I don't have to guess what an ever-changing synthesis tool is going to do.
