Hi,
I have the following VHDL file to infer a true dual-port, single-clock block RAM. I believe it follows Altera's guidelines. I am using Quartus 16.02 Build 222 07/20/2016. When I compile this file in isolation, I get a block RAM (which is what I want). When I use it in my design, it generates heaps of ALMs instead and screws up timing as well. I tried turning it into a two-clock, two-enable RAM, tying both clock pins to the same clock and both enable pins to the same enable, but the behavior was the same. I also tried making the number of locations a power of 2, with the same poor result. I have other similar files causing identical problems. Any clues? Thank you very much in advance.
library IEEE;
use IEEE.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity ram_160x32_dp is port(
    -- Inputs
    clk : in std_logic; -- Clock
    en : in std_logic; -- Sync enable
    we0, we1 : in std_logic; -- Write enables
    addr0, addr1 : in std_logic_vector(7 downto 0); -- Addresses
    din0, din1 : in std_logic_vector(31 downto 0); -- Data input
    dout0, dout1 : out std_logic_vector(31 downto 0) -- Data output
);
end ram_160x32_dp;
architecture beh of ram_160x32_dp is
    type mram is array (159 downto 0) of std_logic_vector(31 downto 0);
    signal mem : mram; -- Memory array 160x32
begin
    --Dual port RAM model
    process(clk)
        variable n : integer;
    begin
        if rising_edge(clk) then
            -- Memory is written and read synchronously
            -- when enabled
            if en = '1' then
                n:=conv_integer(addr0);
                -- Write port 0
                if we0 = '1' then
                    mem(n) <= din0;
                    dout0 <= din0;
                else
                    -- Read port 0
                    dout0 <= mem(n);
                end if;
            end if;
        end if;
    end process;
    process(clk)
        variable n : integer;
    begin
        if rising_edge(clk) then
            -- Memory is written and read synchronously
            -- when enabled
            if en = '1' then
                n:=conv_integer(addr1);
                -- Write port 1
                if we1 = '1' then
                    mem(n) <= din1;
                    dout1 <= din1;
                else
                    -- Read port 1
                    dout1 <= mem(n);
                end if;
            end if;
        end if;
    end process;
end beh;
It is probably the fact that the write enable blocks the read side of the RAM. Stop making doutN depend on the write enable.
Also, RAMs do not have a global enable, so remove it and apply the enable outside the RAM.
Edit: actually, it's the pass-through you have designed. There is no pass-through on the RAMs. If you want write-before-read behaviour, use a shared variable for the RAM rather than a signal.
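To make the shared-variable suggestion concrete, here is a minimal sketch of a true dual-port, single-clock RAM written that way. The entity name `ram_256x32_tdp` and the choice of 256 words (so the depth matches the 8-bit address) are assumptions for illustration, and the global enable is left out as suggested above. Whether a given Quartus version maps this to block RAM is not guaranteed; treat it as a starting point to try, not a confirmed fix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: same ports as the original minus the global enable,
-- with the depth rounded up to 256 words to match the 8-bit address.
entity ram_256x32_tdp is
  port (
    clk          : in  std_logic;
    we0, we1     : in  std_logic;
    addr0, addr1 : in  std_logic_vector(7 downto 0);
    din0, din1   : in  std_logic_vector(31 downto 0);
    dout0, dout1 : out std_logic_vector(31 downto 0)
  );
end entity ram_256x32_tdp;

architecture rtl of ram_256x32_tdp is
  type mram is array (0 to 255) of std_logic_vector(31 downto 0);
  -- VHDL-93 style shared variable. VHDL-2002 and later formally require a
  -- protected type here, so the file may need to be compiled as VHDL-93.
  shared variable mem : mram;
begin
  port0 : process (clk)
  begin
    if rising_edge(clk) then
      if we0 = '1' then
        mem(to_integer(unsigned(addr0))) := din0;  -- variable assignment uses :=
      end if;
      -- The read does not depend on we0. Because the variable is updated
      -- immediately, a read in the same cycle as a write returns the new data.
      dout0 <= mem(to_integer(unsigned(addr0)));
    end if;
  end process port0;

  port1 : process (clk)
  begin
    if rising_edge(clk) then
      if we1 = '1' then
        mem(to_integer(unsigned(addr1))) := din1;
      end if;
      dout1 <= mem(to_integer(unsigned(addr1)));
    end if;
  end process port1;
end architecture rtl;
```

Writing to the same address from both ports in the same cycle is left undefined in this sketch, as it is in the original code.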
Thanks for the hint, I will try to act on your suggestions.
In "Recommended HDL Coding Styles" from Nov 2013, Example 13-18 on pg 13-27, the code is almost exactly the same as above, and it says that it "maps directly into Altera synchronous memory". It also contains an error: it declares the memory array as a shared variable instead of a signal, but assigns to it with the signal assignment operator '<='. That generates a compiler error and, more importantly, makes things even more confusing. I would like the RAM behavior that is the fastest and closest to Altera's silicon, without the need for additional logic.
This does appear to be a documentation issue - it should be :=
Yes, I note your example is basically the same, other than your use of a signal and their use of a shared variable. What version of Quartus are you using - have you tried using a newer version? That is the handbook for Q13, which is 4 years old. If that still doesn't work then ...
When you say "fastest", what exactly do you mean? This solution will give you the lowest latency but also the lowest fMAX. For the highest fMAX you need to register the output data as well.
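As an illustration of the latency versus fMAX trade-off, the sketch below adds a second register after the RAM read, so read data appears two clock edges after the address instead of one. The entity name `ram_256x32_regout`, the single write port, and the separate read port are assumptions chosen to keep the example short. Whether the tool absorbs the second register into the RAM block's own output register depends on the device and settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: one write port, one read port, registered output.
entity ram_256x32_regout is
  port (
    clk   : in  std_logic;
    we    : in  std_logic;
    waddr : in  std_logic_vector(7 downto 0);
    raddr : in  std_logic_vector(7 downto 0);
    din   : in  std_logic_vector(31 downto 0);
    dout  : out std_logic_vector(31 downto 0)
  );
end entity ram_256x32_regout;

architecture rtl of ram_256x32_regout is
  type mram is array (0 to 255) of std_logic_vector(31 downto 0);
  signal mem    : mram;
  signal rd_reg : std_logic_vector(31 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(unsigned(waddr))) <= din;
      end if;
      -- Stage 1: synchronous read. Because mem is a signal, a read of the
      -- address being written in the same cycle returns the old contents.
      rd_reg <= mem(to_integer(unsigned(raddr)));
      -- Stage 2: extra output register. This adds one cycle of latency and
      -- takes the memory access time out of the downstream logic path.
      dout <= rd_reg;
    end if;
  end process;
end architecture rtl;
```

A design that expects unregistered RAM outputs, as described later in this thread, would need its read timing adjusted by one cycle to use this form.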
I use Quartus 16.02 Build 222 07/20/2016.
The most annoying thing is that it produces different results in isolation and within a design. This is a very old design that previously worked in Cyclone III. I'm trying to revive it in Cyclone V. I am almost certain that I never needed to read the new data being written; otherwise I would have built the bypass myself. Also, the design logic does not expect the output of the RAM to be registered. By "fastest" I mean the one that is closest to Altera's actual block RAM hardware, without any additional fabric logic (i.e. for bypass).
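If the new data is never needed on a read-during-write, one thing that may be worth trying is the Quartus `ramstyle` synthesis attribute with the value `no_rw_check`, which tells the tool that read-during-write behaviour does not matter so that it need not add bypass logic in the fabric. The sketch below is an assumption-laden illustration, not a verified fix for this design: the entity name `ram_256x32_norw` and the one-write, one-read port arrangement are made up for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: one write port, one read port, no pass-through.
entity ram_256x32_norw is
  port (
    clk   : in  std_logic;
    we    : in  std_logic;
    waddr : in  std_logic_vector(7 downto 0);
    raddr : in  std_logic_vector(7 downto 0);
    din   : in  std_logic_vector(31 downto 0);
    dout  : out std_logic_vector(31 downto 0)
  );
end entity ram_256x32_norw;

architecture rtl of ram_256x32_norw is
  type mram is array (0 to 255) of std_logic_vector(31 downto 0);
  signal mem : mram;

  -- Quartus-specific synthesis attribute. Other tools ignore it.
  attribute ramstyle : string;
  attribute ramstyle of mem : signal is "no_rw_check";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(unsigned(waddr))) <= din;
      end if;
      -- The read never depends on we, so no pass-through path is described.
      dout <= mem(to_integer(unsigned(raddr)));
    end if;
  end process;
end architecture rtl;
```

With this attribute the hardware result of reading an address while it is being written should be treated as undefined, even though simulation of this code returns the old data.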
I don't know if this is what you want to do, but to get the most optimal results in hardware without having to futz with code, you could implement the RAM using the block RAM IP core.
One possible issue is that 160 is not a power of 2 and does not match your address bit width. The VHDL template in Quartus 16.1 seems to require that and the tool may get confused otherwise. But you say it worked before. The template also uses a natural with the proper range (0 to 255 in your case) instead of an integer.
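A minimal sketch of what that looks like, assuming a hypothetical single-port entity `ram_pow2`: the depth is derived from the address width, so the two always agree, and the address port is a range-constrained natural rather than an unconstrained integer. This is written from memory in the general style of the Quartus templates and is not a copy of one.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: depth is always 2**ADDR_WIDTH, so it cannot
-- disagree with the address range.
entity ram_pow2 is
  generic (
    DATA_WIDTH : natural := 32;
    ADDR_WIDTH : natural := 8
  );
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  natural range 0 to 2**ADDR_WIDTH - 1;
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity ram_pow2;

architecture rtl of ram_pow2 is
  type mram is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal mem : mram;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(addr) <= din;
      end if;
      -- Synchronous read; returns the old contents on a same-cycle write.
      dout <= mem(addr);
    end if;
  end process;
end architecture rtl;
```

A caller with a std_logic_vector address would connect it as `to_integer(unsigned(addr_slv))`, where `addr_slv` is a placeholder name, and only 160 of the 256 locations would be used.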
Also, under "Settings => Compiler Settings => Advanced Settings", you might want to make sure "Allow Any RAM Size for Recognition" is ON. In the case of ROM, you have to make sure "Assignments -> Device ... -> Device and Pin Options ... -> Configuration Mode" is set to "Single Uncompressed Image with Memory Initialization ...". I doubt that would affect RAM, though.
As sstrell mentions, you could use the MegaWizard. The problem with wizards is that they produce unmaintainable code. If you ever change something, you have to regenerate with the wizard in parallel. Unfortunately, the Microsoft "everything should be a wizard" philosophy has infected the entire software world. But there is a compromise solution that I tend to use. Generate the IP once using the wizard and then create a package with generics for the items of interest from the VHDL code generated by the wizard. For example, I have a dual-port RAM in my own code that looks like below. The entity is created by the wizard and I replace the hardwired numbers of the original with generics. It's not good for ROM, but for everything else it is MUCH easier than trying to deal with persnickety inference tools.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
package timeslot_table_comp is
    component timeslot_table is
        generic (
            ADDR_BITS_A : natural := 10;
            ADDR_BITS_B : natural := 10;
            WIDTH_A : natural := 32;
            WIDTH_B : natural := 32
        );
        PORT (
            address_a : IN unsigned (ADDR_BITS_A - 1 DOWNTO 0);
            address_b : IN unsigned (ADDR_BITS_B - 1 DOWNTO 0);
            clock : IN STD_LOGIC := '1';
            data_a : IN STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
            data_b : IN STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);
            rden_a : IN STD_LOGIC := '1';
            rden_b : IN STD_LOGIC := '1';
            wren_a : IN STD_LOGIC := '0';
            wren_b : IN STD_LOGIC := '0';
            q_a : OUT STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
            q_b : OUT unsigned (WIDTH_B - 1 DOWNTO 0)
        );
    end component;
end timeslot_table_comp;
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use WORK.TIMESLOT_TABLE_COMP.ALL;
LIBRARY altera_mf;
USE altera_mf.altera_mf_components.all;
ENTITY timeslot_table IS
    generic (
        ADDR_BITS_A : natural := 10;
        ADDR_BITS_B : natural := 10;
        WIDTH_A : natural := 32;
        WIDTH_B : natural := 32
    );
    PORT (
        address_a : IN unsigned (ADDR_BITS_A - 1 DOWNTO 0);
        address_b : IN unsigned (ADDR_BITS_B - 1 DOWNTO 0);
        clock : IN STD_LOGIC := '1';
        data_a : IN STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
        data_b : IN STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);
        rden_a : IN STD_LOGIC := '1';
        rden_b : IN STD_LOGIC := '1';
        wren_a : IN STD_LOGIC := '0';
        wren_b : IN STD_LOGIC := '0';
        q_a : OUT STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
        q_b : OUT unsigned (WIDTH_B - 1 DOWNTO 0)
    );
END timeslot_table;
ARCHITECTURE SYN OF timeslot_table IS
    -- Sized from the generics so that widths other than 32 still match
    -- the q_a/q_b ports of altsyncram
    SIGNAL sub_wire0 : STD_LOGIC_VECTOR (WIDTH_A - 1 DOWNTO 0);
    SIGNAL sub_wire1 : STD_LOGIC_VECTOR (WIDTH_B - 1 DOWNTO 0);
BEGIN
    q_a <= sub_wire0(WIDTH_A - 1 DOWNTO 0);
    q_b <= unsigned(sub_wire1(WIDTH_B - 1 DOWNTO 0));
    altsyncram_component : altsyncram
    GENERIC MAP (
        address_reg_b => "CLOCK0",
        clock_enable_input_a => "BYPASS",
        clock_enable_input_b => "BYPASS",
        clock_enable_output_a => "BYPASS",
        clock_enable_output_b => "BYPASS",
        indata_reg_b => "CLOCK0",
        intended_device_family => "MAX 10",
        lpm_type => "altsyncram",
        numwords_a => 2**ADDR_BITS_A,
        numwords_b => 2**ADDR_BITS_B,
        operation_mode => "BIDIR_DUAL_PORT",
        outdata_aclr_a => "NONE",
        outdata_aclr_b => "NONE",
        outdata_reg_a => "UNREGISTERED",
        outdata_reg_b => "UNREGISTERED",
        power_up_uninitialized => "FALSE",
        read_during_write_mode_mixed_ports => "DONT_CARE",
        read_during_write_mode_port_a => "NEW_DATA_WITH_NBE_READ",
        read_during_write_mode_port_b => "NEW_DATA_WITH_NBE_READ",
        widthad_a => ADDR_BITS_A,
        widthad_b => ADDR_BITS_B,
        width_a => WIDTH_A,
        width_b => WIDTH_B,
        width_byteena_a => 1,
        width_byteena_b => 1,
        wrcontrol_wraddress_reg_b => "CLOCK0"
    )
    PORT MAP (
        address_a => std_logic_vector(address_a),
        address_b => std_logic_vector(address_b),
        clock0 => clock,
        data_a => data_a,
        data_b => data_b,
        rden_a => rden_a,
        rden_b => rden_b,
        wren_a => wren_a,
        wren_b => wren_b,
        q_a => sub_wire0,
        q_b => sub_wire1
    );
END SYN;
--- Quote Start ---
But there is a compromise solution that I tend to use. Generate the IP once using the wizard and then create a package with generics for the items of interest from the VHDL code generated by the wizard. For example, I have a dual-port RAM in my own code that looks like below. The entity is created by the wizard and I replace the hardwired numbers of the original with generics. It's not good for ROM, but for everything else it is MUCH easier than trying to deal with persnickety inference tools.
--- Quote End ---
This is a bit of a messy way to do things. Why not just instantiate the altsyncram directly in your code? You can get all the generics and ports from here: https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/ug/ug_ram.pdf
Tricky,
Beauty is in the eye of the beholder, I suppose. In some sense that is what I'm doing. The defaults for, say, altsyncram from altera_mf_components.vhd are:
component altsyncram
    generic (
        address_aclr_a : string := "UNUSED";
        address_aclr_b : string := "NONE";
        address_reg_b : string := "CLOCK1";
        byte_size : natural := 8;
        byteena_aclr_a : string := "UNUSED";
        byteena_aclr_b : string := "NONE";
        byteena_reg_b : string := "CLOCK1";
        clock_enable_core_a : string := "USE_INPUT_CLKEN";
        clock_enable_core_b : string := "USE_INPUT_CLKEN";
        clock_enable_input_a : string := "NORMAL";
        clock_enable_input_b : string := "NORMAL";
        clock_enable_output_a : string := "NORMAL";
        clock_enable_output_b : string := "NORMAL";
        intended_device_family : string := "unused";
        ecc_pipeline_stage_enabled : string := "FALSE";
        enable_ecc : string := "FALSE";
        implement_in_les : string := "OFF";
        indata_aclr_a : string := "UNUSED";
        indata_aclr_b : string := "NONE";
        indata_reg_b : string := "CLOCK1";
        init_file : string := "UNUSED";
        init_file_layout : string := "PORT_A";
        maximum_depth : natural := 0;
        numwords_a : natural := 0;
        numwords_b : natural := 0;
        operation_mode : string := "BIDIR_DUAL_PORT";
        outdata_aclr_a : string := "NONE";
        outdata_aclr_b : string := "NONE";
        outdata_reg_a : string := "UNREGISTERED";
        outdata_reg_b : string := "UNREGISTERED";
        power_up_uninitialized : string := "FALSE";
        ram_block_type : string := "AUTO";
        rdcontrol_aclr_b : string := "NONE";
        rdcontrol_reg_b : string := "CLOCK1";
        read_during_write_mode_mixed_ports : string := "DONT_CARE";
        read_during_write_mode_port_a : string := "NEW_DATA_NO_NBE_READ";
        read_during_write_mode_port_b : string := "NEW_DATA_NO_NBE_READ";
        stratixiv_m144k_allow_dual_clocks : string := "ON";
        width_a : natural;
        width_b : natural := 1;
        width_byteena_a : natural := 1;
        width_byteena_b : natural := 1;
        width_eccstatus : natural := 3;
        widthad_a : natural;
        widthad_b : natural := 1;
        wrcontrol_aclr_a : string := "UNUSED";
        wrcontrol_aclr_b : string := "NONE";
        wrcontrol_wraddress_reg_b : string := "CLOCK1";
        lpm_hint : string := "UNUSED";
        lpm_type : string := "altsyncram"
    );
    port(
        aclr0 : in std_logic := '0';
        aclr1 : in std_logic := '0';
        address_a : in std_logic_vector(widthad_a-1 downto 0);
        address_b : in std_logic_vector(widthad_b-1 downto 0) := (others => '1');
        addressstall_a : in std_logic := '0';
        addressstall_b : in std_logic := '0';
        byteena_a : in std_logic_vector(width_byteena_a-1 downto 0) := (others => '1');
        byteena_b : in std_logic_vector(width_byteena_b-1 downto 0) := (others => '1');
        clock0 : in std_logic := '1';
        clock1 : in std_logic := '1';
        clocken0 : in std_logic := '1';
        clocken1 : in std_logic := '1';
        clocken2 : in std_logic := '1';
        clocken3 : in std_logic := '1';
        data_a : in std_logic_vector(width_a-1 downto 0) := (others => '1');
        data_b : in std_logic_vector(width_b-1 downto 0) := (others => '1');
        eccstatus : out std_logic_vector(width_eccstatus-1 downto 0);
        q_a : out std_logic_vector(width_a-1 downto 0);
        q_b : out std_logic_vector(width_b-1 downto 0);
        rden_a : in std_logic := '1';
        rden_b : in std_logic := '1';
        wren_a : in std_logic := '0';
        wren_b : in std_logic := '0'
    );
end component;
The code generated by the Wizard sets generics that differ from the default plus a few others. In my case:
altsyncram_component : altsyncram
GENERIC MAP (
    address_reg_b => "CLOCK0",
    clock_enable_input_a => "BYPASS",
    clock_enable_input_b => "BYPASS",
    clock_enable_output_a => "BYPASS",
    clock_enable_output_b => "BYPASS",
    indata_reg_b => "CLOCK0",
    intended_device_family => "MAX 10",
    lpm_type => "altsyncram",
    numwords_a => 512,
    numwords_b => 512,
    operation_mode => "BIDIR_DUAL_PORT",
    outdata_aclr_a => "NONE",
    outdata_aclr_b => "NONE",
    outdata_reg_a => "UNREGISTERED",
    outdata_reg_b => "UNREGISTERED",
    power_up_uninitialized => "FALSE",
    read_during_write_mode_mixed_ports => "DONT_CARE",
    read_during_write_mode_port_a => "NEW_DATA_WITH_NBE_READ",
    read_during_write_mode_port_b => "NEW_DATA_WITH_NBE_READ",
    widthad_a => 9,
    widthad_b => 9,
    width_a => 32,
    width_b => 32,
    width_byteena_a => 1,
    width_byteena_b => 1,
    wrcontrol_wraddress_reg_b => "CLOCK0"
)
PORT MAP (
    address_a => std_logic_vector(address_a),
    address_b => std_logic_vector(address_b),
    clock0 => clock,
    data_a => data_a,
    data_b => data_b,
    rden_a => rden_a,
    rden_b => rden_b,
    wren_a => wren_a,
    wren_b => wren_b,
    q_a => sub_wire0,
    q_b => sub_wire1
);
I would find directly instantiating either one in my code (especially multiple times) to be quite ugly compared to:
timeslot_table1 : timeslot_table
generic map (
    ADDR_BITS_A => TS_ADDR_BITS,
    ADDR_BITS_B => TS_ADDR_BITS,
    WIDTH_A => 32,
    WIDTH_B => 32
)
PORT MAP (
    address_a => indexa,
    address_b => addrb,
    clock => CLK,
    data_a => dina,
    data_b => (others => '0'),
    rden_a => RE_a,
    rden_b => RE_b,
    wren_a => WE_a,
    wren_b => '0',
    q_a => douta,
    q_b => doutb
);
It also lets me map ports to my preferred type (e.g. addresses are unsigned instead of std_logic_vector in this case). Your point is well taken, though. To each his own.
Thank you for your suggestions. I have reached a similar conclusion: instantiate altsyncram directly. A bit painful, but at least I don't have to guess what an ever-changing synthesis tool is going to do.
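For reference, a direct instantiation for the RAM in the original question might look roughly like the sketch below. It reuses only generics and ports that appear in the altsyncram component declaration and wizard output quoted earlier in the thread. The entity name `ram_256x32_dp`, the 256-word depth, the "Cyclone V" device string, and dropping the global enable (the caller would gate `we0`/`we1` with it) are all assumptions. It has not been compiled or tested on hardware, and it needs the `altera_mf` library from a Quartus installation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library altera_mf;
use altera_mf.altera_mf_components.all;

-- Hypothetical wrapper around altsyncram: 256 x 32, true dual port,
-- single clock, unregistered outputs.
entity ram_256x32_dp is
  port (
    clk          : in  std_logic;
    we0, we1     : in  std_logic;
    addr0, addr1 : in  std_logic_vector(7 downto 0);
    din0, din1   : in  std_logic_vector(31 downto 0);
    dout0, dout1 : out std_logic_vector(31 downto 0)
  );
end entity ram_256x32_dp;

architecture syn of ram_256x32_dp is
begin
  ram : altsyncram
    generic map (
      intended_device_family             => "Cyclone V",  -- assumed target
      lpm_type                           => "altsyncram",
      operation_mode                     => "BIDIR_DUAL_PORT",
      numwords_a                         => 256,
      numwords_b                         => 256,
      widthad_a                          => 8,
      widthad_b                          => 8,
      width_a                            => 32,
      width_b                            => 32,
      -- Clock port B's input registers from clock0 (single-clock RAM)
      address_reg_b                      => "CLOCK0",
      indata_reg_b                       => "CLOCK0",
      wrcontrol_wraddress_reg_b          => "CLOCK0",
      -- No clock enables in use
      clock_enable_input_a               => "BYPASS",
      clock_enable_input_b               => "BYPASS",
      clock_enable_output_a              => "BYPASS",
      clock_enable_output_b              => "BYPASS",
      outdata_reg_a                      => "UNREGISTERED",
      outdata_reg_b                      => "UNREGISTERED",
      -- Cross-port read-during-write result is not relied upon
      read_during_write_mode_mixed_ports => "DONT_CARE"
    )
    port map (
      clock0    => clk,
      address_a => addr0,
      address_b => addr1,
      data_a    => din0,
      data_b    => din1,
      wren_a    => we0,
      wren_b    => we1,
      q_a       => dout0,
      q_b       => dout1
    );
end architecture syn;
```

The same-port read-during-write generics are left at the component defaults shown above. The RAM user guide linked earlier in the thread is the place to check which values a particular device family supports.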