Hi. I'm trying to write a simple Avalon-MM master component that writes an image to SDRAM. I use this component in a Qsys system with a Nios II/e processor, an SDRAM controller and University Program (UP) video cores. I'm trying this on a DE2-115 board (Cyclone IV EP4CE115F29C7, 50 MHz clock) connected to a monitor via VGA. The SDRAM, its controller, and my component are driven by a 167 MHz clock generated by a PLL. The display part consists of UP video cores: a DMA that reads the 800x600 8-bit grayscale image, and a VGA controller.
My component source is here: https://gist.github.com/woky/a9a02ac03e5ccd23b821262d0c607255 (it's also below, but the gist has line numbers). The component is either waiting for an address to arrive on the ctl interface or writing an image to the address it received there. The image is just a black top half and a white bottom half. In main() in my Nios program I just allocate memory via malloc() and write its address into the UP video DMA and into my component. Please ignore the debug_* signals; they're only for debugging (displaying state on the 7-segment displays and LEDs).
I originally used the mod operation on pixel_counter (commented out in the code), but the results were varying and wrong. Sometimes it looked as if the image wasn't written at all, even though the writing branch was entered (LED on debug_out(1)). Sometimes main() froze on something. Sometimes it wrote just 256 or 512 or 4096 pixels (observed only via pixel_counter on the 7-segment displays, not on the screen). It's enough to uncomment line 66 and comment out line 67 to unleash the madness. What could be the reason for this strange and unpredictable behaviour? Thank you.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity frame_writer is
    port (
        clk            : in  std_logic := '0';                                   -- clk.clk
        reset          : in  std_logic := '0';                                   -- reset.reset
        ctl_write      : in  std_logic := '0';                                   -- ctl.write
        ctl_writedata  : in  std_logic_vector(31 downto 0) := (others => '0');   -- .writedata
        wr_address     : out std_logic_vector(31 downto 0);                      -- wr.address
        wr_burstcount  : out std_logic_vector(10 downto 0);                      -- .burstcount
        wr_waitrequest : in  std_logic := '0';                                   -- .waitrequest
        wr_writedata   : out std_logic_vector(31 downto 0);                      -- .writedata
        wr_write       : out std_logic;                                          -- .write
        debug_out      : out std_logic_vector(127 downto 0);                     -- debug.debug_out
        debug_in       : in  std_logic_vector(127 downto 0) := (others => '0')   -- .debug_in
    );
end entity frame_writer;

architecture rtl of frame_writer is
    constant FRAME_SIZE: natural := 800 * 600;
    signal pixel_counter: natural;
    signal start_write: std_logic;
    signal writeaddr: std_logic_vector(31 downto 0);
begin
    wr_burstcount <= "00000000001";
    debug_out(38 downto 20) <= std_logic_vector(to_unsigned(pixel_counter, 19));

    process (clk, reset)
    begin
        if reset = '1' then
            start_write <= '0';
            pixel_counter <= 0;
            debug_out(1 downto 0) <= (others => '0');
        elsif rising_edge(clk) then
            --if start_write = '0' and pixel_counter = 0 then
            if start_write = '0' and (pixel_counter = 0 or pixel_counter >= FRAME_SIZE) then
                wr_write <= '0';
                pixel_counter <= 0;
                wr_address <= (others => '0');
                wr_writedata <= (others => '0');
                if ctl_write = '1' then
                    start_write <= '1';
                    writeaddr <= ctl_writedata;
                end if;
                debug_out(0) <= '0';
            else
                wr_write <= '1';
                wr_address <= std_logic_vector(unsigned(writeaddr) +
                              to_unsigned(pixel_counter, wr_address'length));
                if pixel_counter < FRAME_SIZE/2 then
                    wr_writedata <= x"00000000";
                else
                    wr_writedata <= x"ffffffff";
                end if;
                if wr_waitrequest = '0' then
                    start_write <= '0';
                    --pixel_counter <= (pixel_counter + 4) mod FRAME_SIZE;
                    pixel_counter <= pixel_counter + 4;
                end if;
                debug_out(0) <= '1';
                debug_out(1) <= '1';
            end if;
        end if;
    end process;
end architecture rtl;
The problem with the mod and rem operators, when the divisor is not a power of two (2**N), is that they synthesize a divider. Dividers have terrible timing performance in a single clock cycle (about 20 MHz if you're lucky). So with a 167 MHz clock, the result probably had not settled by the clock edge, and the register was basically capturing random values. Do you have timing constraints for the design? Did you look at the timing report and see the failures?
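To illustrate the power-of-two distinction made above, here is a minimal sketch. The entity mod_pow2_demo and its port names are hypothetical and not part of the poster's design; the 19-bit width and the divisor 512 are only chosen to resemble the pixel counter.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical illustration entity; not part of the original design.
entity mod_pow2_demo is
    port (
        value     : in  unsigned(18 downto 0);
        remainder : out unsigned(8 downto 0)
    );
end entity mod_pow2_demo;

architecture rtl of mod_pow2_demo is
begin
    -- value mod 512 (512 = 2**9) is just the low 9 bits of value:
    -- this costs no logic at all, only wiring.
    remainder <= value(8 downto 0);

    -- By contrast, an expression such as
    --   to_integer(value) mod 480000
    -- has no such shortcut, so synthesis has to build a combinational
    -- divider whose result must settle within one clock period.
end architecture rtl;
```

A synthesis tool would normally perform this bit-slice simplification on its own when the divisor is a constant power of two, which is why only the other case hurts timing.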
--- Quote Start ---
The problem with the mod and rem operators, when the divisor is not a power of two (2**N), is that they synthesize a divider. Dividers have terrible timing performance in a single clock cycle (about 20 MHz if you're lucky). So with a 167 MHz clock, the result probably had not settled by the clock edge, and the register was basically capturing random values. Do you have timing constraints for the design? Did you look at the timing report and see the failures?
--- Quote End ---
Tricky, thank you. I guess you're right. I haven't learned the timing analysis part of the design flow yet. I added the following *.sdc file to my project:
create_clock -name clock_50 -period 20
derive_pll_clocks
derive_clock_uncertainty
And here's the "red" TimeQuest report I get with the mod operation: https://docs.google.com/spreadsheets/d/1pumvhheg8nyqjznpagodadgxgcndvcolb7laxp3bmym/pubhtml I don't know how to interpret this yet, but I guess that's what you're talking about. Would you please briefly explain what this report says?
You asked for a 20 ns clock period, but a worst-case slack of -24.5 ns means the data can arrive 24.5 ns late (i.e. more than an entire clock period). This analysis is the worst case (the actual delay is affected by temperature), but basically, it's very bad. It means that the maximum frequency (Fmax) at which data is guaranteed to arrive before the clock edge corresponds to a period of 20 + 24.5 ns = 44.5 ns (about 22 MHz).
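The arithmetic behind that estimate can be written out as follows, taking the figures quoted in this reply at face value (a 20 ns requested period and a worst-case setup slack of -24.5 ns). The symbol names are only illustrative. If the failing path were actually constrained by a different clock period, the same formula would apply with that period substituted.

```latex
T_{\min} = T_{\text{req}} - \text{slack}
         = 20\,\text{ns} - (-24.5\,\text{ns})
         = 44.5\,\text{ns},
\qquad
F_{\max} = \frac{1}{T_{\min}}
         = \frac{1}{44.5\,\text{ns}}
         \approx 22.5\,\text{MHz}
```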
Don't use a mod operation for non-2**N values.
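One common replacement for a wrapping counter is a compare-and-reset, which needs only a comparator and an adder instead of a divider. The following is a minimal sketch, not the poster's component: the entity wrap_counter, the advance input, and the STEP generic are hypothetical names introduced for illustration, and the 19-bit output width assumes the default FRAME_SIZE of 800 * 600.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone counter; all names here are illustrative only.
entity wrap_counter is
    generic (
        FRAME_SIZE : natural := 800 * 600;
        STEP       : natural := 4
    );
    port (
        clk     : in  std_logic;
        reset   : in  std_logic;
        advance : in  std_logic;  -- e.g. high when a write was accepted
        count   : out std_logic_vector(18 downto 0)  -- 19 bits hold 0 .. 479999
    );
end entity wrap_counter;

architecture rtl of wrap_counter is
    -- Constraining the range lets synthesis size the register
    -- instead of defaulting to a full-width integer.
    signal counter : natural range 0 to FRAME_SIZE - 1;
begin
    count <= std_logic_vector(to_unsigned(counter, count'length));

    process (clk, reset)
    begin
        if reset = '1' then
            counter <= 0;
        elsif rising_edge(clk) then
            if advance = '1' then
                -- Matches (counter + STEP) mod FRAME_SIZE only under the
                -- assumption that FRAME_SIZE is a multiple of STEP
                -- (480000 and 4 here); otherwise it simply restarts at 0.
                if counter >= FRAME_SIZE - STEP then
                    counter <= 0;
                else
                    counter <= counter + STEP;
                end if;
            end if;
        end if;
    end process;
end architecture rtl;
```

In the original process, the same comparison could in principle stand in for the commented-out mod line, with the counter returning to 0 after the last word of the frame.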