- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying to build a frame alignment component for an OTN (Optical Transport Network) with Quartus II on a Cyclone II FPGA and my problem is that I cannot reach the desired frequency.
Obviously I have to do some pipelining but I do not know what and where to pipeline. The code follows in VHDL. As you will see the code consists of two processes, a clocked process and a combinatorical one. Any kind of help would be mostly appreciated.
library ieee;use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
entity framer is
generic (DATA_WIDTH :integer := 256);
port (clk : in std_logic;
reset : in std_logic;
input : in std_logic_vector(255 downto 0);
output : out std_logic_vector(255 downto 0));
end framer;
architecture mixed of framer is
type state_type is (Frame_Search, Prove_In_Sync, In_Frame_Sync, Prove_Out_Sync);
signal next_state, current_state : state_type;
--signal cyclecounter : std_logic_vector(8 downto 0);
signal cyclecounter : integer range 510 downto 0;
signal cycle_inc : std_logic;
signal error_inc : std_logic;
--signal errorcount : std_logic_vector(3 downto 0);
signal errorcount : integer range 4 downto 0;
signal w_en : std_logic;
signal FAOOF : std_logic;
signal index, prev_index : natural;
signal input_reg1,input_reg2 : std_logic_vector(DATA_WIDTH-1 downto 0);
signal output_reg : std_logic_vector(DATA_WIDTH-1 downto 0);
signal joint : std_logic_vector(2*DATA_WIDTH-1 downto 0);
signal sync_regB : std_logic_vector(31 downto 0);
signal sync_regS : std_logic_vector(23 downto 0);
constant Fr_find : std_logic_vector(31 downto 0) := X"F6F6_2828";
constant Fr_lose : std_logic_vector(23 downto 0) := X"F62828";
begin
state_update : process (clk,reset)
begin
if reset = '1' then
current_state <= Frame_Search after 1 ns; --default state upon reset
cyclecounter <= 0 after 1 ns;
errorcount <= 0 after 1 ns;
elsif rising_edge(clk) then
current_state <= next_state; --update state
input_reg1 <= input after 1 ns;
input_reg2 <= input_reg1 after 1 ns;
prev_index <= index;
if w_en = '1' then
for i in joint'left-8 downto DATA_WIDTH loop
if i = index then
output_reg <= joint(i + 8 downto i + 8 - DATA_WIDTH + 1) after 1 ns;
end if;
end loop;
end if;
if FAOOF = '0' then
output <= output_reg;
else
output <= (others=>'0');
end if;
if cycle_inc = '1' then
cyclecounter <= cyclecounter + 1 after 1 ns;
else
cyclecounter <= 0 after 1 ns;
end if;
if cyclecounter = 509 then
if error_inc = '1' then
errorcount <= errorcount + 1 after 1 ns;
else
errorcount <= 0 after 1 ns;
end if;
end if;
end if;
end process;
joint <= input_reg2 & input_reg1;
state_machine : process (current_state, joint, output_reg, cycle_inc, error_inc, index, prev_index, cyclecounter, errorcount)
begin
case current_state is
when Frame_Search =>
FAOOF <= '1';
index <= 0;
cycle_inc <= '0';--counter reset
error_inc <= '0';--counter reset
w_en <= '0';
next_state <= current_state;
for j in joint'left-8 downto DATA_WIDTH loop
if joint(j downto j - (Fr_find'length-1)) = Fr_find then
w_en <= '1';
index <= j;
next_state <= Prove_In_Sync;
end if;
end loop;
when Prove_In_Sync =>
w_en <= '1';
FAOOF <= '1';
index <= prev_index;
cycle_inc <= '1';
error_inc <= '0';
next_state <= current_state;
if cyclecounter = 510 then
cycle_inc <= '0';--counter reset
--for k in joint'left-8 downto DATA_WIDTH loop
--if k = index then
--if joint(k downto k - (Fr_find'length-1)) = Fr_find then
if output_reg(output_reg'left - 8 downto output_reg'left - 8 - (Fr_find'length-1)) = Fr_find then
next_state <= In_Frame_Sync;
FAOOF <= '0';
--cycle_inc <= '0';--counter reset
--error_inc <= '0';--counter reset
else
next_state <= Frame_Search;
end if;
--end if;
--end loop;
else
cycle_inc <= '1';
end if;
when In_Frame_Sync =>
w_en <= '1';
FAOOF <= '0';
index <= prev_index;
cycle_inc <= '1';
error_inc <= '0';
next_state <= current_state;
if cyclecounter = 509 then
cycle_inc <= '0';--counter reset
--for l in joint'left-8 downto DATA_WIDTH loop
--if l = index then
--if joint(l - 8 downto (l - 8) - (Fr_lose'length - 1)) = Fr_lose then
if output_reg(output_reg'left - 16 downto output_reg'left - 16 - (Fr_lose'length-1)) = Fr_lose then
next_state <= current_state;
else
error_inc <= '1';
next_state <= Prove_Out_Sync;
end if;
--end if;
--end loop;
else
cycle_inc <= '1';
end if;
when Prove_Out_Sync=>
w_en <= '1';
FAOOF <= '0';
index <= prev_index;
cycle_inc <= '1';
error_inc <= '1';
next_state <= current_state;
if cyclecounter = 509 then
cycle_inc <= '0';--counter reset
--for m in joint'left-8 downto DATA_WIDTH loop
--if m = index then
--if joint(m - 8 downto (m - 8) - (Fr_lose'length - 1)) = Fr_lose then
if output_reg(output_reg'left - 16 downto output_reg'left- 16 - (Fr_lose'length-1)) = Fr_lose then
next_state <= In_Frame_Sync;
--cycle_inc <= '0';--counter reset
error_inc <= '0';--counter reset
else
error_inc <= '1';
if errorcount = 4 then
error_inc <= '0';--counter reset
next_state <= Frame_Search;
else
next_state <= current_state;
end if;
end if;
--end if;
--end loop;
else
cycle_inc <= '1';
end if;
end case;
end process;
end mixed;
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Your second process is heavily combinatorial and is unlikely to pass timing. I personally use one process state machine always. That way I declare one state signal of type say s0,s1,s2,s3,s4, ...etc. and assign next state as follows: if...state <= s0, if... state <= s1 and so on. The change of state is always on the clock edge so are any conditions and assignments. To keep functionality you will need to redesign for latency effect.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
--- Quote Start --- Your second process is heavily combinatorial and is unlikely to pass timing. I personally use one process state machine always. That way I declare one state signal of type say s0,s1,s2,s3,s4, ...etc. and assign next state as follows: if...state <= s0, if... state <= s1 and so on. The change of state is always on the clock edge so are any conditions and assignments. To keep functionality you will need to redesign for latency effect. --- Quote End --- Using Timing Analysis I found that the critical path is inside the clock process where i store the result in the output register.
if w_en = '1' then
for i in joint'left-8 downto DATA_WIDTH loop
if i = index then
output_reg <= joint(i + 8 downto i + 8 - DATA_WIDTH + 1) after 1 ns;
end if;
end loop;
end if;
This design requires to have 2 processes at least to work properly and comply with the ITU-T G.709, G798 OTN specifications.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Does the spec actually say in the specification "You MUST use two processes" or is it just that you cant get it to work with a single process state machine? I would be very surprised if it was the former.
The problem comes, I assume, because of the massive long combinatorial path of index, that forms part of the write-enable input on the output_reg register. index is muxed based on a comparator and then muxed again and anded with w_en. What you havent said is what the target FMax is and what the actual FMax is. You may have to just accept that your design isnt good enough or re-design it with more pipelining.- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
kaz:
--- Quote Start --- Your second process is heavily combinatorial and is unlikely to pass timing. I personally use one process state machine always. That way I declare one state signal of type say s0,s1,s2,s3,s4, ...etc. and assign next state as follows: if...state <= s0, if... state <= s1 and so on. The change of state is always on the clock edge so are any conditions and assignments. To keep functionality you will need to redesign for latency effect. --- Quote End --- tricky: --- Quote Start --- Does the spec actually say in the specification "You MUST use two processes" or is it just that you cant get it to work with a single process state machine? I would be very surprised if it was the former. --- Quote End --- This shows a bias against two-process state-machines, claiming one-process states-machines are better which is without any proof or justification. A two-process state-machine does perform equally as a one-process state-machine. A two-process state machine allows you to generate additional combinatorial outputs, without having to resort to a additional combinatorial process to decode these outputs, or having to "redesign for latency effect". The correct advice is to look at the critical paths in TimeQuest, as hargeo effectively did, and rework the logic.- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The idea of one or more process per se is irrelevant. You may use as many as you want. The issue is avoiding long comb paths.
The timing does not refer to process boundaries but to paths. In your case I realised "joint" is assigned to output_register from input through reg1 and reg2 directly and in this case yes you are right but it could be the long comb paths are contributing to routing difficulty. edit: note that output_register is assigned conditional on wr_en and FA00F which both come from depth of three levels of if statements in the second process. That is causing your violation I believe.- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
By making a few changes like using std_logic_vectors with less bits where possible, prefering ranges that are multiple of 2 but still using 2 processes, I managed to increase the Fmax dramatically.
if w_en = '1' then
for step in 1 downto 0 loop
for i in DATA_WIDTH/2-1 downto 0 loop
if i = prev_index(6 downto 0) then
temp_reg1 <= joint(i + DATA_WIDTH downto i + 1) after 1 ns;
temp_reg2 <= joint(i + DATA_WIDTH*3/2 downto i + DATA_WIDTH/2 + 1) after 1 ns;
end if;
end loop;
end loop;
if prev_index(7) = '0' then
output_reg <= temp_reg1 after 1 ns;
else
output_reg <= temp_reg2 after 1 ns;
end if;
end if;
I hope that this pipelining is appropriate. Could anyone indicate a correction or improvement?

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page