Hi,

I am currently trying to implement a Color Constancy algorithm for image processing. The algorithm requires at least one division operation (NUMER: 40 bits, DENOM: 40 bits). After looking through relevant posts, it seems that the LPM_DIVIDE megafunction suits my requirement. However, I am not sure how to implement the divide function inside my block, or when to expect the correct output (note: I am very new to VHDL and FPGAs). Kindly advise on any corrections required.

With my code, I get these errors using LPM_DIVIDE:

** Error: /home/robocup/Desktop/Old Board Apply Color Constancy/ColorConstancy.vhd(213): Illegal sequential statement.
** Error: /home/robocup/Desktop/Old Board Apply Color Constancy/ColorConstancy.vhd(231): VHDL Compiler exiting

Also, it is very likely that the 40-bit/40-bit result will not be available within a single clock, which is fine as long as I can implement a checker. Any suggestions on this part?

Vincent

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
library lpm;
use lpm.lpm_components.all;

-- This entity is to calculate normaliser at each frame.
entity ColorConstancy is
    generic(
        DATA_WIDTH  : integer := 10;
        COORD_WIDTH : integer := 10;
        RGB_WIDTH   : integer := 40
    );
    port
    (
        -- Clock signal
        CLK   : in std_logic;
        -- Input signals
        R_IN  : in unsigned (DATA_WIDTH-1 downto 0);
        G_IN  : in unsigned (DATA_WIDTH-1 downto 0);
        B_IN  : in unsigned (DATA_WIDTH-1 downto 0);
        DVAL  : in std_logic;
        FVAL  : in std_logic;
        -- Output signals
        R_DYN : out unsigned (RGB_WIDTH-1 downto 0);
        G_DYN : out unsigned (RGB_WIDTH-1 downto 0);
        B_DYN : out unsigned (RGB_WIDTH-1 downto 0);
        NORM  : out unsigned (RGB_WIDTH-1 downto 0)
    );
end entity;

architecture rt1 of ColorConstancy is
    component LPM_DIVIDE
        generic (
            LPM_WIDTHN          : natural; -- MUST be greater than 0
            LPM_WIDTHD          : natural; -- MUST be greater than 0
            LPM_NREPRESENTATION : string := "UNSIGNED";
            LPM_DREPRESENTATION : string := "UNSIGNED";
            LPM_PIPELINE        : natural := 0;
            LPM_TYPE            : string := L_DIVIDE;
            LPM_HINT            : string := "UNUSED"
        );
        port (
            NUMER    : in std_logic_vector(LPM_WIDTHN-1 downto 0);
            DENOM    : in std_logic_vector(LPM_WIDTHD-1 downto 0);
            ACLR     : in std_logic := '0';
            CLOCK    : in std_logic := '0';
            CLKEN    : in std_logic := '1';
            QUOTIENT : out std_logic_vector(LPM_WIDTHN-1 downto 0);
            REMAIN   : out std_logic_vector(LPM_WIDTHD-1 downto 0)
        );
    end component LPM_DIVIDE;

    -- Various states
    type state_t is (
        first_frame,
        wait_for_new_frame,
        wait_end_of_frame,
        calc_norm,
        calc_new_RGB,
        calc_RGB_dyn
    );
    -- Current state
    signal state : state_t;
    -- RGB sums
    signal R_sum, G_sum, B_sum : unsigned (RGB_WIDTH-1 downto 0);
    signal R_sum_temp, G_sum_temp, B_sum_temp : unsigned (RGB_WIDTH-1 downto 0);
    -- RGB normaliser
    signal normaliser : unsigned (RGB_WIDTH-1 downto 0);
    -- Colour Constancy variables
    signal R_thresh : unsigned (RGB_WIDTH-1 downto 0);
    signal G_thresh : unsigned (RGB_WIDTH-1 downto 0);
    signal B_thresh : unsigned (RGB_WIDTH-1 downto 0);
    -- LPM DIVIDER
    signal R_dyn_thresh : std_logic_vector (RGB_WIDTH-1 downto 0);
begin
    process (CLK)
    BEGIN
        if (rising_edge(CLK)) then
            if (DVAL = '1' and FVAL = '1') then
                -- Next pixel, same frame
                R_sum <= ( R_sum + resize(R_IN, RGB_width) );
                G_sum <= ( G_sum + resize(G_IN, RGB_width) );
                B_sum <= ( B_sum + resize(B_IN, RGB_width) );
            end if;
            case state is
                when first_frame =>
                    if (FVAL = '1') then
                        state <= wait_for_new_frame;
                    end if;
                    -- Initialise
                    R_sum <= to_unsigned(0, RGB_width);
                    G_sum <= to_unsigned(0, RGB_width);
                    B_sum <= to_unsigned(0, RGB_width);
                    R_sum_temp <= to_unsigned(0, RGB_width);
                    G_sum_temp <= to_unsigned(0, RGB_width);
                    B_sum_temp <= to_unsigned(0, RGB_width);
                    normaliser <= to_unsigned(1, RGB_width);
                    -- Orange Thresholding
                    R_thresh <= to_unsigned(230, RGB_width);
                    G_thresh <= to_unsigned(128, RGB_width);
                    B_thresh <= to_unsigned(23, RGB_width);
                -- Wait for new frame
                when wait_for_new_frame =>
                    if (FVAL = '1') then
                        state <= wait_end_of_frame;
                    end if;
                -- Wait until end of frame
                when wait_end_of_frame =>
                    if (FVAL = '0') then
                        -- End of frame, start normaliser calculation
                        state <= calc_norm;
                        -- Save the current RGB sum
                        R_sum_temp <= R_sum;
                        G_sum_temp <= G_sum;
                        B_sum_temp <= B_sum;
                        -- Reset RGB sum counter for new frame
                        R_sum <= to_unsigned(0, RGB_width);
                        G_sum <= to_unsigned(0, RGB_width);
                        B_sum <= to_unsigned(0, RGB_width);
                    end if;
                -- Calculate Normaliser
                when calc_norm =>
                    state <= calc_new_RGB;
                    -- TO BE OPTIMISED
                    -- Approximate normaliser = biggest element + medium element / 2
                    if (R_sum_temp > G_sum_temp and R_sum_temp > B_sum_temp) then
                        if (G_sum_temp > B_sum_temp) then
                            normaliser <= R_sum_temp + (G_sum_temp srl 1);
                        else
                            normaliser <= R_sum_temp + (B_sum_temp srl 1);
                        end if;
                    elsif (G_sum_temp > R_sum_temp and G_sum_temp > B_sum_temp) then
                        if (R_sum_temp > B_sum_temp) then
                            normaliser <= G_sum_temp + (R_sum_temp srl 1);
                        else
                            normaliser <= G_sum_temp + (B_sum_temp srl 1);
                        end if;
                    else
                        if(R_sum_temp > G_sum_temp) then
                            normaliser <= B_sum_temp + (R_sum_temp srl 1);
                        else
                            normaliser <= B_sum_temp + (G_sum_temp srl 1);
                        end if;
                    end if;
                    -- Adjust RGB sums to grey world assumption
                    -- 1) Approximate sqrt(3), by multiplying with 55 (0011 0111).
                    -- 2) To be divided by 32 at next state.
                    R_sum_temp <= ((R_sum_temp sll 5) + (R_sum_temp sll 4) + (R_sum_temp sll 2) + (R_sum_temp sll 1) + R_sum_temp);
                    G_sum_temp <= ((G_sum_temp sll 5) + (G_sum_temp sll 4) + (G_sum_temp sll 2) + (G_sum_temp sll 1) + G_sum_temp);
                    B_sum_temp <= ((B_sum_temp sll 5) + (B_sum_temp sll 4) + (B_sum_temp sll 2) + (B_sum_temp sll 1) + B_sum_temp);
                when calc_new_RGB =>
                    state <= calc_RGB_dyn;
                    -- Calculate Dynamic RGB and divide by 32
                    -- R_start = 230 (1110 0110).
                    R_thresh <= (((R_sum_temp sll 7) + (R_sum_temp sll 6) + (R_sum_temp sll 5) + (R_sum_temp sll 2) + (R_sum_temp sll 1)) srl 5);
                    -- G_start = 128 (1000 0000).
                    G_thresh <= ((G_sum_temp sll 7) srl 5);
                    -- B_start = 23 (0001 0111).
                    B_thresh <= (((B_sum_temp sll 4) + (B_sum_temp sll 2) + (B_sum_temp sll 1) + B_sum_temp) srl 5);
                when calc_RGB_dyn =>
                    state <= wait_for_new_frame;
                    ------ DIVISION REQUIRED HERE FOR R_thresh, G_thresh and B_thresh -----
                    div_component: LPM_DIVIDE
                        generic map(
                            LPM_WIDTHN => 40,
                            LPM_WIDTHD => 40,
                            LPM_NREPRESENTATION => "UNSIGNED",
                            LPM_DREPRESENTATION => "UNSIGNED",
                            LPM_PIPELINE => 0,
                            LPM_TYPE => L_DIVIDE,
                            LPM_HINT => "UNUSED"
                        )
                        port map (
                            NUMER => std_logic_vector(R_thresh),
                            DENOM => std_logic_vector(normaliser),
                            ACLR => '0',
                            CLOCK => CLK,
                            CLKEN => '1',
                            QUOTIENT => R_dyn_thresh,
                            REMAIN => open
                        );
            end case;
        end if;
    end process;

    R_DYN <= R_thresh;
    G_DYN <= G_thresh;
    B_DYN <= B_thresh;
    NORM <= normaliser;
end rt1;
I see two aspects to your question:
- the VHDL syntax part
- speed (the real problem)

The first point is rather trivial. Component instantiations are only allowed in concurrent code, not in sequential blocks. I won't retell the basic VHDL concepts in this post; you should review the topic in your favourite VHDL textbook. But it's more of a formal point, because you can place the divider outside the block and "connect" it through signals. Synchronizing a pipelined divider needs to be considered as an additional problem, but is basically possible.

For signed and unsigned types, inference of hardware dividers from a "/" division operator is also supported by the compiler. But you have only limited options to control pipelined operation, so it may be better to use explicit MegaFunction instantiation. Whether pipelined operation is necessary is mainly a matter of your clock speed. Timing analysis will answer the question.

A more general question is whether you actually need a divider for your design.

Hi FvM,
Thanks for the reply. Originally I planned to separate the division operation by feeding the block with numerator and denominator signals. However, I would then not be able to simulate it in ModelSim (at least) and check the logic. Therefore, I attempted to include the LPM divide function inside the block.

With our current setup, division is necessary. It is required because the results (RGB) have to be passed to another block, RGBtoHSV. Luckily, this operation needs to be done only once per frame (for RGB), which allows some clock cycles to be spent.

For now I am testing with a basic "/" division operator, which, as you pointed out, is supported by the compiler (and replaced with the LPM divider). From the compiler messages, it seems that the numerator and denominator are assigned 10 and 8 bits respectively. I'll also try your suggestion to tackle the syntax and define the LPM divider parameters as necessary.

Vincent.

Hi all,
A quick update: I am now able to implement my algorithm and it is working fine. However, resource usage increased by 25% due to the (inferred) division megafunction, and compilation time went from 2.5 minutes to 8 minutes. Is this expected? Can we optimise this?

By default the compiler sets up the LPM_DIVIDE parameters (as expected) as:

Info (12134): Parameter "LPM_WIDTHN" = "40"
Info (12134): Parameter "LPM_WIDTHD" = "40"
Info (12134): Parameter "LPM_NREPRESENTATION" = "UNSIGNED"
Info (12134): Parameter "LPM_DREPRESENTATION" = "UNSIGNED"

Below is the code snippet for the division, where all signals are 40 bits.
    when calc_RGB_dyn =>
        state <= wait_for_new_frame;
        -- Calculate dynamic RGB = normalise new RGB
        R_dyn_thresh <= ((R_thresh) / (normaliser));
        G_dyn_thresh <= ((G_thresh) / (normaliser));
        B_dyn_thresh <= ((B_thresh) / (normaliser));
end case;
Vincent
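
One option that follows from FvM's remark about placing the divider outside the block: the three "/" operators above execute in the same clock, so they most likely infer three separate 40-bit dividers. Since the division only happens once per frame, a single LPM_DIVIDE instantiated concurrently could be shared by the three channels in turn. The following is only a sketch under assumptions: the entity name SharedDivider, the START/DONE handshake, and the PIPE_STAGES value are hypothetical and not part of the original design, and the numerators are assumed to stay stable until DONE.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
library lpm;
use lpm.lpm_components.all;

-- Hypothetical wrapper: one LPM_DIVIDE time-shared by the R, G and B channels.
entity SharedDivider is
    generic (
        WIDTH       : natural := 40;
        PIPE_STAGES : natural := 8   -- assumed value; tune against timing analysis
    );
    port (
        CLK                    : in  std_logic;
        START                  : in  std_logic;  -- pulse high for one clock
        R_NUM, G_NUM, B_NUM    : in  unsigned(WIDTH-1 downto 0);  -- hold stable until DONE
        DENOM                  : in  unsigned(WIDTH-1 downto 0);  -- must be non-zero
        R_QUOT, G_QUOT, B_QUOT : out unsigned(WIDTH-1 downto 0);
        DONE                   : out std_logic   -- high for one clock when all three are valid
    );
end entity;

architecture rtl of SharedDivider is
    signal numer_slv, denom_slv, quot_slv : std_logic_vector(WIDTH-1 downto 0);
    signal channel  : natural range 0 to 3 := 3;            -- 3 means idle
    signal wait_cnt : natural range 0 to PIPE_STAGES := 0;
begin
    -- Concurrent instantiation: legal here, unlike inside a process.
    div_inst : LPM_DIVIDE
        generic map (
            LPM_WIDTHN          => WIDTH,
            LPM_WIDTHD          => WIDTH,
            LPM_NREPRESENTATION => "UNSIGNED",
            LPM_DREPRESENTATION => "UNSIGNED",
            LPM_PIPELINE        => PIPE_STAGES
        )
        port map (
            NUMER    => numer_slv,
            DENOM    => denom_slv,
            CLOCK    => CLK,
            QUOTIENT => quot_slv,
            REMAIN   => open
        );

    process (CLK)
    begin
        if rising_edge(CLK) then
            DONE <= '0';
            if START = '1' then
                -- Present the first numerator and start the latency countdown.
                denom_slv <= std_logic_vector(DENOM);
                numer_slv <= std_logic_vector(R_NUM);
                channel   <= 0;
                wait_cnt  <= PIPE_STAGES;
            elsif channel /= 3 then
                if wait_cnt /= 0 then
                    wait_cnt <= wait_cnt - 1;
                else
                    -- PIPE_STAGES clocks have passed since the inputs changed:
                    -- capture the quotient and move on to the next channel.
                    wait_cnt <= PIPE_STAGES;
                    case channel is
                        when 0 =>
                            R_QUOT    <= unsigned(quot_slv);
                            numer_slv <= std_logic_vector(G_NUM);
                            channel   <= 1;
                        when 1 =>
                            G_QUOT    <= unsigned(quot_slv);
                            numer_slv <= std_logic_vector(B_NUM);
                            channel   <= 2;
                        when others =>
                            B_QUOT  <= unsigned(quot_slv);
                            channel <= 3;
                            DONE    <= '1';
                    end case;
                end if;
            end if;
        end if;
    end process;
end architecture;
```

In the original state machine, calc_RGB_dyn would pulse START and an extra state would wait for DONE before returning to wait_for_new_frame. DONE then plays the role of the "checker" mentioned in the first post.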
Hi Tricky,
Thanks for the reply. As speed is not much of an issue, I implemented a divider using subtractions. Something like this:
if ( numerator >= denominator) then
    numerator <= numerator - denominator;
    counter <= counter + 1;
else
    output <= counter;
end if;
This brought resource usage down from 25% to 5%. Perhaps this is not the best solution; I am open to better ideas.

Vincent
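
One commonly used alternative to the repeated-subtraction loop above is a shift-and-subtract (restoring) divider. The subtraction loop needs as many clocks as the value of the quotient, which can be very large for 40-bit operands, whereas the shift-and-subtract form always finishes in WIDTH clocks and still uses only one subtractor. This is a minimal sketch, not the code from the thread: the entity name SeqDivider and the START/DONE ports are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sequential divider: one quotient bit per clock.
entity SeqDivider is
    generic (WIDTH : positive := 40);
    port (
        CLK      : in  std_logic;
        START    : in  std_logic;                   -- pulse high for one clock
        NUMER    : in  unsigned(WIDTH-1 downto 0);
        DENOM    : in  unsigned(WIDTH-1 downto 0);  -- must be non-zero
        QUOTIENT : out unsigned(WIDTH-1 downto 0);
        DONE     : out std_logic                    -- high for one clock when QUOTIENT is valid
    );
end entity;

architecture rtl of SeqDivider is
    signal quo   : unsigned(WIDTH-1 downto 0);  -- numerator shifts out, quotient shifts in
    signal remd  : unsigned(WIDTH-1 downto 0);  -- running remainder ("rem" is a reserved word)
    signal den   : unsigned(WIDTH-1 downto 0);
    signal steps : natural range 0 to WIDTH := 0;
begin
    process (CLK)
        variable shifted : unsigned(WIDTH downto 0);
    begin
        if rising_edge(CLK) then
            DONE <= '0';
            if START = '1' then
                quo   <= NUMER;
                remd  <= (others => '0');
                den   <= DENOM;
                steps <= WIDTH;
            elsif steps /= 0 then
                -- Bring the next numerator bit (MSB first) into the remainder.
                shifted := remd & quo(WIDTH-1);
                if shifted >= den then
                    -- Denominator fits: subtract it and record a 1 quotient bit.
                    remd <= resize(shifted - den, WIDTH);
                    quo  <= quo(WIDTH-2 downto 0) & '1';
                else
                    -- Does not fit: keep the remainder and record a 0 quotient bit.
                    remd <= shifted(WIDTH-1 downto 0);
                    quo  <= quo(WIDTH-2 downto 0) & '0';
                end if;
                steps <= steps - 1;
                if steps = 1 then
                    DONE <= '1';  -- last bit is being written on this edge
                end if;
            end if;
        end if;
    end process;

    QUOTIENT <= quo;
end architecture;
```

With WIDTH = 40 this takes 40 clocks per division, so dividing R, G and B one after another costs about 120 clocks per frame, which should fit comfortably in the gap between frames.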
