Intel® Quartus® Prime Software

ModelSim simulation result differs from the result on the real FPGA

Altera_Forum
Honored Contributor II
3,225 Views

We use an EP1C6T144 chip as an SPI master to receive data from ADCs. The FPGA state machine receives data correctly in ModelSim simulation, but the result is always one bit shift short when it runs on the real FPGA chip. The simulated waveforms and timing exactly match the ADC specification and also match the real signals we observe on a scope. We have been able to narrow the problem down to this fragment of VHDL code:

 

----
when st_spi_read_bit =>

    for i in 0 to miso'length - 1 loop  -- miso'length is the number of SPI channels
        memory(i) <= memory(i)(SPI_BITS - 2 downto 0) & miso(i);
    end loop;

    -- We used another counter here to record how many times shifted --
    if (TO_INTEGER(UNSIGNED(cnt_spibits)) = SPI_BITS - 1) then
        -- clear and go to next state
    else
        cnt_spibits <= cnt_spibits + 1;
    end if;
----

 

In state st_spi_read_bit, each miso bit is shifted into memory. SPI_BITS is 32, so the state reads and shifts 32 times. We used another counter to record how many times the data is shifted, and confirmed that it shifts 32 times in the real FPGA chip. Even with 32 shifts, the result is still one bit shift short and the data comes out at 1/2 of its value. The problem seems to narrow down to this line, which has to run 32 times in ModelSim but 33 times in the real FPGA to give correct data.

 

----
memory(i) <= memory(i)(SPI_BITS - 2 downto 0) & miso(i);
----

 

Does ModelSim interpret the above line differently from Quartus?

 

Thanks for all opinions!
8 Replies
Altera_Forum
Honored Contributor II
1,641 Views

The problem will be with your testbench or your VHDL code. You didn't post the full code, so I cannot really see it in context. Poor VHDL can result in a simulation/synthesis mismatch. But assuming good VHDL, the problem will be with the testbench setup.

Altera_Forum
Honored Contributor II
1,641 Views

Are you using multiple clocks? Eg., is your shift-register shifting based on the SPI clock or an FPGA clock? 

 

If the FPGA is acting as an SPI slave, and the SPI clock is much slower than the FPGA clock (or an FPGA clock generated by a PLL), then you would typically route the SPI clock through a synchronizer (dual DFF), and then through edge-detection logic, for SPI rising-edges and SPI falling-edges. Depending on the device you are communicating with, you would typically shift in on the rising-edge, and out on the falling-edge. This does not require two shift-registers, one will do; shift in on the rising-edge pulse, and update an output register on the falling-edge. 

 

If you have not implemented synchronization correctly, then you will see a simulation vs hardware mismatch. 
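
A minimal sketch of the synchronizer and edge-detection scheme described above, assuming the FPGA clock is much faster than the SPI clock. The entity and signal names (spi_sclk_sync, sclk_in, sclk_rise, sclk_fall) are made up for illustration and are not taken from any posted design:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: synchronizes an asynchronous SPI clock into the
-- FPGA clock domain and produces one-cycle edge pulses.
entity spi_sclk_sync is
    port (
        clk       : in  std_logic;  -- fast FPGA clock
        sclk_in   : in  std_logic;  -- asynchronous SPI clock input
        sclk_rise : out std_logic;  -- one clk-cycle pulse on an SPI rising edge
        sclk_fall : out std_logic   -- one clk-cycle pulse on an SPI falling edge
    );
end entity spi_sclk_sync;

architecture rtl of spi_sclk_sync is
    -- sync(0), sync(1): dual-DFF synchronizer
    -- sync(2): previous synchronized value, used for edge detection
    signal sync : std_logic_vector(2 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            sync <= sync(1 downto 0) & sclk_in;
        end if;
    end process;

    -- sync(2) is the older sample, sync(1) the newer one
    sclk_rise <= '1' when sync(2 downto 1) = "01" else '0';
    sclk_fall <= '1' when sync(2 downto 1) = "10" else '0';
end architecture rtl;
```

The shift register would then be clocked by clk and enabled by sclk_rise, with the output register updated when sclk_fall is high, so that only one clock domain exists inside the FPGA.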

 

You will also see a mismatch if you have poor signals on your hardware, eg. an SPI clock routed to multiple devices can have reflections which cause the clock-edge to be non-monotonic. If your FPGA was acting as an SPI slave, with the SPI clock as an input, a small glitch on the clock can cause the FPGA (which is fast) to think it is seeing two clock edges.  

 

Note that you can use SignalTap II to look at the input signals inside the FPGA (though your device may be too old to support SignalTap II ....) 

 

Cheers, 

Dave
Altera_Forum
Honored Contributor II
1,641 Views

Thank you for the help! 

The VHDL code of the FPGA state machine and the test bench follows. Sorry if it looks like a mess. This is our first attempt at using an FPGA, and we do not have much experience with VHDL, Quartus, or ModelSim. SignalTap II sounds useful; we will learn it. 

 

The state machine : adc01.vhd 

---- 

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity adc01 is
    port (
        addr   : in    std_logic_vector (6 downto 0);   -- address bus, 7-bit
        data   : inout std_logic_vector (12 downto 0) := (others => 'Z');  -- data bus, 13-bit, init to 'Z' or fail to feed data in modelsim
        rd     : in    std_logic;           -- read me, active low
        wr     : in    std_logic;           -- write me, active low
        bs     : in    std_logic;           -- board select, active low
        --
        hwst   : in    std_logic;           -- hardware start command
        --
        convst : out   std_logic := '0';    -- conversion start
        -- stby : out std_logic := '1';     -- standby mode, active low
        reset  : out   std_logic := '0';    -- reset ADCs, active high,pulse width > 50 ns
        sclk   : out   std_logic := '1';    -- SPI SCLK
        fs     : out   std_logic := '1';    -- SPI SS, active low, ADCs' frame synchronization
        miso   : in    std_logic_vector (63 downto 0);  -- SPI MISO
        busy   : in    std_logic;           -- DAC busy, active high
        --
        clk    : in    std_logic            -- system clock
    );
end adc01;

architecture behavior of adc01 is

    -- Timing
    -- Symbols refer to ADS85x8 datasheet (oct 2011) page 9.
    -- XXX - The timing values are set according system clock period 40 ns (25 MHz)
    constant T_RESET : integer := 2;        -- reset pulse width > 50 ns
    -- constant T_TDCVB : integer := 15;    -- CONVST_x high to BUSY delay, 25 ns max
    -- XXX - Caution : The tDCVB will be 400+ ns if the ASLEEP pin is pulled high!
    -- XXX - If the period of system clock (clk) is short than TDMSB (12 ns) or TPPDO (17 ns),
    -- XXX - or the 1/2 SCLK period 23/2 ns, then T_TPPDO > 0.
    -- XXX - The serial clock period must > 22 ns, and is obtained from 2 times of system clock.
    -- XXX - TDMSB : FS low to MSB valid delay, 12 ns max
    -- XXX - TPDDO : SCLK falling edge to new data valid propagation delay, 17 ns max
    -- constant T_TPPDO : integer := 2;

    -- Data bits
    --
    constant SPI_BITS : integer := 32;      -- serial data bits

    -- Timing counters
    --
    signal cnt_reset : std_logic_vector (1 downto 0) := (others => '0');    -- Reset pulse width
    -- signal cnt_tdcvb : std_logic_vector (3 downto 0) := (others => '0'); -- TDCVB timing
    -- signal cnt_tppdo : std_logic_vector (1 downto 0) := (others => '0'); -- TPPDO timing

    -- Data bits counters
    --
    signal cnt_spibits : std_logic_vector (5 downto 0) := (others => '0');  -- SPI data 32 bits

    -- This bit indicates the busy status
    -- 0 : busy, is converting signals and collecting data
    -- 1 : data are available for read
    signal data_readability : std_logic := '1';
    constant STATUSBIT_DRABLTY : integer := 12;     -- bit 12, data readability
    -- constant ADDR_CMD : integer := 0;            -- The command register

    -- Command
    --
    constant CMD_CNVST : integer := 1;      -- Start conversion
    constant CMD_RESET : integer := 8191;   -- Reset ADCs, for 13-bit data bus

    -- State machine
    --
    type state_type is (
        st_idle,
        st_tdcvb,
        st_busy,
        st_spi_read_bit,
        st_spi_clear,
        st_reset
    );
    -- XXX - If the state machine is not stable for one-hot mode,try enum it
    -- attribute ENUM_ENCODING : string;
    -- attribute ENUM_ENCODING of state_type : type is "000 001 010 011 100 101";

    signal state     : state_type := st_idle;
    signal wr_last   : std_logic := '1';    -- value of the wr signal in the last clock state
    signal hwst_last : std_logic := '1';    -- value of the hwst signal in the last clock state

    -- Internal buffer
    --
    type memory_type is array (0 to 63) of std_logic_vector (SPI_BITS - 1 downto 0);
    signal memory : memory_type;

    signal clk_reg     : std_logic := '1';  -- Regulated system clock
    signal sclk_enable : std_logic := '0';
    signal addr_reg    : std_logic_vector (6 downto 0);
    signal bs_reg      : std_logic := '1';
    signal rd_reg      : std_logic := '1';
    signal wr_reg      : std_logic := '1';
    signal hwst_reg    : std_logic := '1';

begin

    -- System Clock Regulating
    system_clock : process (clk)
    begin
        clk_reg <= not clk;
    end process system_clock;

    -- SPI Clock
    spi_clock: process (clk_reg, sclk_enable)
    begin
        if (sclk_enable = '0') then
            sclk <= '1';
        else
            sclk <= clk_reg;
        end if;
    end process spi_clock;

    -- Synchronize signals
    synch_signals : process (clk_reg)
    begin
        if (clk_reg'event and clk_reg = '1') then
            addr_reg <= addr;
            bs_reg   <= bs;
            rd_reg   <= rd;
            wr_reg   <= wr;
            hwst_reg <= hwst;
        end if;
    end process synch_signals;

    -- Read
    state_read: process (bs_reg, rd_reg)
        variable var_index : integer;
        variable var_data  : std_logic_vector (12 downto 0);
    begin
        if (bs_reg = '0' and state = st_idle) then
            if (wr_reg = '1') then          -- no writing
                if (rd_reg = '0') then      -- start of read cycle
                    var_index := TO_INTEGER(UNSIGNED(addr_reg) srl 1);
                    if addr_reg(0) = '0' then
                        var_data := memory(var_index)(28 downto 16);
                    else
                        var_data := memory(var_index)(12 downto 0);
                    end if;
                    --if TO_INTEGER(UNSIGNED(addr_reg)) = ADDR_STATUS then
                    --    data <= (STATUSBIT_DRABLTY => data_readability, others => '0');
                    var_data (STATUSBIT_DRABLTY) := data_readability;
                    --end if;
                    data <= var_data;
                elsif (rd_reg = '1') then   -- end of read cycle
                    data <= (others => 'Z');
                end if;
            end if;
        end if;
    end process state_read;

    -- Clocked state machine
    state_clocked: process (clk_reg)
        variable command : integer;
    begin
        if (clk_reg'event and clk_reg = '1') then
            wr_last   <= wr_reg;    -- this will happen in next cycle
            hwst_last <= hwst_reg;
            if (bs_reg = '0' and wr_reg = '0' and wr_last = '1') then
                command := TO_INTEGER(UNSIGNED(data(12 downto 0)));
                case command is
                    when CMD_CNVST =>
                        if (state = st_idle) then
                            convst <= '1';
                            data_readability <= '0';
                            -- cnt_tdcvb <= cnt_tdcvb + 1;
                            state <= st_tdcvb;
                        end if;
                    when CMD_RESET =>
                        -- data <= (others => 'Z');
                        convst <= '0';
                        -- stby <= '1';
                        fs <= '1';
                        sclk_enable <= '0';
                        cnt_spibits <= (others => '0');
                        -- cnt_tdcvb <= (others => '0');
                        cnt_reset <= (others => '0');
                        reset <= '1';
                        cnt_reset <= cnt_reset + 1;
                        state <= st_reset;
                    when others =>
                        -- do nothing
                        --
                end case;
            else
                case state is
                    when st_idle =>
                        -- do nothing but hardware start
                        --
                        if (hwst_reg = '0' and hwst_last = '1') then
                            convst <= '1';
                            data_readability <= '0';
                            state <= st_tdcvb;
                        end if;
                    when st_tdcvb =>
                        -- if (TO_INTEGER(UNSIGNED(cnt_tdcvb)) < T_TDCVB) then
                        --     cnt_tdcvb <= cnt_tdcvb + 1;
                        -- else
                        --     cnt_tdcvb <= (others => '0');
                        convst <= '0';
                        state <= st_busy;
                        -- end if;
                    when st_busy =>
                        if (busy = '0') then
                            fs <= '0';
                            -- XXX - The system clock period is longer than TDMSB (12 ns)
                            -- XXX - so we can go to st_spi_read directly.
                            state <= st_spi_read_bit;
                        end if;
                    when st_spi_read_bit =>
                        -- shift data in
                        --
                        for i in 0 to miso'length - 1 loop
                            memory(i) <= memory(i)(SPI_BITS - 2 downto 0) & miso(i);
                        end loop;
                        if (TO_INTEGER(UNSIGNED(cnt_spibits)) = SPI_BITS - 1) then
                            fs <= '1';
                            sclk_enable <= '0';
                            data_readability <= '1';
                            cnt_spibits <= (others => '0');
                            state <= st_idle;
                        else
                            cnt_spibits <= cnt_spibits + 1;
                            if (TO_INTEGER(UNSIGNED(cnt_spibits)) = 0) then
                                sclk_enable <= '1';
                            end if;
                        end if;
                    when st_reset =>
                        if (TO_INTEGER(UNSIGNED(cnt_reset)) < T_RESET) then
                            cnt_reset <= cnt_reset + 1;
                        else
                            reset <= '0';
                            cnt_reset <= (others => '0');
                            state <= st_idle;
                            wr_last <= '1';
                        end if;
                    when others =>
                        state <= st_idle;
                end case;
            end if;
        end if;  -- clk_reg'event
    end process state_clocked;

end behavior;

---- 

End of adc01.vhd
Altera_Forum
Honored Contributor II
1,641 Views

The test bench : adc01_tb.vhd 

----

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity adc01_tb is
end adc01_tb;

architecture behavior of adc01_tb is

    component adc01
        port (
            addr   : in    std_logic_vector (6 downto 0);   -- address bus, 7-bit
            data   : inout std_logic_vector (12 downto 0);  -- data bus
            rd     : in    std_logic;   -- read me, active low
            wr     : in    std_logic;   -- write me, active low
            bs     : in    std_logic;   -- board select, active low
            --
            hwst   : in    std_logic;   -- hardware start command
            --
            convst : out   std_logic;   -- conversion start
            -- stby : out std_logic;    -- standby mode, active low
            reset  : out   std_logic;   -- reset ADCs, active high,pulse width > 50 ns
            sclk   : out   std_logic;   -- SPI SCLK
            fs     : out   std_logic;   -- SPI SS, active low, ADCs' frame synchronization
            miso   : in    std_logic_vector (63 downto 0);  -- SPI MISO
            busy   : in    std_logic;   -- DAC busy, active high
            --
            clk    : in    std_logic    -- system clock
        );
    end component;

    -- signal addr : std_logic_vector (7 downto 0) := (others => '0');
    signal addr   : std_logic_vector (6 downto 0);
    signal data   : std_logic_vector (12 downto 0);
    signal rd     : std_logic := '1';
    signal wr     : std_logic := '1';
    signal bs     : std_logic := '1';
    --
    signal hwst   : std_logic := '1';
    --
    signal convst : std_logic;
    -- signal stby : std_logic;
    signal reset  : std_logic;
    signal sclk   : std_logic;
    signal fs     : std_logic;
    signal miso   : std_logic_vector (63 downto 0) := (others => '0');
    signal busy   : std_logic := '0';
    --
    signal clk    : std_logic := '0';

    signal test_data_1 : std_logic_vector (31 downto 0) := x"040003ff";

    -- Special purpose Addresses
    --
    constant ADDR_STATUS : integer := 0;
    -- constant ADDR_CMD : integer := 0;    -- The command register

    -- Command
    --
    constant CMD_CNVST : integer := 1;      -- Start conversion
    constant CMD_RESET : integer := 8191;   -- Reset ADCs, for 12-bit data bus

    constant CLK_PERIOD : time := 40 ns;    -- 25 MHz clock

begin

    uut : adc01 port map (
        addr   => addr,
        data   => data,
        rd     => rd,
        wr     => wr,
        bs     => bs,
        --
        hwst   => hwst,
        --
        convst => convst,
        -- stby => stby,
        reset  => reset,
        sclk   => sclk,
        fs     => fs,
        miso   => miso,
        busy   => busy,
        --
        clk    => clk
    );

    clock : process
    begin
        clk <= '0';
        wait for CLK_PERIOD/2;
        clk <= '1';
        wait for CLK_PERIOD/2;
    end process;

    stimulus : process
    begin
        -- Reset
        --
        wait for 11 ns;
        -- addr <= std_logic_vector(to_unsigned(ADDR_CMD, addr'length));
        data <= std_logic_vector(to_unsigned(CMD_RESET, data'length));
        wait for 13 ns;
        bs <= '0';
        wr <= '0';
        wait for 50 ns;
        wr <= '1';
        bs <= '1';
        wait for 100 ns;

        -- Start convert
        data <= std_logic_vector(to_unsigned(CMD_CNVST, data'length));
        wait for 13 ns;
        bs <= '0';
        wr <= '0';
        wait until convst='1';
        wait for 25 ns;         -- -+
        busy <= '1';            --  |
        wait for 50 ns;         -- -+- The max convertion time is 1.33 us
        busy <= '0';
        wr <= '1';
        bs <= '1';
        wait for 100 ns;

        -- Reset
        wait for 200 ns;
        data <= std_logic_vector(to_unsigned(CMD_RESET, data'length));
        wait for 50 ns;
        bs <= '0';
        wr <= '0';
        wait for 50 ns;
        wr <= '1';
        bs <= '1';
        wait for 100 ns;

        -- Start convert
        data <= std_logic_vector(to_unsigned(CMD_CNVST, data'length));
        wait for 13 ns;
        bs <= '0';
        wr <= '0';
        -- or hardware start
        --hwst <= '0';
        --wait for 50 ns;       -- -+
        wait until convst='1';
        wait for 25 ns;         -- -+
        busy <= '1';            --  |
        wait for 25 ns;         --  |
        wr <= '1';              --  |
        --hwst <= '1';          --  |
        wait for 1280 ns;       -- -+- The max convertion time is 1.33 us
        busy <= '0';

        wait until fs='0';
        wait for 12 ns;         --TDMSB
        for i in 0 to miso'length -1 loop
            miso(i) <= test_data_1(31);
        end loop;
        test_data_1 <= test_data_1(30 downto 0) & '0';

        for j in 0 to 30 loop
            wait until sclk='0';
            wait for 17 ns;     -- TPDDO
            for i in 0 to miso'length -1 loop
                miso(i) <= test_data_1(31);
            end loop;
            test_data_1 <= test_data_1(30 downto 0) & '0';
            wait until sclk='1';
        end loop;
        wait for 100 ns;

        -- Read data
        data <= (others => 'Z');
        addr <= std_logic_vector(to_unsigned(ADDR_STATUS, addr'length));
        wait for 7 ns;
        bs <= '0';
        rd <= '0';
        wait for 50 ns;
        rd <= '1';
        wait for 50 ns;

        for i in 0 to 127 loop
            addr <= std_logic_vector(to_unsigned(i, addr'length));
            wait for 8 ns;
            rd <= '0';
            wait for 50 ns;
            rd <= '1';
            wait for 50 ns;
        end loop;

        wait for 200 ns;
        bs <= '1';
        wait for 2 us;
    end process;

end;

----
Altera_Forum
Honored Contributor II
1,641 Views

Well, for a start, I would ditch the clk_reg signal altogether. It's going to make things break. Just clock everything with the system clock. 

The state_read process is also missing a lot of signals from its sensitivity list. This will cause a sim/synth mismatch, because synthesis ignores the sensitivity list and builds the logic from the process body alone. ALL signals read in the process should be in the list (you are missing state, wr_reg, addr_reg, memory). Also, as data and the variables are not assigned in ALL branches (i.e., you have no else cases), you will create latches, which cause problems for timing. 
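
As a rough sketch of both points (complete sensitivity list, and a default assignment so that no latch is inferred), here is a cut-down version of the read path. It is not a drop-in replacement for state_read: the entity name read_mux and the idle and word ports are invented for illustration, and the data_readability status bit is left out to keep it short.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, simplified read path; widths follow adc01.vhd.
entity read_mux is
    port (
        bs_reg   : in  std_logic;                      -- board select, active low
        rd_reg   : in  std_logic;                      -- read, active low
        wr_reg   : in  std_logic;                      -- write, active low
        idle     : in  std_logic;                      -- '1' when the state machine is idle
        addr_lsb : in  std_logic;                      -- selects upper or lower half-word
        word     : in  std_logic_vector(31 downto 0);  -- memory entry already selected
        data     : out std_logic_vector(12 downto 0)   -- tri-stated when not reading
    );
end entity read_mux;

architecture rtl of read_mux is
begin
    -- Every signal that is read inside the process is in the list,
    -- so simulation re-evaluates it whenever the hardware would.
    read_comb : process (bs_reg, rd_reg, wr_reg, idle, addr_lsb, word)
    begin
        -- Default assignment: data gets a value on every path,
        -- so the process describes pure combinational logic.
        data <= (others => 'Z');

        if bs_reg = '0' and wr_reg = '1' and rd_reg = '0' and idle = '1' then
            if addr_lsb = '0' then
                data <= word(28 downto 16);
            else
                data <= word(12 downto 0);
            end if;
        end if;
    end process read_comb;
end architecture rtl;
```

Because the default comes first and later assignments in the same process override it, there is no branch in which data keeps its old value, and that "keep the old value" behaviour is what makes a synthesis tool infer a latch.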

 

Why have you set sclk to clk_reg? What is the system clock speed? What is the SPI clock rate?
Altera_Forum
Honored Contributor II
1,641 Views

The ADC's SPI data is first driven by the falling edge of the FS signal, then by each falling edge of SCLK, which runs at the same rate as clk_reg. That is, the ADC sends out data on the falling edge of clk_reg, and the state machine reads data on the rising edge of clk_reg. The system clock (clk_reg) has a period of 40 ns (25 MHz). Therefore, there are 20 ns for the ADC new-data delay (tPPDO, 17 ns max), and > 20 ns for the first-bit delay (tDMSB, 12 ns max). 
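
The timing budget above can be written down as a small simulation-only check. This is only a sketch: the entity name spi_timing_budget is made up, and T_EXTRA is a hypothetical placeholder for any delay outside the ADC (FPGA clock-to-output, board traces, FPGA input path), which is not quantified anywhere in this thread.

```vhdl
-- Simulation-only sketch; not part of adc01.vhd.
entity spi_timing_budget is
end entity spi_timing_budget;

architecture sim of spi_timing_budget is
    constant CLK_PERIOD  : time := 40 ns;  -- 25 MHz system clock
    constant T_PPDO_MAX  : time := 17 ns;  -- SCLK falling edge to new data valid
    constant T_DMSB_MAX  : time := 12 ns;  -- FS low to MSB valid
    -- Assumption: extra delay outside the ADC. 0 ns reproduces the
    -- figures quoted in the post; a real board will have more.
    constant T_EXTRA     : time := 0 ns;

    constant HALF_PERIOD : time := CLK_PERIOD / 2;
    constant SLACK_DATA  : time := HALF_PERIOD - T_PPDO_MAX - T_EXTRA;
    -- Lower bound only: the post states more than half a period is available.
    constant SLACK_FIRST : time := HALF_PERIOD - T_DMSB_MAX - T_EXTRA;
begin
    check : process
    begin
        report "data bit slack  = " & time'image(SLACK_DATA);
        report "first bit slack = " & time'image(SLACK_FIRST);
        assert SLACK_DATA > 0 ns and SLACK_FIRST > 0 ns
            report "sampling edge arrives before the data is valid"
            severity failure;
        wait;  -- run once, then stop
    end process check;
end architecture sim;
```

With T_EXTRA left at 0 ns the margins are 20 - 17 = 3 ns for the data bits and at least 20 - 12 = 8 ns for the first bit.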

 

The ADC SPI timing diagram: 

https://www.alteraforum.com/forum/attachment.php?attachmentid=7748  

 

When reading data from the PC I/O (state_read), we need to check signals only when bs_reg and rd_reg are both low, so we watch only bs_reg and rd_reg and put only those two in the sensitivity list. We also tried disabling state_read and routing the SPI read result directly to the data bus, and confirmed that state_read does not affect the data shift problem. 

 

Most of the latches left in the state machine are considered to be required. However, we will continue to review the code that may be hiding problems. 

 

Thank you for the help!
Altera_Forum
Honored Contributor II
1,641 Views

You forget one important thing - synthesis ignores sensitivity lists. So if you miss signals out you only cause the simulation to be wrong compared to the hardware.
