Stratix III FPGA: Problem using Asynchronous FIFOs

Altera_Forum
Honored Contributor II
6,892 Views

Hi, 

 

 

I'm using a Stratix III FPGA and have built a design with two interconnected asynchronous FIFOs. When I run the post-place-and-route simulation in ModelSim, it fails at high frequencies. However, when I compile the design in Quartus, the reported maximum frequency is three times higher than what works in simulation. Why does this happen? 

 

 

The FIFOs are configured as follows: 

 

  • Two clocks 

  • Asynchronous clear 

  • Width: 128 bits 

  • Depth: 4 words 

  • Show ahead synchronous FIFO mode 

 

The VHDL files are attached. 

 

Thank you in advance.
37 Replies
Altera_Forum
Honored Contributor II
2,639 Views

How fast are we talking, and what Fmax did TimeQuest report? 

 

The failure might be occurring across the clock domain boundaries. IIRC, you have to tell TimeQuest about these crossings yourself.
Altera_Forum
Honored Contributor II
2,639 Views

The frequencies I want to reach are 200 MHz (CLK) and 280 MHz (CLK2). The Fmax values reported by TimeQuest are 359.07 MHz (CLK) and 429.74 MHz (CLK2). I use an SDC file in Quartus where I set the timing constraints: 4 ns for CLK and 3 ns for CLK2. 

 

create_clock -period 4 -name clk [get_ports clk] 

create_clock -period 3 -name clk2 [get_ports clk2] 

derive_clock_uncertainty
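
As a side note on the numbers above: a 4 ns period corresponds to 250 MHz and a 3 ns period to about 333 MHz, so these constraints are tighter than the stated 200 MHz and 280 MHz targets. A minimal sketch of constraints that match the targets, assuming the port names clk and clk2 from the snippet above (the 3.571 ns value is my own rounding of 1/280 MHz, not something from the original file):

```tcl
# Periods derived from the target frequencies: period_ns = 1000 / f_MHz
# 200 MHz -> 5.000 ns
# 280 MHz -> 3.571 ns (rounded down, so marginally tighter than 280 MHz)
create_clock -period 5.000 -name clk  [get_ports clk]
create_clock -period 3.571 -name clk2 [get_ports clk2]
derive_clock_uncertainty
```

Over-constraining like the original 4 ns / 3 ns values is not wrong in itself; it just asks the fitter for more margin than the targets require.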
Altera_Forum
Honored Contributor II
2,639 Views

Question: what exactly do you observe in ModelSim that tells you it is not working at high speed? 

Note that the FMax reported by Quartus applies for the internal register to register paths. The external input to register and register to output paths are not included, as they are governed by set_input_delay and set_output_delay SDC constraints. 

Did you add a set_clock_groups to cut timing between the two clocks?
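
For reference, a minimal sketch of what such a cut could look like, assuming the clocks are named clk and clk2 as in the SDC above and really are unrelated. The commented-out set_false_path pair is there only to illustrate roughly what the single set_clock_groups line achieves:

```tcl
# Declare clk and clk2 asynchronous to each other:
# TimeQuest no longer analyses paths between the two domains.
set_clock_groups -asynchronous -group {clk} -group {clk2}

# Roughly the same cut, written one direction at a time:
# set_false_path -from [get_clocks clk]  -to [get_clocks clk2]
# set_false_path -from [get_clocks clk2] -to [get_clocks clk]
```

Note that cutting the paths only removes them from the analysis; the crossing itself still needs proper synchronizers.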
Altera_Forum
Honored Contributor II
2,639 Views

I'm using I/O files to test my design. First of all, the output data are not the same as the input data. When I check the internal signals, I see that some control signals behave incorrectly: the Write and Read signals assert before the Full and Empty ones (you can see it in the attached figure). 

 

How are set_input_delay and set_output_delay used? What do you mean by set_clock_groups?
Altera_Forum
Honored Contributor II
2,639 Views

data_in probably has a set-up or hold violation, which explains the different output data. 

 

For the additional SDC constraints you'd probably better read the manuals on TimeQuest. 

 

There is another possible pitfall: I remember a colleague telling me that a dcfifo of depth 4 doesn't work well, so he always uses a minimum depth of 8, and so do I.
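
To give a rough idea of how set_input_delay and set_output_delay are used: they describe the timing of the devices outside the FPGA, relative to a clock. A minimal sketch, assuming a 5 ns clk, ports named data_in and data_out, and purely hypothetical external-device numbers (the 2.0/1.0 ns and 1.5/0.5 ns values are made up for illustration and board trace delay is ignored):

```tcl
create_clock -name clk -period 5.000 [get_ports clk]

# Input side: data reaches the FPGA pin between 1.0 ns and 2.0 ns
# after the launching clock edge (external clock-to-output delay).
set_input_delay -clock clk -max 2.0 [get_ports {data_in[*]}]
set_input_delay -clock clk -min 1.0 [get_ports {data_in[*]}]

# Output side: the external receiver needs 1.5 ns of setup and 0.5 ns of hold.
# -max carries the setup requirement, -min carries the negated hold requirement.
set_output_delay -clock clk -max  1.5 [get_ports {data_out[*]}]
set_output_delay -clock clk -min -0.5 [get_ports {data_out[*]}]
```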
Altera_Forum
Honored Contributor II
2,639 Views

I changed the depth of the FIFOs to 8 words, and the design works fine in the Fast (0 °C) simulation, but it fails in the Slow (85 °C) simulation.

Altera_Forum
Honored Contributor II
2,639 Views

I've written an SDC file and compiled the design, but it doesn't meet the timing constraints. I set the input and output delays to zero. Have I written the file correctly? 

 

# ******************************************************* 

# Created Clocks 

# ******************************************************* 

 

create_clock -name clk -period 5 [get_ports clk] 

create_clock -name clk2 -period 3.5 [get_ports clk2] 

derive_clock_uncertainty 

 

# ******************************************************* 

# Created Generated Clocks 

# ******************************************************* 

 

derive_pll_clocks 

 

# ******************************************************* 

# Set Clock Groups 

# ******************************************************* 

 

set_clock_groups -asynchronous -group {clk} -group {clk2} 

 

# ******************************************************* 

# Set False Paths 

# ******************************************************* 

 

set_false_path -from [get_ports rst] 

 

# ******************************************************* 

# Input Delays 

# ******************************************************* 

 

set_input_delay -clock clk -max 0.0 [get_ports data_in[*]] 

set_input_delay -clock clk -min 0.0 [get_ports data_in[*]] 

 

set_input_delay -clock clk -max 0.0 [get_ports full_in] 

set_input_delay -clock clk -min 0.0 [get_ports full_in] 

 

set_input_delay -clock clk -max 0.0 [get_ports empty_in] 

set_input_delay -clock clk -min 0.0 [get_ports empty_in] 

 

 

# ******************************************************* 

# Output Delays 

# ******************************************************* 

 

set_output_delay -clock clk -max 0.0 [get_ports wr[0]] 

set_output_delay -clock clk -min 0.0 [get_ports wr[0]] 

set_output_delay -clock clk2 -max 0.0 [get_ports wr[1]] 

set_output_delay -clock clk2 -min 0.0 [get_ports wr[1]] 

 

set_output_delay -clock clk2 -max 0.0 [get_ports rd[0]] 

set_output_delay -clock clk2 -min 0.0 [get_ports rd[0]] 

set_output_delay -clock clk -max 0.0 [get_ports rd[1]] 

set_output_delay -clock clk -min 0.0 [get_ports rd[1]] 

 

set_output_delay -clock clk -max 0.0 [get_ports full[0]] 

set_output_delay -clock clk -min 0.0 [get_ports full[0]] 

set_output_delay -clock clk2 -max 0.0 [get_ports full[1]] 

set_output_delay -clock clk2 -min 0.0 [get_ports full[1]] 

 

set_output_delay -clock clk2 -max 0.0 [get_ports empty[0]] 

set_output_delay -clock clk2 -min 0.0 [get_ports empty[0]] 

set_output_delay -clock clk -max 0.0 [get_ports empty[1]] 

set_output_delay -clock clk -min 0.0 [get_ports empty[1]] 

 

set_output_delay -clock clk2 -max 0.0 [get_ports data_mid[*]] 

set_output_delay -clock clk2 -min 0.0 [get_ports data_mid[*]] 

 

set_output_delay -clock clk -max 0.0 [get_ports data_out[*]] 

set_output_delay -clock clk -min 0.0 [get_ports data_out[*]]
Altera_Forum
Honored Contributor II
2,639 Views

Looks quite OK to me. 

You could do without derive_pll_clocks, since I believe you do not have a PLL in your design. 

You can group pins together, so you have less typing to do: { wr[0] rd[0] full[0] ... } and the [get_ports ... ] is not necessary. 

 

I usually set some small values for the input_delays, say -min 1.000 ns and -max 2.00 ns. 

If it is a submodule I'm testing, I don't care about the output delays and set them all as false paths. 

 

What timing errors do you get?
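
A minimal sketch of both suggestions, assuming the port names from the SDC file posted above and the 1 ns / 2 ns example values. I keep get_ports here just to make the collection explicit:

```tcl
# One command per constraint instead of one per port
set_input_delay -clock clk -min 1.000 [get_ports {data_in[*] full_in empty_in}]
set_input_delay -clock clk -max 2.000 [get_ports {data_in[*] full_in empty_in}]

# Sub-module under test: take the outputs out of the analysis altogether
# (this replaces the set_output_delay lines rather than adding to them)
set_false_path -to [get_ports {wr[*] rd[*] full[*] empty[*] data_mid[*] data_out[*]}]
```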
Altera_Forum
Honored Contributor II
2,639 Views

I got setup errors on both clocks, but when I removed the set_output_delay commands they disappeared. Why does this happen? 

 

I now get these frequencies: 224.52 MHz (CLK) and 378.36 MHz (CLK2). They seem too low for a design made of two FIFOs. What is the maximum frequency for the DCFIFO? 

 

I generate the clocks in a testbench. Do I have to treat them as virtual clocks when I use the set_input_delay and set_output_delay commands?
Altera_Forum
Honored Contributor II
2,639 Views

 

--- Quote Start ---  

I got setup errors on both clocks, but when I removed the set_output_delay commands they disappeared. Why does this happen? 

 

--- Quote End ---  

 

 

Because it could not meet the requested output timing, i.e. it could not guarantee proper setup times for the external devices.  

 

--- Quote Start ---  

 

I now get these frequencies: 224.52 MHz (CLK) and 378.36 MHz (CLK2). They seem too low for a design made of two FIFOs. What is the maximum frequency for the DCFIFO? 

 

--- Quote End ---  

 

CLK has to deal with input signals, so it has to factor in the input delays. CLK2 only deals with internal signals (as the output has a false path). Hence the difference between the two clocks. 

 

--- Quote Start ---  

 

I generate the clocks in a testbench. Do I have to treat them as virtual clocks when I use the set_input_delay and set_output_delay commands? 

--- Quote End ---  

 

No, they are real clocks; they just happen to be generated by a testbench.
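
For contrast, a virtual clock is simply a create_clock with no target, used when the external device is clocked by something that never enters the FPGA. A minimal sketch, where clk_ext is a hypothetical name and the delay values are placeholders:

```tcl
# Real clock: has a target port in the design
create_clock -name clk -period 5.000 [get_ports clk]

# Virtual clock: no target, exists only as a reference for I/O constraints
create_clock -name clk_ext -period 5.000

set_input_delay -clock clk_ext -max 2.000 [get_ports {data_in[*]}]
set_input_delay -clock clk_ext -min 1.000 [get_ports {data_in[*]}]
```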
Altera_Forum
Honored Contributor II
2,639 Views

I got those frequencies from the TimeQuest Timing Analyzer in Quartus, but when I test my design in ModelSim (with the .vho and .sdo files) it doesn't work at those frequencies. Could the frequencies be wrong? 

 

Referring to the sdc file, how can I set the output delays?
Altera_Forum
Honored Contributor II
2,639 Views

 

--- Quote Start ---  

I got those frequencies from the TimeQuest Timing Analyzer in Quartus, but when I test my design in ModelSim (with the .vho and .sdo files) it doesn't work at those frequencies. Could the frequencies be wrong? 

 

--- Quote End ---  

 

 

No, the frequencies are correct. But you need to generate your input signals carefully; in other words, they must respect the setup times reported by the Datasheet Report in TimeQuest. 

Now, I try to avoid gate-level simulation and rely on TimeQuest instead. A: because of the input setup requirements and the output delays. B: you cannot probe every signal, as the ones that were optimized away are no longer there. Furthermore, the ones you can watch are 'real time' and are shifted with regard to the input clocks, so if you are debugging a design it gets difficult to see cause and effect. And C: it is painfully slow compared to RTL simulation. Of course, if you are designing a physical module like a memory PHY, you ultimately have to run gate-level simulation, although TimeQuest can tell you all about it. 

 

--- Quote Start ---  

 

Referring to the sdc file, how can I set the output delays? 

--- Quote End ---  

 

You could read Rysc's TimeQuest document; you can find it on the wiki: TimeQuest User Guide (http://www.alterawiki.com/wiki/timequest_user_guide)
Altera_Forum
Honored Contributor II
2,639 Views

My design works in functional simulation, so I'm trying to test it in gate-level simulation before I load it onto the board. But it works only in the best case (Fast, 0 °C) and fails in the others, so it isn't going to work in the FPGA.

Altera_Forum
Honored Contributor II
2,639 Views

 

--- Quote Start ---  

My design works in functional simulation, so I'm trying to test it in gate-level simulation before I load it onto the board. But it works only in the best case (Fast, 0 °C) and fails in the others, so it isn't going to work in the FPGA. 

--- Quote End ---  

 

 

I think the problem is in the testbench. I have a different style, and I don't immediately see how the input signals behave. Maybe you can send me a .qar (with the inputdata.txt file)? I'll take a deeper look then.
Altera_Forum
Honored Contributor II
2,639 Views

OK, I have sent the files to you. Thank you very much.

Altera_Forum
Honored Contributor II
2,639 Views

Dear Drumont, 

When there's a timing violation, ModelSim puts a yellow marker in the waveform window and prints a warning message to the log. 

Which register is suffering the timing violation? 

 

Regarding I/O constraints: to get you started with simulation setting the max & min delays to zero is good enough. 

Of course, you need to make sure that it's true. That is, the inputs your testbench provides to your UUT must be aligned with the clock your testbench provides to the UUT. 

 

If I read the .VHD correctly, you're using a FIFO with a 5-stage synchronizer, correct? 

IIRC, DCFIFO actually has two different use cases: 

a) 1 synchronizer stage, for when the write and read clocks are related 

b) 2 or more synchronizer stages, for unrelated clocks 

 

If you go with option "b", the DCFIFO function automatically sets false path exceptions on some of its internal clock-crossing logic -- it's not possible to do timing analysis on unrelated clocks anyway. 

Thus, it's possible to get a green light from TimeQuest and see failures in gate level simulation. 

 

If your use case is actually one of related write and read clocks, then just go with option "a" and set the number of stages to 1. 
 
OTOH, if your use case is one of unrelated write and read clocks, then you just have to bite the bullet: timing violations on the first synchronizer register are unavoidable. A metastable event can still slip through on rare occasions, but each additional synchronizer stage makes that far less likely. 

 

Regarding gate-level simulation: if you're getting timing violations on synchronization registers, you'll just have to work around them by disabling timing checks for those registers with the tcheck_set command. 

Important note: make sure you understand what you're doing. You should only need to disable timing checks on the first synchronization registers. If you see timing violations elsewhere, then it's a different problem. 

 

I suggest you read Altera's "Understanding Metastability in FPGAs" too.
Altera_Forum
Honored Contributor II
2,639 Views

There are seven timing violations, and they come from both FIFOs. Here are the parts of the log where you can see them: 

 

 

SETUP Low VIOLATION ON aclr WITH RESPECT TO clk; 

 

/tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|fifo_ram|ram_block9a108\/addr_b_register 

 

SETUP Low VIOLATION ON aclr WITH RESPECT TO clk; 

 

/tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|fifo_ram|ram_block9a72\/addr_b_register 

 

SETUP Low VIOLATION ON aclr WITH RESPECT TO clk; 

 

/tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|fifo_ram|ram_block9a0\/addr_b_register 

 

SETUP Low VIOLATION ON aclr WITH RESPECT TO clk; 

 

/tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|fifo_ram|ram_block9a36\/addr_b_register 

 

HOLD High VIOLATION ON DATAIN WITH RESPECT TO CLK 

 

/tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|rs_dgwp|dffpipe10|dffe11a[1] 

 

SETUP High VIOLATION ON ASDATA WITH RESPECT TO CLK; 

 

/tb_fifo/UUT/\fifo2_cmp|dcfifo_component|auto_generated|ws_dgrp|dffpipe14|dffe15a[0] 

 

SETUP High VIOLATION ON ASDATA WITH RESPECT TO CLK; 

 

/tb_fifo/UUT/\fifo2_cmp|dcfifo_component|auto_generated|ws_dgrp|dffpipe14|dffe15a[3]
Altera_Forum
Honored Contributor II
2,639 Views

Yes, I'm using two asynchronous FIFOs with a 5-stage synchronizer, so option b applies to me. Why isn't it possible to do timing analysis on unrelated clocks? Does it mean I can't do a correct gate-level simulation?

Altera_Forum
Honored Contributor II
2,639 Views

It's not possible to perform timing analysis on unrelated clocks because, well, they're unrelated: they may drift, or they may have an unknown phase relationship. 

 

Static timing analysis needs to know the timing of the edges of the clocks in respect to one another. 

 

Case 1: consider CLK1 and CLK2, which are generated from 2 different 10 MHz crystals. Since crystals are never perfect, one may be 10.00001 MHz while the other may be 9.999999 MHz. 

This means the clock edges will drift in relation to one another. 

You can't do timing analysis on transfers between these two clocks. 
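
To put a number on the drift, here is a small plain-Tcl sketch using the two example frequencies above (nothing Quartus-specific is assumed; it can be run in any tclsh):

```tcl
# Two nominally 10 MHz crystals from the example above
set f1_hz 10000010.0 ;# 10.00001 MHz
set f2_hz  9999999.0 ;# 9.999999 MHz

# The edges slip by one full period every 1/|f1 - f2| seconds
set beat_hz [expr {abs($f1_hz - $f2_hz)}]
set slip_ms [expr {1000.0 / $beat_hz}]

puts [format "difference: %.1f Hz" $beat_hz]
puts [format "edges realign every %.1f ms" $slip_ms]
```

So within roughly 91 ms the two clocks sweep through every possible phase relationship, including the ones that violate setup or hold.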

 

Case 2: consider CLK1 and CLK2, which are generated from the same crystal, but CLK2 passes through an unknown length of cable, so you can't know the phase of CLK1 in relation to CLK2. 

Again, you can't do timing analysis on transfers between these two clocks. 

 

Case 3: consider CLK1 and CLK2, which are generated from the same crystal and go through some propagation chain, BUT in this case you know the phase of CLK1 with respect to CLK2 at the FPGA's input pins, with a small error. 

Now you have two related clocks and transfers between these two clocks can be analysed -- you can use a set_clock_uncertainty constraint to handle the error. 

In this situation, you should go with option "A". 
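
For Case 3, a minimal sketch of how the known phase error might be expressed, assuming clocks named clk and clk2 and a hypothetical 0.2 ns of phase error between them:

```tcl
# Related clocks: keep the transfers in the analysis (no set_clock_groups)
# and add the phase error as extra uncertainty in both directions.
set_clock_uncertainty -from [get_clocks clk]  -to [get_clocks clk2] 0.200
set_clock_uncertainty -from [get_clocks clk2] -to [get_clocks clk]  0.200
```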

 

If you do have unrelated clocks, then your only option for a gate level simulation is to disable timing checks on the synchronization registers.
Altera_Forum
Honored Contributor II
2,590 Views

PS: You can ignore those SETUP violations of ACLR with respect to CLK, as long as the reset pulse is longer than 1 clock period. 

 

It's the other ones which are hurting you. 

You need to look into the RTL or post-mapping schematics, find out which registers form the first level of synchronization, and then disable timing checks on them. 

 

I think it will be something like  

tcheck_set /tb_fifo/UUT/\fifo_cmp|dcfifo_component|auto_generated|rs_dgwp OFF 

 

But I stress again, it's important that you know what you're doing and only disable checks for the first level of synchronization registers. 

If you disable timing checks for other registers, you may just end up hiding other bugs that may exist.