How to constrain a source-synchronous design?

Altera_Forum · 12-13-2010

Hi All,

I am having problems setting up the SDC constraints for a source-synchronous interface in my design.

The design is described as follows.

--- The FPGA provides a reference clock (125 MHz) to a SERDES chip.

--- The SERDES chip outputs a clock (62.5 MHz) and a 10-bit data bus to the FPGA.

--- The 10-bit data should be sampled on both the rising and falling edges of the 62.5 MHz clock in the FPGA.

My question is how to write the set_input_delay constraints for my design.

I have attached the SERDES chip datasheet to this post.

Any and all help is very much appreciated.

Harris

Altera_Forum · 12-13-2010

Haven't looked at the .zip, but is the clock/data edge-aligned or center-aligned coming from the AMCC chip? If it's edge-aligned, are you using a PLL to phase-shift the clock 90 degrees? (Constraints can change based on how it's implemented, naturally, so I'm asking before writing out a quick example.)

Altera_Forum · 12-14-2010

Thanks for your response.

Your answer is helpful for me.

Would you please explain under what conditions edge-aligned applies and under what conditions center-aligned applies?

Thanks

Harris

Altera_Forum · 12-14-2010

If edge aligned:

create_clock -period 16.0 -name ssync_clk [get_ports ssync_clk_in]

derive_pll_clocks

derive_clock_uncertainty

create_clock -period 16.0 -name ssync_ext

set_input_delay -clock ssync_ext -max 2.0 [get_ports {ssync_data[*]}]

set_input_delay -clock ssync_ext -min -2.0 [get_ports {ssync_data[*]}]

set_input_delay -clock ssync_ext -max 2.0 -clock_fall -add_delay [get_ports {ssync_data[*]}]

set_input_delay -clock ssync_ext -min -2.0 -clock_fall -add_delay [get_ports {ssync_data[*]}]

That assumes the PLL is shifting the clock 90 degrees to center it on the data. (It doesn't have to do that, but things get trickier). I put in delays of 2.0 and -2.0. Basically that says the upstream chip and board delays will skew the data by +/-2.0ns in relation to the clock. You need to change those to the correct value.
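
As a rough sketch of where the +/-2.0 placeholders could come from: for an edge-aligned interface, the input delays are usually built from the upstream device's clock-to-data skew plus any board trace mismatch. Every number and variable name below is a hypothetical placeholder, not a value from the SERDES datasheet.

```tcl
# Hypothetical values in ns; replace with datasheet and board numbers.
set dev_skew_max  1.5   ;# latest data edge relative to the clock edge (assumed)
set dev_skew_min -1.5   ;# earliest data edge relative to the clock edge (assumed)
set brd_data_max  0.6   ;# longest data trace delay (assumed)
set brd_data_min  0.5   ;# shortest data trace delay (assumed)
set brd_clk_max   0.6   ;# longest clock trace delay (assumed)
set brd_clk_min   0.5   ;# shortest clock trace delay (assumed)

# Worst-case late and early data arrival relative to the clock at the FPGA pins
set in_max [expr {$dev_skew_max + $brd_data_max - $brd_clk_min}]
set in_min [expr {$dev_skew_min + $brd_data_min - $brd_clk_max}]

set_input_delay -clock ssync_ext -max $in_max [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -min $in_min [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -max $in_max -clock_fall -add_delay [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -min $in_min -clock_fall -add_delay [get_ports {ssync_data[*]}]
```

Keeping the raw numbers in variables makes it easier to see which part of the budget comes from the device and which from the board.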

# #################

If the upstream device sends its data center-aligned, then change the virtual clock to be:

create_clock -period 16.0 -waveform {4.0 12.0} -name ssync_ext

All that says is the clock is phase-shifted 90 degrees externally, so the clock is coming centered on the data eye. Your PLL shouldn't do a phase-shift now.

(Oh, and when creating the PLL, make sure it is in source-synchronous compensation mode. At these speeds it probably doesn't matter, as you can make timing anyway, but it's good practice)
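
Putting the center-aligned variant together, a minimal untested sketch. It reuses the port names and the +/-2.0 placeholder delays from the edge-aligned example and assumes the PLL applies no phase shift.

```tcl
# Clock arriving at the FPGA pin
create_clock -period 16.0 -name ssync_clk [get_ports ssync_clk_in]

# Virtual clock launching the data in the upstream device,
# offset 90 degrees (4 ns of a 16 ns period) from ssync_clk
create_clock -period 16.0 -waveform {4.0 12.0} -name ssync_ext

derive_pll_clocks
derive_clock_uncertainty

# Placeholder skew of +/-2.0 ns on both launch edges; change to the real values
set_input_delay -clock ssync_ext -max  2.0 [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -min -2.0 [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -max  2.0 -clock_fall -add_delay [get_ports {ssync_data[*]}]
set_input_delay -clock ssync_ext -min -2.0 -clock_fall -add_delay [get_ports {ssync_data[*]}]
```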

Altera_Forum · 12-14-2010

FYI, I always create a .tcl script called TQ_analysis that does something like:

report_timing -setup -npaths 100 -detail full_path -from [get_clocks ssync_ext] -panel_name "ssync_inputs||setup"

report_timing -hold -npaths 100 -detail full_path -from [get_clocks ssync_ext] -panel_name "ssync_inputs||hold"

I add this to the project directory and access it from the Scripts pull-down menu in TQ. It analyzes setup and hold on these input paths. You should see a 4ns setup relationship and a -4ns hold relationship. That means the transfer can handle +/-4ns of skew on the data. You'll then see 2ns iExt delays, which says +/-2ns of that relationship is chewed up by our external delays, leaving 2ns of skew for the FPGA. Again, you'll have to change the external values.

(Also, I've typed everything but not run it, so there probably will be typos)

Altera_Forum · 12-16-2010

Rysc,

Thanks for your reply.

I think I need some time to understand and try it.

Harris

Altera_Forum · 12-20-2010

My design outputs a 125 MHz clock and 10-bit data to the downstream device.

The 125 MHz output clock is an inverted copy of the clock from a dedicated clock pin.

The 10-bit data is launched on the rising edge of the original clock.

The downstream device requires setup = 1.2ns and hold = 0.2ns.

My constraint commands are as follows:

create_clock -period 8 -name clk [get_ports {clk_in}]

derive_pll_clocks

create_generated_clock -name clk_out -source [get_ports {clk}] -invert [get_ports {clk_out}]

set_output_delay -clock { clk_out } -max 2.0 [get_ports {da_t*}]

set_output_delay -clock { clk_out } -min -0.2 [get_ports {da_t*}]

TQ reports some setup slack.

Can anybody explain that for me?

Any and all help is very much appreciated.

Harris

Altera_Forum · 12-20-2010

My understanding of the output delay settings in TQ is:

Max = tSU + max data delay – min clk delay

Min = - tH + min data delay – max clk delay

Thus in your case it should be (ignoring board delays if clk delay = data delay):

max = 1.2

min = - .2

That gives a valid window of (your period - 1.2 - .2 = 6.6 ns), running from 0.2ns to 6.8ns after the clk edge and so centred at about 3.5ns from the clk edge.
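
A minimal sketch of those two formulas as SDC, using the clk_out clock and da_t* ports from the constraints earlier in the thread. The board delay variables are hypothetical and set to zero here, which corresponds to the "clk delay = data delay" case.

```tcl
set tsu 1.2   ;# downstream device setup time (ns), from the post above
set th  0.2   ;# downstream device hold time (ns), from the post above

# Hypothetical board trace delays in ns; all zero means matched clock and data
set brd_data_max 0.0
set brd_data_min 0.0
set brd_clk_max  0.0
set brd_clk_min  0.0

# Max =  tSU + max data delay - min clk delay
set out_max [expr {$tsu + $brd_data_max - $brd_clk_min}]
# Min = -tH  + min data delay - max clk delay
set out_min [expr {-$th + $brd_data_min - $brd_clk_max}]

set_output_delay -clock clk_out -max $out_max [get_ports {da_t*}]
set_output_delay -clock clk_out -min $out_min [get_ports {da_t*}]
```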

Altera_Forum · 12-20-2010

When you say there is some setup slack, I assume you mean "negative slack", i.e. it's failing?

You have a 5ns setup relationship (10ns clk period with launch at 5ns and latch at 10ns), and an external delay of 2.0 (I'm not sure how it went from 1.2 Tsu to 2.0, but that may be board delay, clock skew, margin, etc.). Anyway, that ends up with 3ns for the FPGA, which it should be able to meet pretty easily, since the clock and data delays should match.

Can you look at the data path going out (Data Arrival) and clock path going out (Data Required) and see if there is something physically different? The one thing I see commonly is that users use the dedicated PLL clock output, rather than just a regular I/O. That path doesn't use a global (which would show up as type CLKCTRL in your data required path). Maybe post the timing report here.
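
One way to pull up those two paths, as a sketch in the same style as the input-side report script. The da_t* port pattern comes from the constraints above; the panel names are arbitrary.

```tcl
# Full data arrival and data required paths for the output bus
report_timing -setup -npaths 20 -detail full_path -to [get_ports {da_t*}] -panel_name "da_t outputs||setup"
report_timing -hold -npaths 20 -detail full_path -to [get_ports {da_t*}] -panel_name "da_t outputs||hold"
```

In each report, comparing the clock portion of the Data Arrival path with the Data Required path shows whether the two clock routes differ physically.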

Altera_Forum · 12-20-2010

--- Quote Start ---

You have a 5ns setup relationship(10ns clk period with launch at 5ns and latch at 10ns), and

--- Quote End ---

Can't understand what you say at all.

tSU is 1.2ns, tH is .2ns, clk period is 8ns, and it's an SDR interface, I presume. I don't know where you got the figures you used to override my post.

Altera_Forum · 12-20-2010

Yep. Hadn't had my coffee. I think I read 10-bit data as 10ns period. Anyway, with an 8ns period, launch at 4ns and latch at 8ns, that's a 4ns setup relationship. The external delay in your constraint is 2ns, which leaves 2ns for the FPGA, i.e. the data could be up to 2ns longer to get out of the FPGA than the clock and still meet setup timing. If matched, I believe this should be do-able. Would have to see the timing analysis to know why.

Altera_Forum · 12-20-2010

The issue is not just that you read 8ns as 10ns, but how you divide the time into launch at 4ns and latch at 8ns, and on what basis. Apologies, but it sounds meaningless to me.

Altera_Forum · 12-20-2010

It's an 8ns clock, and he said he inverts the clock going off chip. I assume the launch clock is not inverted. So if you draw the launch and latch clocks, they're both 8ns, but the launch is rising at 0ns, while the latch is rising at time 4ns. So it's really launch at time 0ns and latch at time 4ns, but it still results in a 4ns setup relationship.
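
The arithmetic can be checked with a few lines of plain Tcl. This is only a sketch of the edge bookkeeping, not something TimeQuest needs; the proc name is made up, and the 2.0 is the set_output_delay -max value from the constraints above.

```tcl
# Setup budget left for the FPGA: (latch edge - launch edge) - external max delay
proc fpga_setup_budget {launch latch ext_max} {
    return [expr {($latch - $launch) - $ext_max}]
}

set period  8.0
set ext_max 2.0
set launch  0.0

# Forwarded clock not inverted: next latch edge is one full period later
puts "not inverted: [fpga_setup_budget $launch [expr {$launch + $period}] $ext_max] ns"

# Forwarded clock inverted: next latch edge is half a period later
puts "inverted:     [fpga_setup_budget $launch [expr {$launch + $period / 2.0}] $ext_max] ns"
```

With these numbers the non-inverted case leaves 6.0 ns for the FPGA and the inverted case leaves 2.0 ns, which matches the 4ns setup relationship described above.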

Altera_Forum · 12-20-2010

I don't agree with that. Inverted or not, it is the output data clk and all delays are related to it.

If the internal data clk is inverted on the way out, and as long as the delays are related to the output clk, the same argument applies:

The data must not violate the receiving device's tSU of 1.2ns, hence it can be delayed until at most that long before the next clk edge.

It must not violate the tH of .2ns, i.e. it must not change that soon after the edge. So it is a straightforward SDR synchronous interface with a 6.6ns valid window.

--tH--|------- valid window ------tSU|

Altera_Forum · 12-20-2010

Inverted or not makes a big difference. As an aside, I noticed the generated clock assignment on the output port has the -invert option. That is only needed if TimeQuest doesn't know there's an inversion going on inside the FPGA. For example, if you do a 180 degree phase shift with the PLL output driving the clock out, or just add an inverter in the path, it knows it is being inverted. The only time I have used the -invert option is when I drive a clock out through an altddio function, but tie the high register to '0' and the low register to '1', which inverts the clock in a way that TimeQuest/Quartus can't recognize.
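
To illustrate the point about -invert, a minimal sketch of the generated clock without it. This assumes the clock runs from the clk_in port through an ordinary inverter in the netlist to the clk_out port, so TimeQuest can see the inversion itself. If a PLL sits in the path, the -source would more likely be the PLL output pin, whose name is not given in this thread.

```tcl
create_clock -period 8 -name clk [get_ports {clk_in}]
derive_pll_clocks

# No -invert here: the inversion is assumed to be visible in the netlist,
# so TimeQuest derives the inverted waveform on its own.
create_generated_clock -name clk_out -source [get_ports {clk_in}] [get_ports {clk_out}]
```

Reporting the clocks afterwards and checking the rise and fall times of clk_out is a quick way to confirm whether the inversion was picked up.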

-----

kaz, I'm not following your last post. I agree with your analysis of the data window, but whether the clock being sent out with the data is inverted or not makes a big difference. If it's not inverted, then there is an 8ns setup relationship. If it is inverted, then it is 4ns. This makes a big difference to how it's analyzed and what Quartus must do to meet timing. You've shown Th--valid--tSu in relation to two edges of the latch clock, but there is no reference point. If it's not inverted, those latch edges are at 0, 8, 16, but if it's inverted then those edges are at 4, 12, etc.

Altera_Forum · 12-20-2010

If data clk is inverted or rotated by PLL to any degree, it is then only a helping hand for timing closure to achieve the requirements.

The requirement is tSU/tH non violation at external device with respect to received data clk.

By entering max delay you inform TQ of tSU and min delay for tH.

TQ sees only the delay figures, irrespective of what they mean to the user.

TQ is told about these values with respect to data clk and this is what is required.

TQ knows (or should know) what is clocking the data registers in the FPGA (I/O registers or so).

We are talking about the pin perspective. Any violation at the I/O register is a different matter (end of the FPGA chain). If it does occur, then a PLL or inversion may help.

Altera_Forum · 12-20-2010

Rephrasing, to make sure I have it right: you're saying the external delay should be the same regardless of whether they invert the clock or not. I completely agree. But he said he's failing timing (I think that was said). The analysis of that is partially "are my constraints correct" and, if they are, then "what's going on inside the FPGA that can't meet my constraints". It's the latter part I'm moving on to, i.e. I believe the constraints are right (except maybe that -invert) and I want to see what's going on in the FPGA that causes it to fail timing.

Altera_Forum · 12-20-2010

I believe that by "external delay" you meant the set_output_delay figures.

I don't see anything wrong with the output constraints of -.2 & 2.0 instead of -.2 & 1.2 since the valid window is further narrowed to lower the skew.

What I explained was my personal understanding of skew relation to the valid window. The valid window is the primary requirement and skew comes secondary as a subset of that window.

Altera_Forum · 12-21-2010

My design is source-synchronous. According to Altera's method for calculating input and output delays, the output -max should be Tsu.

The output -min should be -Th. How should I deal with the skew?

I don't think I understand the concept of the data valid window correctly.

I have uploaded a mini version of my project. This mini version has the same problem as my project.

Would you please take a look at this version?

Thanks

Harris

Altera_Forum · 12-21-2010

We must think in terms of the system-centric approach, i.e. we need to get the timing right for the external device. The max/min delay entries convey that information to TQ.

Regarding skew: first notice that the Altera examples in their resource centre actually target skew at the FPGA boundary, i.e. the FPGA as a chip and not as part of any system, which is of no use to you and probably to anybody unless you want to sell your FPGA on its own.

To control skew within the system-centric approach, you must first not violate the external device's requirements, but you may need to exaggerate tSU/tH in order to narrow the wide valid window you get from entering the honest device figures.

Altera_Forum · 12-21-2010

The more I see of the TimeQuest examples and documentation, the more I feel there are errors and editorial bugs.

I just came across this example:

http://www.altera.com/support/examples/timequest/exm-tq-basic-source-sync.html

According to this example: set_output_delay is the same as the traditional setting of tCO at pins (data transition with respect to output clock pin).

In fact, this is much easier, since I can calculate max/min from the tSU/tH of the device as we did in the classic timing analyser for years.

The controversy is this: how on earth can the delay be tCO in this example but tSU (for max) or -tH (for min) in the equations? Surely that is impossible to comprehend.

For example, in this post, tSU is 1.2ns and the clk period is 8ns, hence data can be delayed by a tCO of as much as 8 - 1.2 = 6.8ns, and this should be the maximum delay between the output clk and its data transition (assuming zero board delay difference between clk and data) so that it does not violate tSU at the latching edge.

Meanwhile data can be as early as a tCO of .2ns to avoid a tH violation at the external device, and this should be the minimum.

Accordingly, the equations should be (max = UI - tSU, min = +tH)

instead of:

max = tSU

min = - tH

Any comments welcome.
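
One way to see how the two sets of equations relate, sketched in plain Tcl. It assumes a non-inverted forwarded clock (8ns setup relationship, 0ns hold relationship) and matched board delays, and it ignores clock uncertainty. set_output_delay describes the external side of the budget, so the limits on the FPGA's own clock-to-output fall out by subtracting the delay from the latch edge.

```tcl
set period 8.0
set tsu    1.2
set th     0.2

# External delays as they would be entered in set_output_delay
set out_max $tsu            ;# -max  1.2
set out_min [expr {-$th}]   ;# -min -0.2

# Setup check: data must reach the pin no later than latch edge - max delay
set latest   [expr {$period - $out_max}]
# Hold check: data must not change before hold edge (0) - min delay
set earliest [expr {0.0 - $out_min}]

puts "data may change between $earliest ns and $latest ns after the output clock edge"
```

Under those assumptions the limits come out at 0.2ns and 6.8ns, i.e. the same UI - tSU and +tH figures, just expressed from the FPGA side rather than the external side.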