What is the best practice for meeting the 500 ps data-to-clock output skew of the RGMII v2.0 specification on a Cyclone V?
Synthesizing a TSE MAC with an RX-clock to TX-clock loopback fails timing. With a 2 ns delay applied at the PHY and my current constraints, the FPGA has a 600 ps setup and 900 ps hold budget for the TX path. The worst failing path runs from the looped-back RX-clock pin to one of the TX-data output pins and misses timing by 360 ps.
For more timing details please see my other post: Re: Cyclone V TSE MAC timing closure - Intel Community
The Cyclone V device datasheet lists, in Table 48 (RGMII Timing Characteristics), Td (TX_CLK to TXD/TX_CTL output data delay) as -0.85 ns to +0.15 ns. So with the correct clock-to-data delay it should be possible to reach the 500 ps output skew.
Any help on this would be greatly appreciated!
Jodok
Hi Jodok,
Check this KDB link: https://www.intel.com/content/www/us/en/support/programmable/articles/000079123.html. It is recommended to select a PHY with the ability to adjust its input timing.
Thanks,
Regards,
Sheng
Hi Sheng,
Thank you for your response!
How does one then account for the "asymmetrical" characteristic of the delay introduced by the PHY?
Could you please review the following constraints for a DDR transmission?
max_delay = tData(max) + tSU - (tPeriod/2 - tPHYDELAY)
min_delay = tData(min) - tH - (tPeriod/2 - tPHYDELAY)
Thank you very much.
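As a rough illustration, the two formulas above can be transcribed literally into Tcl, the language SDC files use. The proc name `ddr_tx_delays` and its argument names are hypothetical, and the numbers in the usage line are borrowed from values quoted elsewhere in this thread (8 ns period, 2 ns PHY delay, 1.05 ns setup, 0.8 ns hold, +/- 0.33 ns trace delay). Whether the formulas themselves are the right model is exactly the question being asked here.

```tcl
# Sketch only: a literal transcription of the two formulas above.
# All names are hypothetical; all times are in ns.
proc ddr_tx_delays {tDataMax tDataMin tSU tH tPeriod tPhyDelay} {
    # Term shared by both formulas: tPeriod/2 - tPHYDELAY
    set shift [expr {$tPeriod / 2.0 - $tPhyDelay}]
    set maxDelay [expr {$tDataMax + $tSU - $shift}]
    set minDelay [expr {$tDataMin - $tH - $shift}]
    return [list $maxDelay $minDelay]
}

# Example values taken from this thread; combining them is an assumption.
lassign [ddr_tx_delays 0.33 -0.33 1.05 0.8 8.0 2.0] maxDelay minDelay
puts [format "max_delay = %.3f ns, min_delay = %.3f ns" $maxDelay $minDelay]
```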
Hi,
I think there should be no problem with your DDR transmission constraint, based on this link: https://www.intel.com/content/www/us/en/support/programmable/support-resources/design-examples/horizontal/exm-tse-rgmii-phy.html
I think the DDR is center-aligned (tPeriod/2).
The term (tPeriod/2 - tPHYDELAY) reflects the shifted clock edge, so that the setup and hold times are still met despite the delay.
Dear Sheng,
Thank you for reviewing the constraint.
Indeed, tPeriod/2 represents the 90° phase shift that changes from edge-aligned Tx at the MAC to center-aligned Rx at the PHY.
Does that mean the Quartus synthesis tool cannot fit the logic with the internal delay needed to meet timing without an additional PHY delay? Is it really the Td (TX_CLK to TXD/TX_CTL output data delay) of -0.85 ns to +0.15 ns that Quartus can't deal with?
I haven't fully figured it out yet, but your answers give me hope that I am close to the solution.
If you could take a moment to look at the calculations I have attached to this post and tell me briefly whether the method is reasonable, I would really appreciate it.
Hi,
It seems that the missing PHY delay is what causes the violation.
Does timing pass after adding the PHY delay?
Dear Sheng,
No, unfortunately the timing is still bad.
Where do you think the PHY delay is lacking?
I thought I had accounted for the PHY delay in the constraints with (tPeriod/2 - tPHYDELAY). Please have a look at cells O20 and O21 in the Excel sheet.
Greetings
Hi,
I think you can refer to the timing constraints in AN477: Designing RGMII Interface with FPGA and HardCopy Devices (page 12): https://www.intel.com/content/www/us/en/content-details/654563/an-477-designing-rgmii-interface-with-fpga-and-hardcopy-devices.html
The internal PHY delay is accounted for in tco for set_input_delay.
For the equation in this link https://www.intel.com/content/www/us/en/support/programmable/support-resources/design-examples/horizontal/exm-tse-rgmii-phy.html, set $data_delay_min, $data_delay_max, $clk_delay_min, and $clk_delay_max to 0, on the assumption that trace delay, pin capacitance, and rise/fall time differences between data and clock are negligible.
The design .qar can be found here: https://blog.csdn.net/wangyanchao151/article/details/90401027
Let me know if there is any further update or concern.
Thanks,
Regards,
Sheng
Dear Sheng,
I am not sure if we are on the same page anymore.
My interest is in meeting timing for the MII Tx path, so I am not interested in input_delay constraints.
The fact is, if I constrain tx_max_delay and tx_min_delay as follows and use a PHY delay of 2 ns, Timing Analyzer gives me negative slack.
tx_max_delay = tDataTrace(max) + tSU = 0.33ns+1.05ns = 1.38ns
tx_min_delay = tDataTrace(min) + tHold = -0.33ns-0.8ns = -1.13ns
With these constraints, I should meet the +/- 0.5 ns "data to clock output skew" specified in RGMII v2.0.
But this is not the case, so I wonder where this negative slack comes from.
So please, is Td (TX_CLK to TXD/TX_CTL output data delay), -0.85 ns to +0.15 ns in Table 48 of the Cyclone V datasheet, relevant in this case? What does it mean? Other sources tell me it should be irrelevant, since the synthesis tool should match the internal skew of the FPGA as well as it can.
Thank you for your response.
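For reference, two numbers like these are typically handed to the Timing Analyzer through `set_output_delay`, once per capturing clock edge for a DDR interface. The following is a minimal sketch, not the constraint file actually used: the port names `PHY0_TXD[*]` and `PHY0_TX_CTL` are assumptions, and the clock name `PHY0_TX_CLK_VIRT` is borrowed from the constraints posted later in this thread.

```tcl
# Sketch only: port names are assumptions; the clock name is taken from
# the create_generated_clock posted later in this thread.
set tx_max_delay  1.38   ;# tDataTrace(max) + tSU, in ns
set tx_min_delay -1.13   ;# tDataTrace(min) + tHold, in ns
set tx_ports [get_ports {PHY0_TXD[*] PHY0_TX_CTL}]

# Data captured on the rising edge of the forwarded clock
set_output_delay -clock PHY0_TX_CLK_VIRT -max $tx_max_delay $tx_ports
set_output_delay -clock PHY0_TX_CLK_VIRT -min $tx_min_delay $tx_ports

# Data captured on the falling edge (DDR); -add_delay keeps the
# rising-edge constraints instead of overwriting them
set_output_delay -clock PHY0_TX_CLK_VIRT -clock_fall -max $tx_max_delay $tx_ports -add_delay
set_output_delay -clock PHY0_TX_CLK_VIRT -clock_fall -min $tx_min_delay $tx_ports -add_delay
```

Depending on how launch and latch edges are meant to pair up, a DDR interface usually also needs timing exceptions for the edge pairings that cannot occur; those are left out of this sketch.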
It is important to note that the MAC is not on an SoC; it is not an HPS EMAC.
As I understand it now, Td (TX_CLK to TXD/TX_CTL output data delay), -0.85 ns to +0.15 ns in Table 48 of the Cyclone V datasheet, is only relevant for SoC HPS EMACs.
So my last question: is a skew of 1.5 ns usual for a synthesized MAC with DDIO buffers on both data and clock?
Hi,
May I confirm that you're using an external PHY device with the delay option enabled?
Based on the AN477 page 12 design example, could you try the 3 combinations below?
Assume trace delay, pin capacitance, and rise/fall time differences between the data and clock are negligible.
Combination 1:
tx_max_delay = tDataTrace(max) + tSU = 0ns+1.05ns = 1.05ns
tx_min_delay = tDataTrace(min) + tHold = 0ns-0.8ns = -0.8ns
Combination 2:
tx_max_delay = tDataTrace(max) + (-tSU) = 0ns+(-1.05ns) = -1.05ns
tx_min_delay = tDataTrace(min) + tHold = 0ns-0.8ns = -0.8ns
Combination 3:
tx_max_delay = tDataTrace(max) + (-tSU) = 0.33ns+(-1.05ns) = -0.72ns
tx_min_delay = tDataTrace(min) + tHold = -0.33ns-0.8ns = -1.13ns
I think Td (TX_CLK to TXD/TX_CTL output data delay), -0.85 ns to +0.15 ns, is not related to RGMII, as the design example doesn't include it either.
Hi Sheng
Yes, I still use a PHY with delay option of 2ns enabled.
Combination 1 seems the correct one to me. This is what I was doing originally.
But still, with exactly this combination, TA fails with -0.065 ns max slack over -40 °C to +100 °C, with both the fast and slow models analyzed.
Combination 2 seems illegitimate, since tx_min_delay has to be smaller than tx_max_delay.
Combination 3 would theoretically be legal, since tx_min_delay is smaller than tx_max_delay, but because the formula is the same as for Combination 2, it is wrong as well.
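The ordering argument can be checked mechanically. A minimal Tcl sketch, using only the three value pairs proposed above; the variable names are hypothetical, and the check says nothing about whether a formula is physically correct, only whether min is below max:

```tcl
# Sketch only: checks the ordering of the three proposed value pairs.
# Each entry is {tx_max_delay tx_min_delay} in ns, copied from the post above.
set combos {
    1 { 1.05 -0.80}
    2 {-1.05 -0.80}
    3 {-0.72 -1.13}
}
foreach {id pair} $combos {
    lassign $pair maxDelay minDelay
    # max - min is the window the constraints reserve outside the FPGA
    set window [expr {$maxDelay - $minDelay}]
    set verdict [expr {$window > 0 ? "min < max" : "min >= max, inconsistent"}]
    puts [format "Combination %d: window = %.2f ns (%s)" $id $window $verdict]
}
```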
I will try to find other resources to get further.
Thank you Sheng.
Hi,
Even without the trace delay, setup still fails. Could you try with the PHY delay option disabled? Does timing pass then?
It seems there is something wrong with the parameters used.
Hi Sheng,
Correct, even without the trace delay (Combination 1), setup fails.
"Could you try with PHY with delay option disabled, does the timing pass?"
But I need the PHY delay to shift the data from edge-aligned to center-aligned. Without it, timing will fail massively.
...and it did fail massively.
Again, what jitter is to be expected on the Cyclone V for such a combination, with both data and clock coming from DDIO buffers?
Could you please share what jitter to expect?
Does the FPGA perhaps struggle to delay the data signals internally, positively or negatively, relative to the clock signal?
Could I help the synthesis tool by shifting the external PHY delay away from symmetric (2 ns) towards an asymmetrical delay, for example +/- 1.5 ns? But this is basically what I tried to achieve before, and it failed.
So here is what I believe I have found out so far:
Asymmetric delays for the Rx/Tx paths seem to be beneficial. I can only guess that this is because the FPGA cannot synthesize a negative delay on the data trace. I believe this is because in my design the Rx clock is looped back as the Tx clock and cannot be delayed freely.
create_clock -period 8.000 -name PHY0_RX_CLK -waveform {1.750 6.250} [get_ports PHY0_RX_CLK]
create_clock -period 8.000 -name PHY0_RX_CLK_VIRT
create_generated_clock -name PHY0_TX_CLK_VIRT -phase 100.0 -source [get_ports PHY0_RX_CLK] [get_ports PHY0_TX_CLK]
In this example both Tx and Rx have an external PHY (clock) delay of 2.25 ns, so the data at Rx will in principle arrive too soon. The FPGA can now add delay to the Rx data trace. The PHY delays the clock by 2.25 ns on the Tx path, so that in general the data occurs too soon. The FPGA can now add delay to the Tx data trace.
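For orientation, a clock delay maps onto the `-phase` argument as delay / period × 360°. A minimal sketch with a hypothetical helper name; the thread does not say how the value 100.0 above was chosen:

```tcl
# Sketch only: converts a clock delay into a phase shift in degrees.
# The proc name is hypothetical.
proc delay_to_phase {delay_ns period_ns} {
    expr {360.0 * $delay_ns / $period_ns}
}
puts [delay_to_phase 2.00 8.0]   ;# 90.0
puts [delay_to_phase 2.25 8.0]   ;# 101.25
```

By this conversion a 2.25 ns delay at an 8 ns period corresponds to 101.25°, so the 100.0 used above is close to, but not exactly, that value.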
With the following constraints, a mildly OK result can be achieved.
# **************************************************************
# External components
# **************************************************************
# TVX0106
set tSKQ 0.1
# DP83867
set TsetupT 0.55
set TholdT -0.55
set TsetupR 1.05
set TholdR -0.8
# Calculate min / max skew
set RxMaxDelay [expr {$TsetupT + $tSKQ}]
set RxMinDelay [expr {$TholdT - $tSKQ}]
set TxMaxDelay [expr {$TsetupR + $tSKQ}]
set TxMinDelay [expr {$TholdR - $tSKQ}]
TA Slack: | [ns]
Input to Register Setup | 0.244
Input to Register Hold | 0.165
Input to Output Setup | -0.045
Input to Output Hold | -0.084
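The post does not show how the four computed values are applied. One plausible way, sketched here for the Rx side under assumptions, is to pass them to `set_input_delay` for both clock edges. The port names `PHY0_RXD[*]` and `PHY0_RX_CTL` are hypothetical; the clock name is the virtual clock defined above.

```tcl
# Sketch only: assumes RxMaxDelay and RxMinDelay are set as above.
# Port names are hypothetical.
set rx_ports [get_ports {PHY0_RXD[*] PHY0_RX_CTL}]

# Data launched by the PHY on the rising edge of the virtual clock
set_input_delay -clock PHY0_RX_CLK_VIRT -max $RxMaxDelay $rx_ports
set_input_delay -clock PHY0_RX_CLK_VIRT -min $RxMinDelay $rx_ports

# Data launched on the falling edge (DDR); -add_delay keeps the
# rising-edge constraints instead of overwriting them
set_input_delay -clock PHY0_RX_CLK_VIRT -clock_fall -max $RxMaxDelay $rx_ports -add_delay
set_input_delay -clock PHY0_RX_CLK_VIRT -clock_fall -min $RxMinDelay $rx_ports -add_delay
```

The Tx side would presumably follow the same pattern with `set_output_delay`, `TxMaxDelay`, and `TxMinDelay`.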
If a symmetric delay of 2 ns is used for both the Tx and Rx paths, the resulting slack is much worse:
create_clock -period 8.000 -name PHY0_RX_CLK -waveform {2.000 6.000} [get_ports PHY0_RX_CLK]
create_clock -period 8.000 -name PHY0_RX_CLK_VIRT
create_generated_clock -name PHY0_TX_CLK_VIRT -phase 90.0 -source [get_ports PHY0_RX_CLK] [get_ports PHY0_TX_CLK]
TA Slack: | [ns]
Input to Register Setup | 521
Input to Register Hold | 0
Input to Output Setup | -0.15
Input to Output Hold | -0.175
To improve timing further, I am now trying to manipulate the D5 delay, which is remarkably poorly documented.
Could somebody please verify my findings?
What else can be done on that?
Hi,
It seems there is no problem with the timing equation. Most probably the problem is that some parameter is wrong or missing. I think you may need to open an IPS thread https://www.intel.com/content/www/us/en/support/articles/000057045/ethernet-products.html to get some insight from an RGMII expert.