Programmable Devices
CPLDs, FPGAs, SoC FPGAs, Configuration, and Transceivers

Arria 10 SGMII CDR does not lock to data

abuerkle
Beginner

Hi

 

I adapted the SGMII reference design from Rocketboards to our own hardware, which is based on an Arria 10 SX SoC (https://www.rocketboards.org/foswiki/Documentation/A10SGMIIRDUserManualLTS). I use a 100 MHz clock connected to the CLKUSR pin for transceiver calibration and a 125 MHz clock as the transceiver reference clock.

 

Transmitting data seems to work fine, but I see a lot of packet loss on the receive side. I checked a few signals with chipscope and noticed that the Transceiver PHY Reset Controller periodically asserts rx_digitalreset. The likely cause is the toggling rx_is_lockedtodata signal. I measured the signal with chipscope and with an oscilloscope. Its toggle frequency depends on whether the Ethernet PHY is powered down (1 kHz) or operating (2 kHz), but is otherwise constant.
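
As a cross-check of the oscilloscope reading, the toggle frequency can also be estimated from an exported capture of the signal. The following is only a minimal sketch: it assumes the capture is available as a list of 0/1 samples taken at a known, constant sample rate, and the function name and the synthetic test signal are made up for illustration.

```python
def toggle_frequency_hz(samples, sample_rate_hz):
    """Estimate the frequency of a periodic 0/1 signal from its rising edges.

    samples: sequence of 0/1 values (assumed export of the captured signal)
    sample_rate_hz: capture sample rate in Hz (assumed known and constant)
    Returns None if fewer than two rising edges are present.
    """
    rising = [
        i for i in range(1, len(samples))
        if samples[i - 1] == 0 and samples[i] == 1
    ]
    if len(rising) < 2:
        return None
    full_periods = len(rising) - 1
    span_seconds = (rising[-1] - rising[0]) / sample_rate_hz
    return full_periods / span_seconds


# Synthetic 2 kHz square wave sampled at 1 MHz (500 samples per period).
synthetic = [(i // 250) % 2 for i in range(5000)]
print(toggle_frequency_hz(synthetic, 1_000_000))
```

Measuring between the first and last rising edge avoids counting partial periods at the ends of the capture.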

chipscope.jpg

Since receiving data works, but with about 50% packet loss, I assume the CDR is not able to successfully lock to the incoming data. Is my assumption correct, or is something else causing rx_is_lockedtodata to toggle at a constant rate? Why does the toggle rate of rx_is_lockedtodata change depending on whether an SGMII RX signal is present? How can I verify that transceiver calibration was successful? Are there any ways to debug the CDR?

 

Thank you in advance for the help,

Andreas

9 Replies
EBERLAZARE_I_Intel

Hi,


I need a few days to check on the issue that you are seeing.


Also, are you able to reproduce the same issue on the Arria 10 SoC dev kit?


abuerkle
Beginner

Thank you for your response. Unfortunately, I don't have access to an Arria 10 SoC dev kit.

EBERLAZARE_I_Intel

Hi,


We are checking with the internal team that works on transceivers. Our expert will get back to you soon.


Thanks for your patience!


skbeh
Employee

Hi Andreas


When rx_is_lockedtodata is asserted, it indicates that the CDR PLL is locked to the incoming data on rx_serial_data.

As I understand from the problem statement, you observe rx_is_lockedtodata and rx_digitalreset toggling periodically.


1) Does this issue also occur on the Arria 10 SX SoC dev kit (in case you have used it), or does it only occur on your own hardware?

To narrow down the issue, I suggest enabling PHY serial loopback to test the PCS and embedded PMA functions. This checks whether the PCS and embedded PMA work correctly across multiple resets. To enable serial PHY loopback, set the loopback bit in the PCS control register to 1. //Refer Table 13. TX PMA Optional Ports

//https://www.intel.com/content/www/us/en/docs/programmable/683617/21-1/transceiver-phy-overview.html

See 4.2.9. PHY Loopback of TSE user guide


2) To verify that transceiver calibration was successful, monitor tx_cal_busy and rx_cal_busy.

Calibration is complete when *_cal_busy is deasserted.

rx_cal_busy: When asserted, indicates RX channel is being calibrated.

tx_cal_busy: When asserted, indicates TX channel is being calibrated. 


3) Please also check whether your design has any timing violations.


4) Please try enabling 'Use separate TX/RX reset' in the Transceiver PHY Reset Controller IP. If it is not set, all channels are reset when lock is lost.

See Table 250. General Options of the Arria 10 Transceiver PHY User Guide for the definition.

https://www.intel.com/content/www/us/en/docs/programmable/683617/21-1/transceiver-phy-overview.html


abuerkle
Beginner

Hi

 

1) Unfortunately, I don't have access to an Arria 10 SX SoC dev kit. When I enable loopback in the PCS, rx_is_lockedtodata remains asserted and the transceiver is no longer reset.


2) rx_cal_busy and tx_cal_busy are both '0' and never change (checked with signaltap after programming).

 

3) No timing errors are reported by the Quartus timing analyzer. No unconstrained clock is reported. I was using the same constraints as in the Rocketboards design.

 

4) I observed no difference with separate reset enabled. I have a separate Transceiver PHY reset controller instance for each of the two SGMII channels. I guess this is only relevant if one instance is used for more than one channel.

 

Thank you for your help and best regards,

Andreas

skbeh
Employee

Hi Andreas

1) Use Signal Tap to capture the PHY rx_is_lockedtoref signal and check whether the RX CDR can still lock to the CDR reference clock while rx_is_lockedtodata is low. If rx_is_lockedtoref is still asserted at that point, then the data is probably getting lost somewhere.


2) Check that the actual frequencies on the board and at the PHY match the IP parameters that were set when the IP was generated.


3) I found the note below regarding the 'Manual' option in the Reset Controller on page 255 of the A10 Transceiver PHY User Guide.

Try setting the RX/TX digital reset mode to Manual.

https://www.intel.com/content/www/us/en/docs/programmable/683617/21-1/how-to-implement-pci-express-pipe-in.html

"(2).a. If you are using the Transceiver PHY Reset Controller, you must configure the TX digital reset mode and RX digital reset mode to Manual to avoid resetting the Auto Speed Negotiation (ASN) block which handles the rate switch whenever the channel PCS is reset.

b. When the TX digitalreset is in Auto mode, the associated tx_digitalreset controller automatically resets whenever the pll_locked signal is deasserted. When in Manual mode, the associated tx_digitalreset controller is not reset when the pll_locked signal is deasserted, allowing the user to choose what to do.

c. When the RX digitalreset is in Auto mode, the associated rx_digitalreset controller automatically resets whenever the rx_is_lockedtodata signal is deasserted. When in Manual mode, the associated rx_digitalreset controller is not reset when the rx_is_lockedtodata signal is deasserted, allowing the user to choose what to do.

d. If the resets are configured to Auto mode for PIPE designs, then the digital reset will get asserted automatically when the lock signal is deasserted."
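
To make the difference between the two modes in item (c) concrete, here is a heavily simplified behavioral sketch in Python. It is not the Reset Controller's actual implementation: it ignores the power-up reset sequence and all timing delays, and the function name and sample values are made up for illustration. It only shows that in Auto mode a toggling rx_is_lockedtodata translates directly into a toggling rx_digitalreset, while in Manual mode it does not.

```python
def rx_digitalreset_trace(lock_samples, mode):
    """Return a simplified rx_digitalreset value for each lock sample.

    lock_samples: sequence of 0/1 values of rx_is_lockedtodata
    mode: "auto" or "manual" (RX digital reset mode)
    Assumption: no reset sequencing or delay timers are modeled.
    """
    if mode not in ("auto", "manual"):
        raise ValueError("mode must be 'auto' or 'manual'")
    trace = []
    for locked in lock_samples:
        if mode == "auto":
            # Auto: reset is asserted whenever lock-to-data is deasserted.
            trace.append(0 if locked else 1)
        else:
            # Manual: loss of lock does not trigger a reset by itself.
            trace.append(0)
    return trace


toggling_lock = [1, 1, 0, 0, 1, 1, 0, 0]  # hypothetical capture
print(rx_digitalreset_trace(toggling_lock, "auto"))
print(rx_digitalreset_trace(toggling_lock, "manual"))
```

In Manual mode the user logic has to decide when, or whether, to reset the RX PCS after a loss of lock.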


abuerkle
Beginner

Hi, thank you for your support.

 

1) rx_is_lockedtoref is always '1' when rx_is_lockedtodata is '0'. When rx_is_lockedtodata is '1', rx_is_lockedtoref is toggling.

 

2) I measured the frequency at the reference clock input, and it matches the 125 MHz required by the PCS.

 

3) Thank you for pointing to that. I guess that could be an issue once the CDR locks to the data.

 

While measuring the reference clock, I noticed that I don't receive any RX data (100% packet loss instead of ~50%) while the oscilloscope probe is attached to the hardware. The rx_is_lockedtoref signal does not change, but the probe seems to affect the received data.

I use an LVDS reference clock connected to a dedicated reference clock input pin. Do I need to configure on-chip termination, or is it enabled automatically?

 

What is required for rx_is_lockedtodata to be asserted? Does it depend on the transitions in the input data?

 

Best regards,

Andreas

skbeh
Employee

I see that you have a question regarding termination for the dedicated GXB reference clock.

The GXB reference clock inputs do provide internal differential termination and should be AC-coupled.

The only I/O standard that should be DC-coupled with external termination is HCSL (the HCSL standard is used for the PCIe interface).

The proper .qsf assignment to define the clock termination is "XCVR_A10_REFCLK_TERM_TRISTATE" and is shown in Section 8.9.1 of the A10 Transceiver PHY User Guide:

https://www.intel.com/content/www/us/en/docs/programmable/683617/

Example syntax:

set_instance_assignment -name XCVR_A10_REFCLK_TERM_TRISTATE TRISTATE_OFF -to <dedicated refclk pin name>


The rx_is_lockedtodata status output port indicates that the RX CDR is currently in lock-to-data mode or is attempting to lock to the incoming data stream.


abuerkle
Beginner

Hi,

 

The problem with the CDR lock was solved by adding load capacitors to the crystal providing the reference clock for the SGMII PHY. Probably the frequency of the reference clock was slightly off. The remaining problem is that I see packets with a wrong CRC on the TX side. RX seems to work fine without any issues. I'm not sure if this is still caused by the reference clock.

The number of erroneous packets decreases when I raise the frequency of the 125 MHz reference clock for the Arria 10 transceiver by a few ppm. But at +40 ppm, I again see the issue with the CDR not locking. According to the documentation, the CDR should lock when the data frequency is within +/- 1000 ppm. Or is that setting different in the TSE MAC?
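
For reference, the ppm figures above can be converted into absolute frequency offsets with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the offset used for the link partner is a hypothetical value, not a measurement from this board. It also shows that what matters for a comparison of two clocks is their relative offset, so shifting the local reference changes the offset to the incoming data as well.

```python
NOMINAL_HZ = 125e6  # nominal transceiver reference clock from this thread


def ppm_to_hz(nominal_hz, ppm):
    """Convert a ppm offset into an absolute frequency offset in Hz."""
    return nominal_hz * ppm / 1e6


def relative_ppm(local_ppm, partner_ppm):
    """First-order offset of the partner clock relative to the local clock.

    Both inputs are offsets from the same nominal frequency, in ppm.
    Subtracting them is an approximation that holds for small offsets.
    """
    return partner_ppm - local_ppm


print(ppm_to_hz(NOMINAL_HZ, 40))    # 5000.0 (Hz)
print(ppm_to_hz(NOMINAL_HZ, 1000))  # 125000.0 (Hz)

# Hypothetical: local reference at +40 ppm, link partner at -30 ppm.
print(relative_ppm(local_ppm=40, partner_ppm=-30))
```

So +40 ppm on a 125 MHz clock is only 5 kHz, far inside a +/- 1000 ppm (125 kHz) window.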

 

Thank you and best regards,

Andreas

 

 
