Transceiver Rate Match FIFO - Why is it needed?

Altera_Forum · ‎10-16-2012

Hi,

I've noticed protocols like PCIe include a feature where special "skip" codes are inserted into the data stream so that the receiver can drop or repeat these codes to match the received data rate. This is used to compensate for ppm differences in the transmitter and receiver reference clocks. The Rate Match FIFO in the Altera transceivers is used for this purpose.

But why is this scheme needed at all? The receiver's PLL locks to the incoming data rate so why does it matter if there are ppm differences in the reference clocks?

Perhaps it is desired to separate the clock for the downstream processing logic from the deserializer logic. In this case, why not run the downstream logic at a fixed clock rate that's guaranteed to be faster than the input data rate?

Thanks

Altera_Forum · ‎10-17-2012

A long time ago I designed the front end of a receiver. This is the most difficult part of any communication system. The front end requires two separate functions: one to recover the symbol clock and one to track the centre frequency of the residual carrier. These two functions are related.

The clock recovery is based on a timing error detector (TED) algorithm specific to the type of symbols. The filtered TED output is then used in one of two ways. Either it directly changes the phase of the NCO at the ADC and hence tracks the transmitter clock; this method is easier and does not require any rate matching. Or the TED output is used in a fully digital way, where the ADC sampling is left free-running at a rate close to the nominal Tx clock. A fractional interpolator then delays or advances the signal in the opposite direction to the error, using a fixed system clock close to the nominal Tx rate. In this fully digital design the error can go either way. The error is called "Mu" and ranges from 0 to +/-1 samples.

If Mu is heading towards +1 then the interpolator heads towards -1 and vice versa. If the feedback does not catch up and a correction beyond +1 or -1 is needed, then one sample of your system clock has to be dropped or a new sample created, so that for example a Mu of 1.1 wraps to 0.1 and a Mu of -1.1 wraps to -0.1.
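
As a minimal sketch of the wrap described above (Python, not taken from any real receiver design): the function name `wrap_mu` and the sign convention for the returned sample step are assumptions made for illustration only.

```python
from typing import Tuple


def wrap_mu(mu: float) -> Tuple[float, int]:
    """Wrap a fractional timing error back into the range -1..+1 samples.

    Returns (wrapped_mu, sample_step). sample_step is +1 or -1 when a whole
    sample had to be absorbed, and 0 otherwise. Which sign means "drop a
    sample" and which means "create a sample" depends on the loop
    convention; that mapping is an assumption left to the caller.
    """
    if mu > 1.0:
        return mu - 1.0, +1   # e.g. 1.1 wraps to roughly 0.1
    if mu < -1.0:
        return mu + 1.0, -1   # e.g. -1.1 wraps to roughly -0.1
    return mu, 0              # still inside the interpolator's range


wrapped, step = wrap_mu(1.1)
print(round(wrapped, 3), step)
```

The point of the sketch is only that the fractional part stays with the interpolator while the whole-sample part becomes a drop or insert event.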

So I believe you are referring to the case of fully digital clock recovery, i.e. you have no control over the received samples (no control of the ADC sampling point) but you recover the clock rate from the given stream of symbols.

Making your rx clock slightly faster than the tx clock will help fix the direction of the error, but the drift of oscillators is not always predictable.

Altera_Forum · ‎10-17-2012

--- Quote Start ---

The receiver's PLL locks to the incoming data rate so why does it matter if there are ppm differences in the reference clocks?

--- Quote End ---

PLL stands for Phase Locked Loop and not Frequency (i.e. data rate) Locked Loop.

It means the RX's PLL is able to counteract an improper phase shift between the incoming bit stream and the RX recovered local clock that could otherwise lead to sampling the data while it is changing (a setup/hold time violation, in other words).

Regarding the "ppm differences", this expression refers to small differences between the frequency of the data transmitter and the frequency of the data receiver (TX and RX data rates are different, if you prefer). Now, imagine you are looking in the time domain at two bit streams that have a slight difference in their data rates. You will see that they are slowly sliding relative to one another. Moreover, the more their data rates differ, the faster they slide. This is simply the visual translation of the fact that a frequency difference between two signals means their phase difference increases continuously with time.

Finally, if your incoming bit stream is continuously sliding with respect to the RX sampling clock, you will periodically skip one bit or more depending on the rate difference. Therefore, you need a time buffer (a FIFO) to avoid losing data bits, and the more the data rates differ, the deeper your FIFO needs to be, just as the deeper your bathtub is, the later it will overflow if you forget to close the water tap ;)
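
To put rough numbers on the sliding described above, here is a back-of-the-envelope sketch in Python. The function names and the example figures (a 250 Msymbol/s link and a 600 ppm total offset between an incoming stream and an independent local clock) are illustrative assumptions, not values taken from this thread.

```python
def symbols_per_slip(ppm_offset: float) -> float:
    """Symbols that pass before two clocks drift apart by one whole symbol."""
    return 1e6 / ppm_offset


def slips_per_second(symbol_rate_hz: float, ppm_offset: float) -> float:
    """How many whole-symbol slips accumulate each second."""
    return symbol_rate_hz * ppm_offset / 1e6


# Assumed example: 250e6 symbols/s, clocks differing by 600 ppm in total.
print(symbols_per_slip(600))          # about 1667 symbols between slips
print(slips_per_second(250e6, 600))   # 150000 slips per second
```

With these assumed numbers, one surplus or missing symbol appears roughly every 1667 symbols. A FIFO alone only postpones the overflow or underflow; removable "skip" symbols are what let it absorb the difference indefinitely.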

Altera_Forum · ‎10-17-2012

Yes the P in PLL stands for 'phase' but when a PLL locks to the phase of the input, this means it is also frequency locked (see the second paragraph of the Wikipedia PLL article, unfortunately I can't post the link).

Since the Rx-PLL will be frequency locked to the incoming data stream I still don't understand how any slippage/drifting can occur and why the rate-matching FIFO is needed.

Altera_Forum · ‎10-17-2012

I think you'll find that PLLs do adjust their frequency to match that of the incoming signal.

The 'skip' codes probably allow you to run most of your logic at a fixed frequency, and only the line receiver at the frequency of the incoming signal.

The 'skip' code would then be added/deleted before the rx fifo.

Plausibly they could also be added when the tx side would otherwise underrun.

PCIe may well require that you transmit with your own clock, not one recovered from the receive signal - one end has to do that anyway, and symmetry is useful.

You also (probably) want to be able to transmit when the rx signal is absent.

Altera_Forum · ‎10-28-2014

--- Quote Start ---

Yes the P in PLL stands for 'phase' but when a PLL locks to the phase of the input, this means it is also frequency locked (see the second paragraph of the Wikipedia PLL article, unfortunately I can't post the link).

Since the Rx-PLL will be frequency locked to the incoming data stream I still don't understand how any slippage/drifting can occur and why the rate-matching FIFO is needed.

--- Quote End ---

It is needed to decouple the write side of the FIFO, which handles the incoming data stream in the recovered clock domain as you correctly point out, from the read side of the FIFO, which is typically in a different clock domain that differs in both phase and frequency. The reason for rate matching is that once you start processing the data, typically in blocks, you do not want the processing to be interrupted by the read FIFO going empty; the pipeline design is easier when you know there is no "hole" in the incoming data. One could just stay in the recovered clock domain, but then if the data goes away, the clock wanders or drifts and you may even get glitches leading to unpredictable behavior. So the read side clock is kept independent of the recovered clock. And of course, since the read side clock is independent, you can imagine how slippage or drifting can occur. The read clock frequency can be made faster so as to prevent an overflow in the FIFO, but the problem is that this leads to the read FIFO going empty, and the pipeline then needs to be designed to tolerate that. No harm, just more work. I hope this helps.
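
The decoupling described above can be sketched as a toy simulation in Python. This is not how the Altera hard IP is implemented; the names (`rate_match`, `with_skips`, `SKIP`), the thresholds, and the exaggerated 5% clock offset are all assumptions chosen so the effect shows up in a short run. The write side runs at the recovered clock rate, the read side at an independent local rate, and skip symbols are deleted when the FIFO fills or repeated when it drains.

```python
from collections import deque

SKIP = "SKP"  # stand-in for a protocol skip symbol (assumed name)


def with_skips(payload, every=10):
    """Insert one SKIP before every `every` payload symbols (transmit side)."""
    out = []
    for i, sym in enumerate(payload):
        if i % every == 0:
            out.append(SKIP)
        out.append(sym)
    return out


def rate_match(stream, write_rate, read_rate, low=2, high=6, prefill=4):
    """Toy rate-match FIFO; returns one symbol per read-clock tick.

    write_rate / read_rate are integer clock rates in arbitrary units.
    """
    fifo, out = deque(), []
    src = iter(stream)
    phase, exhausted, reading = 0, False, False
    while True:
        # Write side: recovered-clock domain, advanced once per read tick.
        phase += write_rate
        while phase >= read_rate and not exhausted:
            phase -= read_rate
            sym = next(src, None)
            if sym is None:
                exhausted = True
            elif sym == SKIP and len(fifo) >= high:
                pass                      # FIFO filling up: delete this skip
            else:
                fifo.append(sym)
        # Read side: local clock domain, starts once the FIFO is pre-filled.
        reading = reading or exhausted or len(fifo) >= prefill
        if not reading:
            continue
        if not fifo:
            if exhausted:
                return out
            raise RuntimeError("FIFO underflow: a hole in the read-side data")
        if fifo[0] == SKIP and len(fifo) <= low and not exhausted:
            out.append(SKIP)              # FIFO draining: repeat the skip
        else:
            out.append(fifo.popleft())


payload = list(range(200))
src = with_skips(payload)
fast = rate_match(src, write_rate=105, read_rate=100)  # sender 5% fast
slow = rate_match(src, write_rate=95, read_rate=100)   # sender 5% slow
print(src.count(SKIP), fast.count(SKIP), slow.count(SKIP))
```

In both runs the payload comes out intact and in order; only the number of skip symbols changes, which is what lets the read side see a continuous stream at its own clock rate.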

Altera_Forum · ‎05-03-2018

I am using the Custom PHY IP (Cyclone V SX) and had opted for the "automatic state machine", rate match FIFO, 8b/10b, and byte ordering. I set up the design in the Quartus II environment (not in Qsys). The design always works when performing an external and internal loop-back test (even through a foot of backplane :)), and if I disconnect, it recovers without any issues (I love that automatic state machine).

The problem arises when I communicate with another module (which has the exact same PHY design). The rate match FIFO detects the small variations in the clock between the TX of one module and the RX of the other, and starts to perform its function of deleting and inserting symbols. Since I opted for byte ordering, the byte order is lost due to the rate match FIFO's inserting and deleting, and my error counter then goes up to a 95% receive error rate. There is a note in the XCVR manual that byte ordering is compromised by the rate match FIFO.
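
A toy Python illustration of why a single deleted symbol can upset byte ordering, assuming a 2-byte-wide interface where bytes are paired after the rate match FIFO. The byte labels and both helper functions are hypothetical; this is not a model of the actual Cyclone V datapath.

```python
SKIP = "SKP"  # stand-in for a skip symbol (assumed name)


def pair_bytes(stream):
    """Group a byte stream into 2-byte words, as a 2:1 deserializer would."""
    return [tuple(stream[i:i + 2]) for i in range(0, len(stream) - 1, 2)]


def delete_one_skip(stream):
    """Mimic the rate matcher removing a single skip symbol."""
    out = list(stream)
    out.remove(SKIP)
    return out


# Three 2-byte words (A, B, C) with a pair of skip symbols after the first.
stream = ["A0", "A1", SKIP, SKIP, "B0", "B1", "C0", "C1"]

print(pair_bytes(stream))                   # B0 and B1 share a word
print(pair_bytes(delete_one_skip(stream)))  # B0 and B1 now straddle two words
```

Once one symbol is removed, every following word is built from the wrong pair of bytes until the ordering is re-established, which would be consistent with the high error rate reported here.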

So, I removed the rate match FIFO function and my error counter is zero... I have yet to perform more testing, but so far so good. I do not have any TX/RX errors (I am crossing my fingers as I type). So in my case, so far, I have not needed the rate match FIFO. It is pretty cool that the PHY can detect such slight variations among the different modules... neat! Gotta love Altera for that... sorry, INTEL :-P