GXB: transceiver transmission time issue

Altera_Forum · ‎08-06-2013

Hi all:

I have a board with two identical FPGAs. Each FPGA has one transmitter, and both FPGAs are configured with the same configuration file. I use another board, which has only one FPGA, to receive these two transmitter channels. As a test, I transmitted data in turn. I used SignalTap to check the two GXB channels and found that the data arrive at different times. The offset can differ from cycle to cycle: sometimes the channels are aligned, and sometimes one is several clock cycles ahead of or behind the other.

Is it possible that the data propagation times inside different GXB links are different?

I attached two JPG files. The first shows the two receivers differing by 6 clock cycles; it is strange that 4 extra clock cycles have been added. The second shows the two receivers differing by 2 clock cycles (this is expected, because I introduced the two-cycle difference on the transmitter side).

Also, I found that the delayed channel can sometimes arrive earlier than the other channel! Very strange.

Altera_Forum · ‎08-06-2013

This is "normal" for GXB signals.

GXB transmitters and receivers serialize and deserialize data. The problem with these shift-registers is that you cannot really start or stop them synchronously at both the source and destination, since they're in different chips, and the bit-streams are at Gbps data rates. The "solution" to get around this lack of synchronization is to introduce a protocol for the data sent over the link. For example, send a known pattern over the link, and align each of the receivers to output parallel words based on that pattern - pseudo-random binary sequences are one method of doing this. Protocols like Gigabit Ethernet use 8/10B and 64/66B encoding and they use special lane synchronizing characters, so that you can align the 8-bit output data on one lane, and they use channel bonding characters so that you can align the data across multiple lanes.

If you control both ends of the serial link, then the easiest solution is to use 8/10B encoding, and then use the frame synchronization controls provided by Altera's IP cores.
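As a rough illustration of the alignment idea described above, the following Python sketch searches a serial bit stream for a known alignment word and then cuts the stream into 10-bit words starting at that boundary. The function names, the string-of-bits representation, and the choice of the K28.5 (RD-) code group as the alignment word are assumptions made for this example; in a real design this job belongs to the transceiver's word aligner, not to software.

```python
from typing import List, Optional

# Assumed alignment word: the 8B/10B K28.5 code group (negative running
# disparity), written in transmission order. Any known pattern would do.
COMMA = "0011111010"


def find_word_start(bits: str, comma: str = COMMA) -> Optional[int]:
    """Return the bit index at which the first alignment word starts, or None."""
    index = bits.find(comma)
    return None if index < 0 else index


def deserialize(bits: str, start: int, width: int = 10) -> List[str]:
    """Cut the stream into width-bit words, starting at bit index start."""
    return [bits[i:i + width] for i in range(start, len(bits) - width + 1, width)]


# The receiver starts sampling 3 bits "early", so its word boundary is wrong.
sent_words = [COMMA, "1010101010", "0101010101"]
stream = "110" + "".join(sent_words)

start = find_word_start(stream)
aligned_words = deserialize(stream, start)    # framed on the alignment word
misaligned_words = deserialize(stream, 0)     # framed on the arbitrary start
```

This toy does not check that the alignment word cannot also appear by accident at another bit position in the payload; the 8B/10B comma is designed to avoid that problem for valid data.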

Cheers,

Dave

Altera_Forum · ‎08-07-2013

Dear Dave:

Thanks for your quick reply! I have several further questions:

1. Do you mean this issue arises basically because the two lanes are in different chips? Otherwise, how could we implement PCIe x4 in a single chip? Am I right?

2. I am trying to understand your "solution". I use Basic mode, but I enabled 8B/10B in the MegaWizard. Is it possible to solve this issue without changing my current GXB settings? If so, can you point out which signals are the "frame synchronization controls"? I have not used any 8B/10B control signals in my previous designs.

3. Why is the difference so large, with a maximum of about 10 parallel clock cycles (for my system that is about 100 ns, i.e., 10 cycles at 100 MHz)? My logic processes the parallel data in a pipeline, so the processing itself should add no timing difference.

4. Even though these two GXBs are in different chips, I still think the time difference should be constant. But as described in my first post, it behaves like a random value. Very strange. If that is the case, there seems to be no way to align them?!

Best Regards

Jerry

Altera_Forum · ‎08-07-2013

--- Quote Start ---

1. Do you mean this issue arises basically because the two lanes are in different chips? Otherwise, how could we implement PCIe x4 in a single chip? Am I right?

--- Quote End ---

The same issue occurs if you use one chip. For example, let's say you have an HSMC connector adapter which loops back the Gbps transmitters and receivers. If you took a common signal, e.g., a PRBS7 sequence, and sent it to all 8 transmitters, then looked at the output of all 8 receivers, they will all be different, with an arbitrary offset at the single-bit level. You can configure the ALTGX block in the Stratix IV series devices to synchronize to a start pattern.
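To make the loopback experiment above concrete, here is a small Python sketch: it generates one period of a PRBS7 sequence (polynomial x^7 + x^6 + 1), gives several "lanes" the same sequence with different bit offsets, and then recovers each lane's offset by comparing against the reference. The lane offsets, the 32-bit observation window, and the function names are made up for illustration; this is a software model, not the ALTGX behaviour.

```python
from typing import List, Optional


def prbs7(n_bits: int, seed: int = 0x7F) -> List[int]:
    """Generate n_bits of a PRBS7 sequence (x^7 + x^6 + 1) from a nonzero seed."""
    state = seed & 0x7F
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def lane_offset(reference: List[int], received: List[int]) -> Optional[int]:
    """Return the cyclic shift of reference that reproduces received, or None."""
    period = len(reference)
    for shift in range(period):
        if all(bit == reference[(i + shift) % period]
               for i, bit in enumerate(received)):
            return shift
    return None


PERIOD = 127                       # PRBS7 repeats every 2**7 - 1 bits
reference = prbs7(PERIOD)

# Hypothetical per-lane bit offsets, standing in for the arbitrary offsets
# that each receiver would show after power-on.
true_offsets = [0, 5, 37, 90]
lanes = [[reference[(i + k) % PERIOD] for i in range(32)] for k in true_offsets]

measured_offsets = [lane_offset(reference, lane) for lane in lanes]
```

Because any 7 consecutive bits occur only once per PRBS7 period, a short window is enough to identify each lane's offset unambiguously.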

If you perform a test between two boards, then you can add another complication: different transmitter reference clocks.

--- Quote Start ---

Otherwise, how can we implement pcie-x4 in single chip, am i right?

--- Quote End ---

PCIe x4 "works" because the transceiver interface defines 10-bit codes for "channel bonding". If you are creating a custom Gbps link then you can use the same technique.

--- Quote Start ---

2. I am trying to understand your "solution". I use Basic mode, but I enabled 8B/10B in the MegaWizard. Is it possible to solve this issue without changing my current GXB settings? If so, can you point out which signals are the "frame synchronization controls"? I have not used any 8B/10B control signals in my previous designs.

--- Quote End ---

The appropriate solution depends on what you are trying to do. Until you explain that, it's a bit difficult to provide suggestions.

--- Quote Start ---

3. Why is the difference so large, with a maximum of about 10 parallel clock cycles (for my system that is about 100 ns, i.e., 10 cycles at 100 MHz)? My logic processes the parallel data in a pipeline, so the processing itself should add no timing difference.

--- Quote End ---

You're looking at data in the parallel clock domain of the FPGA. There are many layers of FIFOs and clock-domain crossing logic, with many opportunities to lose clocks when looking at multiple transceiver channels.

Cheers,

Dave

Altera_Forum · ‎08-07-2013

Hi Dave:

In my project, one board uses two FPGAs (Arria GX) to process "pulse" events, which have random timestamps. Another board uses a single FPGA (Stratix IV GX) to receive the processed data and sort the events by timestamp.

For a given GXB channel, if each event is transmitted with a random offset, it is difficult for the Stratix IV to sort the events. If the offset were constant after power-on, I could add padding to compensate for the transmission offset in the different lanes. But, as you said, different lanes have arbitrary offsets. Also, as the JPGs in my first post show, the offset appears to be random. I really don't know how to align them.

Please give me some guidelines. Thanks in advance!

Jerry

Altera_Forum · ‎08-07-2013

Hi Jerry,

--- Quote Start ---

In my project, one board uses two FPGAs (Arria GX) to process "pulse" events, which have random timestamps. Another board uses a single FPGA (Stratix IV GX) to receive the processed data and sort the events by timestamp.

--- Quote End ---

This explanation is not clear, so let me ask it more directly:

Are you trying to use multiple transceiver receiver channels to capture pulse events?

--- Quote Start ---

For a given GXB channel, if each event is transmitted with a random offset, it is difficult for the Stratix IV to sort the events.

--- Quote End ---

Why are you using a second FPGA, when the Arria could do the work? If you must send data between the two FPGAs, then the Arria should create data packets, and use a "standard" transport protocol between it and the Stratix device.
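One way to picture the packet-based approach: if every event packet carries its own timestamp from a shared time base, the receiver can order events by timestamp rather than by arrival time, so lane-to-lane transport skew no longer affects the sort. The Python sketch below is only illustrative; the `Event` fields, the lane contents, and the assumption that each lane delivers its own events in timestamp order are all hypothetical and not taken from the actual design.

```python
import heapq
from typing import List, NamedTuple


class Event(NamedTuple):
    """Hypothetical event packet; timestamp comes first so tuples sort by it."""
    timestamp: int   # TDC timestamp, in ticks of an assumed common clock
    channel: int     # source channel number
    adc_value: int   # processed ADC result


def merge_by_timestamp(*lanes: List[Event]) -> List[Event]:
    """Merge per-lane event lists (each already time-ordered) into one ordered list."""
    return list(heapq.merge(*lanes))


# Two lanes whose packets may reach the receiver with different delays.
lane_a = [Event(10, 0, 512), Event(30, 0, 498)]
lane_b = [Event(12, 1, 730), Event(20, 1, 701)]

ordered = merge_by_timestamp(lane_a, lane_b)
ordered_timestamps = [event.timestamp for event in ordered]
```

In hardware, the same idea would need a bounded reorder buffer sized for the worst-case skew between lanes, which this sketch does not model.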

--- Quote Start ---

If the offset were constant after power-on, I could add padding to compensate for the transmission offset in the different lanes. But, as you said, different lanes have arbitrary offsets. Also, as the JPGs in my first post show, the offset appears to be random. I really don't know how to align them.

--- Quote End ---

You need a "master reference" signal.

For example, let's say you had 4 receiver channels that you needed to align. One way to align them is to connect them to a common source, e.g., a single transmitter channel that is split 4 ways. Since the transmitter channel is sending a common signal to all four receivers, any misalignment in the receivers must be due to their receiver deserialization logic. So you "swallow" bits until they align.

Your measurement system input needs to incorporate a multiplexer so that you can look at the transmitter channel after power-on to align the receivers, and then switch to the pulsed signals that you are trying to measure and timestamp.
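The "swallow until aligned" step can be sketched in a few lines of Python. The model below assumes four receiver lanes that all see the same reference stream from a common source, each with a different amount of leading junk; it finds a known marker in every lane and discards symbols from the later lanes until all markers line up. The symbol names, the marker, and the per-lane junk counts are invented for the example.

```python
from typing import List, Sequence


def marker_positions(lanes: Sequence[Sequence[str]], marker: str) -> List[int]:
    """Index of the first marker in each lane (raises ValueError if absent)."""
    return [list(lane).index(marker) for lane in lanes]


def swallow_counts(positions: Sequence[int]) -> List[int]:
    """Symbols each lane must discard so every marker lands at the same index."""
    earliest = min(positions)
    return [position - earliest for position in positions]


def align_lanes(lanes: Sequence[Sequence[str]], marker: str) -> List[List[str]]:
    """Discard leading symbols per lane so that all lanes are marker-aligned."""
    counts = swallow_counts(marker_positions(lanes, marker))
    return [list(lane[count:]) for lane, count in zip(lanes, counts)]


# Common reference pattern seen by four receivers with arbitrary start offsets.
pattern = ["SYNC", "d0", "d1", "d2"]
lanes = [["?"] * junk + pattern for junk in (2, 0, 3, 1)]

counts = swallow_counts(marker_positions(lanes, "SYNC"))
aligned = align_lanes(lanes, "SYNC")
```

Once the counts have been measured against the reference, the same per-lane correction would be kept in place when the inputs are switched over to the real signals.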

Cheers,

Dave

Altera_Forum · ‎08-07-2013

Hi Dave:

1. My project doesn't use the GXB to process pulse events. The GXB is used to transmit and receive data. The pulse acts like a "trigger" signal, which enables my logic to process the data from the ADC. There is also one TDC channel paired with each ADC channel. The TDC measures the pulse's "arrival time", which yields the timestamp I mentioned. The transmitter in the Arria GX combines the processed ADC data with the TDC measurement and sends them together to the Stratix IV GX. So one event's data should be sent from the Arria GX to the Stratix IV GX for each pulse. Because all data processing stages are synchronized, I think each ADC channel's data should arrive at the Stratix IV GX at the same time if the pulses arrive at the different Arria GX devices at the same time.

2. Because there are 22 receiver channels in total, the other board must use a Stratix IV GX. We have 11 identical Arria GX boards, each carrying two identical Arria GX devices. Each Arria GX implements one transmitter channel, so the 11 boards have 22 transmitter channels in total. This is why we use one Stratix IV chip to receive all 22 channels' data.

3. I have done a simple test. Transmitters in different chips send data at the same time, and the receiver side gets the data with an offset between the two channels. The offset seems to be constant after power-on or reset, although it may change if you reset or power on again. But I don't know whether this proves that the offset always stays constant after power-on. According to our real system tests it is not constant; however, I watched for a long time with a simple project in SignalTap and saw a constant offset.

Altera_Forum · ‎08-07-2013

I attached two STP files containing my test results. One is from the Arria GX (the transmitter), and the other is from the Stratix IV GX (the receiver). Because I have only one transmitter board, and its two Arria GX chips are connected in a single daisy chain, I can monitor only one Arria GX at a time.

In this test, I forced the second Arria GX to send its data 2 clock cycles later than the first. A total of 100,000 events were sent. I analyzed the data after the receiver board saved it to disk and found one pair of events received out of time order. This disorder can be seen in the receiver's STP file: the second Arria GX's data arrive at the Stratix IV GX later than the first Arria GX's, except for the 14th event, which is at the time bar position in the STP file.

This suggests that the offset between the two GXBs is not constant after power-on. It behaves more like "jitter".

The attached JPG files show the first event and the "issue" event.

Altera_Forum · ‎08-07-2013

--- Quote Start ---

My project doesn't use the GXB to process pulse events.

--- Quote End ---

Ok.

--- Quote Start ---

The GXB is used to transmit and receive data.

--- Quote End ---

So what protocol are you using? Since you control the transmit and receive, you get to define what gets used.

--- Quote Start ---

The pulse acts like a "trigger" signal, which enables my logic to process the data from the ADC.

--- Quote End ---

Where does this signal come from, and how do you ensure your ADCs all use a common clock reference? Is the pulse synchronous to the ADC clock reference?

--- Quote Start ---

There is also one TDC channel paired with each ADC channel. The TDC measures the pulse's "arrival time", which yields the timestamp I mentioned.

--- Quote End ---

These timestamps should be system-level synchronous, and synchronous to the master clock.

For example, in my systems we use GPS units with a 1pps output and 10MHz reference. The 10MHz reference is then the synthesizer reference for several hundred 1GHz ADCs.

--- Quote Start ---

The transmitter in the Arria GX combines the processed ADC data with the TDC measurement and sends them together to the Stratix IV GX. So one event's data should be sent from the Arria GX to the Stratix IV GX for each pulse. Because all data processing stages are synchronized, I think each ADC channel's data should arrive at the Stratix IV GX at the same time if the pulses arrive at the different Arria GX devices at the same time.

--- Quote End ---

You have not explained how anything is synchronous or synchronized yet.

--- Quote Start ---

2. Because there are 22 receiver channels in total, the other board must use a Stratix IV GX. We have 11 identical Arria GX boards, each carrying two identical Arria GX devices. Each Arria GX implements one transmitter channel, so the 11 boards have 22 transmitter channels in total. This is why we use one Stratix IV chip to receive all 22 channels' data.

--- Quote End ---

This can be made to work.

--- Quote Start ---

3. I have done a simple test. Transmitters in different chips send data at the same time, and the receiver side gets the data with an offset between the two channels. The offset seems to be constant after power-on or reset, although it may change if you reset or power on again. But I don't know whether this proves that the offset always stays constant after power-on. According to our real system tests it is not constant; however, I watched for a long time with a simple project in SignalTap and saw a constant offset.

--- Quote End ---

Whether or not the receiver channel offsets stay the same depends on how you clock your transceivers. If you use a global clock signal, then both the transmitter and receiver will have the same frequency, and the offsets will not change with time after power-on. This means you can power-up your system, send a test pattern through the network, determine offsets, delete bytes until the transmit-to-receive links all have the same delays, and then "go".

If however you are using independent oscillators on each of your boards, then some oscillators will be slightly faster or slower than others, and eventually your receive FIFOs will over- or under-flow. There are protocols that can deal with this, which they do by inserting or removing 10-bit stuff codes. Your system should not be using these types of protocols.
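A back-of-the-envelope calculation shows how quickly independent oscillators become a problem. The Python sketch below estimates how long a receive FIFO can absorb a frequency mismatch before it over- or under-flows. The 100 MHz word rate matches the parallel clock mentioned earlier in the thread; the 50 ppm mismatch and the 8-word FIFO margin are assumed values chosen only to make the arithmetic concrete.

```python
def seconds_until_fifo_slip(word_rate_hz: float,
                            ppm_difference: float,
                            margin_words: float) -> float:
    """Time until a FIFO with margin_words of slack over- or under-flows.

    The writer and reader differ in frequency by ppm_difference parts per
    million, so the fill level drifts by word_rate_hz * ppm * 1e-6 words/s.
    """
    drift_words_per_second = word_rate_hz * abs(ppm_difference) * 1e-6
    if drift_words_per_second == 0:
        return float("inf")   # same frequency: the fill level never drifts
    return margin_words / drift_words_per_second


# Assumed numbers: 100 MHz parallel clock, 50 ppm mismatch, 8 words of margin.
# Drift is 5000 words per second, so the margin lasts about 1.6 ms.
example_seconds = seconds_until_fifo_slip(100e6, 50, 8)

# With a common clock source the mismatch is zero and the offset stays fixed.
common_clock_seconds = seconds_until_fifo_slip(100e6, 0, 8)
```

This is why a shared reference clock allows a one-time alignment after power-on, while independent oscillators force a protocol that inserts or removes stuff codes.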

Cheers,

Dave

Altera_Forum · ‎08-08-2013

Hi Dave:

1. I selected Basic mode in the MegaWizard when I implemented the GXB megacore. We don't use any standard protocol because we only send data point-to-point.

2. Each channel processes one analog signal, which is split into two parts on a separate FEE board. One part generates the ADC input signal; the other generates the TDC input signal (the trigger signal, which just filters the analog signal's leading edge). So if two channels have events at the same time, they should have the same timestamp, and they should in fact arrive at the Stratix IV GX board at the same time. But there may be some non-pipelined processing that prevents the two events from arriving at the Stratix IV GX at the same time. We must take care of these non-pipeline issues (such as the offset between different GXBs).

3. Except for the FEE board, all digitizer boards use the same oscillator source from the backplane board; all Arria GX boards and the Stratix IV GX board are plugged into a single backplane. The hardware clock tree has several clock distributor chips on the backplane and on each sub-board.

Thanks

Jerry

Altera_Forum · ‎08-08-2013

Hi Jerry,

You need to investigate your data transport in a little more detail. You also need to document those tests. Here's an example for an ADC I am testing:

http://www.ovro.caltech.edu/~dwh/wbsddc/hittite_adc_hw.pdf

In this case, I had no choice but to use a PRBS generator for lane-to-lane synchronization. In your case you can use 8/10B encoding and create logic to align lanes.

You need to focus on getting your data transport between multiple FPGAs synchronous, and aligning their pipeline delays after power-on reset.

Create a Modelsim simulation with two FPGAs, multiple transceiver signals, delay the transceiver signals by different amounts (more than a few bit periods), and then try to resynchronize, i.e., align the parallel output words, in the receiver FPGA.

Cheers,

Dave

Altera_Forum · ‎08-09-2013

Hi Dave:

Can you guide me on how to simulate two FPGAs? As I understand it, one testbench runs inside a single chip!

Altera_Forum · ‎08-09-2013

--- Quote Start ---

Can you guide me on how to simulate two FPGAs? As I understand it, one testbench runs inside a single chip!

--- Quote End ---

The simulator does not know about chips, it just knows about HDL code, so your simulation of two FPGAs is a top-level design containing two instances of a transceiver block with TX1->RX2 and RX1->TX2.

You could also think of this as one chip with two transceiver blocks connected.
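The structure described above can be mimicked with a toy behavioural model before writing any HDL. In the Python sketch below, one top level holds two `Transceiver` instances wired as TX1->RX2 and TX2->RX1, with a different bit delay on each link. The class, the delays, and the payload words are all invented for illustration; a real Modelsim testbench would instantiate the actual transceiver simulation models instead.

```python
from typing import List

WORD_BITS = 10


class Transceiver:
    """Toy model: serializes 10-bit words and deserializes on a fixed boundary."""

    def __init__(self, name: str) -> None:
        self.name = name

    def transmit(self, words: List[int]) -> List[int]:
        """Serialize words to bits, most significant bit first."""
        bits: List[int] = []
        for word in words:
            bits.extend((word >> (WORD_BITS - 1 - i)) & 1 for i in range(WORD_BITS))
        return bits

    def receive(self, bits: List[int]) -> List[int]:
        """Deserialize starting at bit 0, with no alignment logic at all."""
        words = []
        for i in range(0, len(bits) - WORD_BITS + 1, WORD_BITS):
            word = 0
            for bit in bits[i:i + WORD_BITS]:
                word = (word << 1) | bit
            words.append(word)
        return words


def link(bits: List[int], delay_bits: int) -> List[int]:
    """Model a serial link that adds delay_bits of latency (idle zeros first)."""
    return [0] * delay_bits + bits


# "Top level": two instances, cross-connected, with different link delays.
fpga1 = Transceiver("fpga1")
fpga2 = Transceiver("fpga2")
payload = [0x17C, 0x001, 0x002, 0x003]

to_fpga2 = link(fpga1.transmit(payload), 13)   # TX1 -> RX2
to_fpga1 = link(fpga2.transmit(payload), 27)   # TX2 -> RX1

seen_by_fpga2 = fpga2.receive(to_fpga2)
seen_by_fpga1 = fpga1.receive(to_fpga1)
```

Both receivers see scrambled words because neither delay is a multiple of the word width; the alignment logic under test would have to discover the 13-bit and 27-bit offsets on its own and reframe the data.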

Cheers,

Dave

Altera_Forum · ‎08-10-2013

OK, I think I understand you!