E-Tile PTP timestamping issues

alexforencich
Novice

I am running into some odd issues with PTP timestamping on the E-Tile on both Stratix 10 DX (Intel S10 DX dev kit) and Agilex F (DE10-Agilex).

Ultimately, I want to run a bunch of 25G+RS-FEC+PTP channels, with the ability to enable/disable RS-FEC and switch between 10G and 25G on any combination of channels, and to have PTP timestamps provided for all TX and RX packets (2-step only). I also want to build a separate design to do something similar for 100G. For now, though, I am starting out with 10G+PTP just to verify that everything works.

I generated a 4-channel core in "100GE or 1 to 4 10GE/25GE Channels with optional RS-FEC and PTP" mode, with the link rate set to 10G. I connected the core according to the documentation (o_clk_pll_div64[4] feeding all of the TX and RX clocks, and o_clk_pll_div66[3:0]/o_clk_rec_div66[3:0] feeding the PTP CDC modules and PTP TOD clocks). Since I need PTP TX timestamps for all packets, I tied i_sl_ptp_ts_req high. I also have a state machine that drives the transceiver reconfig interface to set up equalization so the links actually come up. I managed to get the core operational in terms of sending and receiving packets at full line rate, as well as PTP time sync with an unloaded link. However, I am seeing some strange behavior with RX PTP timestamps when the TX side of the link is loaded. The problem is definitely not in the core logic of the design, because the exact same code with my own soft MAC works fine on many other FPGA boards, both Xilinx and Intel.

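The gist of it is this: when the link is lightly loaded, ptp4l works fine and can time sync to well under 100 ns. But when the TX side of the link is loaded up to full 10G line rate, ptp4l loses sync: the reported offsets jump from a few ns into the tens of milliseconds, and clockcheck reports repeated clock jumps.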

ptp4l output with a lightly loaded link, or with the RX side of the link loaded at full line rate:

ptp4l[14517.872]: rms    3 max    7 freq +31715 +/-   4 delay   177 +/-   0
ptp4l[14518.872]: rms    5 max    8 freq +31709 +/-   4 delay   178 +/-   0
ptp4l[14519.873]: rms    2 max    4 freq +31716 +/-   1 delay   178 +/-   0
ptp4l[14520.873]: rms    4 max    5 freq +31721 +/-   2 delay   178 +/-   0
ptp4l[14521.874]: rms    2 max    3 freq +31717 +/-   3
ptp4l[14522.874]: rms    5 max    8 freq +31722 +/-   5 delay   177 +/-   0
ptp4l[14523.874]: rms    3 max    7 freq +31723 +/-   4 delay   178 +/-   0
ptp4l[14524.875]: rms    2 max    3 freq +31720 +/-   2 delay   178 +/-   0
ptp4l[14525.875]: rms    3 max    3 freq +31726 +/-   1 delay   178 +/-   0
ptp4l[14526.875]: rms    2 max    3 freq +31726 +/-   2 delay   178 +/-   0
ptp4l[14527.876]: rms   14 max   16 freq +31700 +/-   8 delay   178 +/-   0
ptp4l[14528.876]: rms    8 max   14 freq +31696 +/-   4 delay   178 +/-   0

ptp4l output when I start iperf or netperf to load the TX side of the link:

ptp4l[13976.562]: rms    6 max   10 freq +31752 +/-   5 delay   170 +/-   0
ptp4l[13977.562]: rms    5 max    7 freq +31738 +/-   3
ptp4l[13978.562]: rms    7 max   12 freq +31730 +/-   6 delay   171 +/-   0
ptp4l[13979.563]: rms    6 max    8 freq +31723 +/-   2
ptp4l[13980.563]: rms    4 max    6 freq +31724 +/-   4 delay   170 +/-   0
ptp4l[13981.565]: rms    2 max    4 freq +31725 +/-   3 delay   170 +/-   0
ptp4l[13982.564]: rms    6 max    8 freq +31736 +/-   3
ptp4l[13983.564]: rms    3 max    7 freq +31730 +/-   4 delay   170 +/-   0
ptp4l[13984.564]: rms    3 max    6 freq +31725 +/-   2 delay   171 +/-   0
ptp4l[13985.565]: rms    1 max    3 freq +31729 +/-   2 delay   170 +/-   0
ptp4l[13986.565]: rms    3 max    4 freq +31734 +/-   2 delay   171 +/-   0
ptp4l[13987.566]: rms    3 max    5 freq +31729 +/-   4 delay   171 +/-   0
ptp4l[13988.566]: rms    2 max    4 freq +31732 +/-   3 delay   170 +/-   0
ptp4l[13989.566]: rms    3 max    5 freq +31735 +/-   3 delay   171 +/-   0
ptp4l[13990.567]: rms    4 max    6 freq +31739 +/-   3
ptp4l[13991.569]: rms    1 max    2 freq +31737 +/-   1 delay   169 +/-   0
ptp4l[13992.567]: rms 5931645 max 16777227 freq -2981532 +/- 7972308 delay   171 +/-   0
ptp4l[13993.568]: rms 13296276 max 24950792 freq -8791952 +/- 17607933 delay   172 +/-   0
ptp4l[13993.943]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[13993.943]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[13994.193]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[13994.443]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[13994.443]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[13994.568]: rms 37915595 max 71180908 freq -61956841 +/- 24467336
ptp4l[13994.568]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[13995.194]: clockcheck: clock jumped backward or running slower than expected!
ptp4l[13995.194]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[13995.319]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[13995.569]: clockcheck: clock jumped backward or running slower than expected!
ptp4l[13995.569]: rms 39020665 max 67939760 freq +13478253 +/- 36947291
ptp4l[13995.694]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[13996.569]: rms 25658242 max 36598353 freq -35742409 +/- 17569374 delay   172 +/-   0
ptp4l[13996.821]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[13996.821]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[13996.944]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[13997.569]: rms 80210819 max 103599354 freq +51717324 +/- 55408814
ptp4l[13998.570]: rms 17580185 max 29116723 freq +30207032 +/- 21404769 delay   172 +/-   0

I did some work to isolate what is going on here: which conditions trigger the issue, and what exactly the issue is.

Some testing with iperf and UDP netperf indicates that this only occurs when the TX link is saturated. Saturating the RX link appears to have no effect, and ptp4l operates normally.
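A minimal sketch of how the ptp4l summary lines could be scanned to find the moments where sync breaks, so that they can be lined up against when iperf or netperf was started. The function name and the 1000 ns threshold are assumptions chosen for illustration; they are not part of ptp4l:

```python
import re

# Matches ptp4l summary lines such as:
#   ptp4l[13992.567]: rms 5931645 max 16777227 freq -2981532 +/- 7972308 delay   171 +/-   0
# Lines without "rms ... max ..." (clockcheck, port state changes) are ignored.
SUMMARY_RE = re.compile(
    r"ptp4l\[(?P<t>\d+\.\d+)\]:\s+rms\s+(?P<rms>\d+)\s+max\s+(?P<max>\d+)"
)


def find_faults(lines, rms_limit_ns=1000):
    """Return (timestamp, rms_ns, max_ns) for each summary line whose rms exceeds the limit."""
    faults = []
    for line in lines:
        m = SUMMARY_RE.search(line)
        if m and int(m.group("rms")) > rms_limit_ns:
            faults.append(
                (float(m.group("t")), int(m.group("rms")), int(m.group("max")))
            )
    return faults


if __name__ == "__main__":
    # Three lines copied from the loaded-TX log in this post.
    sample = [
        "ptp4l[13991.569]: rms    1 max    2 freq +31737 +/-   1 delay   169 +/-   0",
        "ptp4l[13992.567]: rms 5931645 max 16777227 freq -2981532 +/- 7972308 delay   171 +/-   0",
        "ptp4l[13993.943]: clockcheck: clock jumped forward or running faster than expected!",
    ]
    for t, rms, peak in find_faults(sample):
        print(f"sync fault at {t}: rms {rms} ns, max {peak} ns")
```

Feeding it the full ptp4l output gives the time of the first bad summary interval, which can then be compared with the start time of the traffic generator.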

To determine whether the RX or TX timestamps are affected, I lowered the delay request rate to one every 8 seconds. Since the slave only uses TX timestamps for delay requests, a problem with TX timestamps should only show up when a delay request is sent. The result is below: ptp4l loses sync between delay requests, which indicates that the problem must be in the RX timestamps:

ptp4l[14624.990]: rms    4 max    7 freq +31718 +/-   4
ptp4l[14625.991]: rms    5 max    7 freq +31711 +/-   3
ptp4l[14626.991]: rms   12 max   14 freq +31694 +/-   7
ptp4l[14627.991]: rms    5 max   10 freq +31697 +/-   5
ptp4l[14628.992]: rms   17 max   24 freq +31728 +/-  14
ptp4l[14629.992]: rms   18 max   27 freq +31750 +/-   5
ptp4l[14630.992]: rms    3 max    5 freq +31730 +/-   3
ptp4l[14631.993]: rms    6 max   10 freq +31721 +/-   4
ptp4l[14632.993]: rms    7 max   10 freq +31714 +/-   4
ptp4l[14633.994]: rms    6 max    8 freq +31708 +/-   1
ptp4l[14634.994]: rms    2 max    3 freq +31716 +/-   2
ptp4l[14635.994]: rms    2 max    4 freq +31717 +/-   3 delay   179 +/-   0
ptp4l[14636.995]: rms    2 max    3 freq +31715 +/-   2 delay   179 +/-   0
ptp4l[14637.995]: rms    2 max    4 freq +31715 +/-   3
ptp4l[14638.996]: rms    2 max    3 freq +31714 +/-   2
ptp4l[14639.996]: clockcheck: clock jumped backward or running slower than expected!
ptp4l[14639.996]: rms 35705352 max 94338668 freq -16234332 +/- 18693703
ptp4l[14639.996]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[14640.121]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[14640.371]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[14640.996]: rms 46968261 max 77937090 freq -38852541 +/- 27270782
ptp4l[14641.997]: rms 48110793 max 55270185 freq +21331970 +/- 6536553
ptp4l[14642.997]: rms 23710743 max 33965198 freq +23975581 +/- 3057741
ptp4l[14643.997]: rms 5326808 max 10009657 freq +11328001 +/- 3569445
ptp4l[14644.998]: rms 2746413 max 3324736 freq +2025728 +/- 1732478
ptp4l[14645.998]: rms 2818585 max 3289068 freq -1313456 +/- 340920
ptp4l[14646.998]: rms 1342013 max 1943718 freq -1368886 +/- 193622
ptp4l[14647.999]: rms 23896054 max 50355276 freq -18576752 +/- 30977102
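The reasoning behind this test can be illustrated with the standard PTP delay request-response arithmetic. Each Sync message yields an offset from t1 (master TX timestamp) and t2 (local RX timestamp) plus the most recently measured path delay; the local TX timestamp t3 only enters the calculation when a delay request is sent. The following is a sketch with made-up timestamps, not ptp4l's actual code (ptp4l also filters the delay and runs a servo on the offset):

```python
def path_delay(t1, t2, t3, t4):
    """Mean path delay from one Sync (t1, t2) and one Delay_Req/Delay_Resp exchange (t3, t4)."""
    return ((t2 - t1) + (t4 - t3)) / 2


def offset_from_master(t1, t2, delay):
    """Offset computed for each Sync: uses t1, the local RX timestamp t2, and the stored delay."""
    return (t2 - t1) - delay


if __name__ == "__main__":
    # Hypothetical timestamps in ns: local clock 5 ns ahead, one-way delay 178 ns.
    t1, t2 = 1_000_000, 1_000_183  # Sync: master TX, local RX
    t3, t4 = 2_000_005, 2_000_178  # Delay_Req: local TX, master RX

    delay = path_delay(t1, t2, t3, t4)
    print("delay:", delay)
    print("offset:", offset_from_master(t1, t2, delay))

    # A corrupted RX timestamp shifts the offset by the same amount,
    # even if no delay request (and therefore no TX timestamp) is involved.
    rx_error = 1_000_000  # hypothetical 1 ms error on t2
    print("offset with bad t2:", offset_from_master(t1, t2 + rx_error, delay))
```

With delay requests sent only once every 8 seconds, any offset excursion that appears between two delay requests comes from the Sync path, which on the local side involves only RX timestamps.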

One additional possibility I could think of is that smaller-than-minimum-length TX frames (which would be padded to the minimum length by the MAC) could be causing problems. I made some adjustments to the device driver to pad all frames out to at least 60 bytes, but this did not change the behavior. Some additional testing also confirmed that the ptp4l failures did not coincide with short frames being sent, which further rules out this possibility.
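For reference, the padding step amounts to something like the following. This is a Python sketch of the idea only; the actual change was made in the device driver, and the function and constant names here are hypothetical. The 60-byte figure is the 64-byte minimum Ethernet frame length minus the 4-byte FCS that the MAC appends:

```python
ETH_MIN_FRAME_NO_FCS = 60  # 64-byte minimum Ethernet frame minus the 4-byte FCS


def pad_frame(frame: bytes, min_len: int = ETH_MIN_FRAME_NO_FCS) -> bytes:
    """Zero-pad a frame (without FCS) up to min_len; longer frames are returned unchanged."""
    if len(frame) < min_len:
        return frame + bytes(min_len - len(frame))
    return frame


if __name__ == "__main__":
    short = bytes(42)   # hypothetical 42-byte frame, e.g. a small ARP packet
    full = bytes(1514)  # full-size frame, left untouched
    print(len(pad_frame(short)), len(pad_frame(full)))
```

Padding in software means the MAC never sees a short frame, so any MAC-side padding logic is taken out of the picture.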

So, as far as I can tell, there are basically two possibilities at this point. Either (1) I am doing something wrong in how the core is configured or connected (maybe some register comes up with an insane value out of reset and some poking via AVMM is required for things to work, or maybe I missed a config setting in the UI), or (2) there is a bug in the E-Tile or in the soft interface logic. Has the E-Tile been verified to work correctly in 4-lane 10G/25G mode with i_sl_ptp_ts_req tied high? My best guess at this point is that there is some sort of interaction between the TX and RX paths, either in the AIB interface soft logic or in the E-Tile itself, that is triggered by adjacent frames requesting TX timestamping and results in corrupted RX timestamps.

alexforencich
Novice

I just finished testing with both "basic mode" and "advanced mode" - same results.  Works fine when the link is lightly loaded, RX timestamps break when the TX link is loaded at line rate.

Paveetirra_Srie
Employee

Hi Alex,


I have read through your explanation. Did you try generating the design example based on your IP settings?

If not, I suggest generating the design example and comparing it with your current design. That will help us find what differs between the design example and your own design.


Regards,

Pavee


alexforencich
Novice

I can generate one, but what should I be looking for?

Anyway, the full project source code is here: https://github.com/alexforencich/corundum/tree/master/fpga/mqnic/DE10_Agilex/fpga_25g

The exported IP TCL files are here: https://github.com/alexforencich/corundum/tree/master/fpga/mqnic/DE10_Agilex/fpga_25g/ip .  Both the 10G and 25G variations present the same issue.

The E-Tile MACs are instantiated in this module: https://github.com/alexforencich/corundum/blob/master/fpga/mqnic/DE10_Agilex/fpga_25g/rtl/eth_mac_dual_quad_wrapper.v

Paveetirra_Srie
Employee

Hi Alex,


Good day.


I have checked the user guide for the E-Tile Hard IP for Ethernet Intel FPGA IP core. Since you are facing an issue with PTP timestamping, I believe the PTP guidelines there apply: please refer to section 2.9.2.14, PTP System Considerations, and check whether those guidelines have been taken into consideration in your design.


Regards,

Pavee


Paveetirra_Srie
Employee

Hi,


Regarding the design example, you can check whether the same issue happens with it.

You may try to load up the TX side of the link to full 10G line rate and observe the ptp4l behavior.


Kindly do let me know if it helps.


Regards,

Pavee


alexforencich
Novice

Is there a design example that connects the E-tile to ptp4l in some way?  If you can point me at something that I can try running on my DE10-Agilex board, I would be happy to give it a shot. 

Paveetirra_Srie
Employee

Hi Alex,


I have tried to find the design example user guide for the E-Tile that covers PTP.

Attaching the link here. Hope it helps you.

https://www.intel.com/content/www/us/en/docs/programmable/683860/21-3/about-e-tile-hard-ip-design-example.html3


Do let me know if you need further support.


Regards,

Pavee


Paveetirra_Srie
Employee

We have not received a response from you to my previous reply. This thread will be transitioned to community support.

If you have a new question, feel free to open a new thread to get support from Intel experts.

Otherwise, the community users will continue to help you on this thread. 

Thank you.


jiez
Novice

@alexforencich Hi Alex, have you found a solution for this issue? Thanks.

alexforencich
Novice

No, I have not had the time to dig into this further, nor to test it on the latest version of Quartus to see if it might have been fixed.

jiez
Novice

@alexforencich Thank you for your reply. We just found out that the similar issue we saw might be related to the host NIC. The E810 card does not seem to have this issue, but the XXV710 card does.
