Hi,
I use a Cyclone 10 GX FPGA (10CX150YF780C5G) on my PCIe card. One transceiver is used as a 10 Gbps fiber link, and four other transceivers are used as PCIe Gen1 x4 lanes. There is no external memory interface. The total power consumption is about 2 W.
The problem is that when the FPGA's die temperature rises above about 55C, the transceiver that drives the fiber link (SFP) starts dropping words on the transmit side. I use the transceiver loopback function to monitor the data at both the remote receiver output and the local loopback receiver output. They show exactly the same fault, which suggests that something is wrong in the PCS stage on the transmitter side. The reference clock to the transceiver ATX PLL is 644.53125 MHz. The other functions in the FPGA are working well.
When I lower the temperature by adding a fan, it works normally. Does anyone know how to fix this?
Thanks a lot!
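For readers following along, the 644.53125 MHz reference clock can be sanity-checked against the link rate. The sketch below assumes the fiber link is 10GBASE-R, i.e. a 10.3125 Gbps serial rate with 64b/66b encoding; the thread only says "10Gbps", so that rate and the function names are assumptions for illustration.

```python
from fractions import Fraction

# Assumed 10GBASE-R serial line rate; the thread only states "10Gbps".
LINE_RATE_HZ = 10_312_500_000
# Reference clock quoted in the post: 644.53125 MHz.
REFCLK_HZ = 644_531_250


def refclk_ratio(line_rate_hz: int, refclk_hz: int) -> Fraction:
    """Exact ratio between the serial line rate and the reference clock."""
    return Fraction(line_rate_hz, refclk_hz)


def payload_rate(line_rate_hz: int) -> Fraction:
    """Payload bit rate after removing the 64b/66b encoding overhead."""
    return Fraction(line_rate_hz) * 64 / 66


if __name__ == "__main__":
    print("line rate / refclk =", refclk_ratio(LINE_RATE_HZ, REFCLK_HZ))
    print("payload rate (bps) =", payload_rate(LINE_RATE_HZ))
```

Under these assumptions the line rate is an exact integer multiple of the reference clock, so the reference frequency itself is consistent with a 10GBASE-R link.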
Hi,
To better understand the problem: do you mean that with the internal serial loopback enabled (without a cable), the same problem is observed when the temperature rises above 55C, while PCIe and the other functions still work well?
How many devices have a similar problem? Could you please try to test on another channel to determine if there is any channel dependency on this particular device?
Regards -SK
Hi,
You are right. Using the internal loopback (from the transmitter's PCS output to the receiver's PCS input) and observing the XGMII interface of the transceiver, we get wrong data (mainly a 64-bit word missing from a packet) when the operating temperature rises to 55C. In loopback the data is still sent to the remote receiver over the fiber link, so we can see that the remote receiver output shows the same data loss as the loopback output.
For the design, the Timing Analyzer result is clean, without any timing violations.
We have five boards in total. They fail at different temperatures: the lowest is 55C and the highest is 72C, well short of the specified 100C. The board design uses only one transceiver as a 10 Gbps fiber link, so it may not be easy to switch to another transceiver in the FPGA to test other channels.
Thanks. Looking forward to your further help.
Hi,
If the problem is visible with the serial loopback enabled, you should be able to test other channels as well, since the loopback does not use the physical channel (fiber cable). Changing the pin assignment is enough.
Do you have a Signal Tap capture that shows the pass and fail conditions?
Regards -SK
Yes, I have.
See attached. It shows the data at the XGMII interface.
Hi,
Can you provide a simplified design with only one 10G channel that reproduces the issue on your hardware, so that I can better understand how the transceiver and the 10G MAC IP are configured?
Also, please make sure your board design follows the pin connection guidelines, especially for the power supply and transceiver pins.
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/dp/cyclone-10/pcg-01022.pdf
Hi,
The board works well below 55C, so I think the PCB design should be OK, and the pin assignments do not appear to have any problem so far.
For your further review, I can archive my FPGA design and email it to you instead of posting it here. Please give me an email address. Would that work?
Thanks a lot for your help.
Please refer to the following link (Table 1) and check the GXB power supplies (e.g. Vcct_GXB, Vccr_GXB, and Vcch_GXB) to determine whether there is any difference between a passing and a failing case. The FPGA is supposed to work above 55C, and since you see the failure on multiple boards, I suspect a board issue.
Also, you can send me a private message with the simple design so that I can do a sanity check on it. Thanks.
I have checked the GXB power supplies. They are all within the ranges in Table 1 and do not change with temperature.
You can click the Message icon (top right) and then click Compose to create a new message. You can find my name in the "send to" field.
Regards -SK
I found a setup timing violation in your design. I would suggest cleaning up the timing first and then checking whether the problem persists.
There is no improvement with a clean timing result.
Hi,
Since this problem can be reproduced with internal loopback, can you please change the pin assignments as below and determine whether there is any channel dependency?
set_location_assignment PIN_AF25 -to "SFP_RXD0(n)"
set_location_assignment PIN_AG27 -to "SFP_TXD0(n)"
set_location_assignment PIN_AF26 -to SFP_RXD0
set_location_assignment PIN_AG28 -to SFP_TXD0
Also, please refer to the attached screenshot, add the Native PHY interface signals to Signal Tap, and then compare "tx_parallel_data" with "rx_parallel_data" to determine whether there is a mismatch. The data drop may happen before this module.
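The comparison suggested above can also be done offline on exported capture data. The following is a minimal sketch, assuming the "tx_parallel_data" and "rx_parallel_data" samples have already been exported as two lists of integer words, and that the only fault is dropped words (no corruption or reordering). The function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def find_dropped_words(tx: List[int], rx: List[int]) -> Tuple[List[int], bool]:
    """Align rx against tx and report which tx words are missing from rx.

    Returns (dropped_indices, fully_aligned). fully_aligned is False when
    rx contains words that could not be matched in order, which would mean
    the "drops only" assumption does not hold for this capture.
    """
    dropped = []
    j = 0  # position of the next rx word still waiting for a match
    for i, word in enumerate(tx):
        if j < len(rx) and rx[j] == word:
            j += 1  # word arrived, advance in rx
        else:
            dropped.append(i)  # word was sent but not seen at this point
    return dropped, j == len(rx)


if __name__ == "__main__":
    # Hypothetical capture: eight transmitted words, the fourth one lost.
    tx_words = [0x1000 + n for n in range(8)]
    rx_words = tx_words[:3] + tx_words[4:]
    print(find_dropped_words(tx_words, rx_words))
```

If the second return value is False, the received stream is not simply the transmitted stream with words removed, and the capture needs a closer look by hand.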
Hi,
We always test the data at these points. We usually call them the XGMII interface. The only difference between the Native PHY parallel data and XGMII is a logical conversion of the words from big endian to little endian. The data at XGMII looks OK.
I will add "tx_parallel_data" and "rx_parallel_data" to my Signal Tap file to see whether there is any difference.
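When comparing captures taken on the two sides of that conversion, the byte order has to be normalized first. The sketch below assumes the conversion is a plain byte reversal within each 64-bit word; the thread does not give the exact mapping, so treat both the width and the byte-wise swap as assumptions.

```python
def swap_word_endianness(word: int, width_bytes: int = 8) -> int:
    """Reverse the byte order of an unsigned word of width_bytes bytes."""
    return int.from_bytes(word.to_bytes(width_bytes, "big"), "little")


if __name__ == "__main__":
    sample = 0x0102030405060708  # hypothetical word, not taken from a capture
    print(hex(swap_word_endianness(sample)))
```

Applying the function twice returns the original word, so the same helper can be used in either direction.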
Please do let me know if more help is needed here. Thanks.
Yes. We are still struggling with this thermal issue. We have checked all VCC power supplies to the Cyclone 10. They are all within the specified ranges and do not change much as the temperature increases (less than 60 mV). We have used the Toolkit to test the PMA layer of the transceiver. The result shows no bit errors below 65C (our problem usually occurs at 64C or below).
We don't have any obvious clue for fixing it now. Any further suggestions are definitely welcome!
For the bit error rate (BER), you can probably experiment with the PMA settings, e.g. increase the VOD of the transmitter.
Since there are no bit errors when using the Toolkit, I don't think we need to adjust the PMA settings. Am I right?
I am currently working with Intel's "Low Latency Ethernet 10G MAC Intel® Cyclone® 10 GX FPGA IP Design Example", trying to port it to our board to see what happens.
Do you have any suggestions?
Anyway, thanks a lot.
Yes, using the LL 10G MAC IP example design sounds good. If the PMA settings are not optimal, you may see a high bit error rate as the temperature varies.
In the LL 10G MAC IP example design (rev 19.1), an IOPLL is used to generate the 156.25 MHz and 312.5 MHz clocks, while in my design I used an fPLL instead, following the transceiver design guideline. That is the only difference. The example design works well on my board, so I changed my design to use an IOPLL. The result is that it now works at high temperature without error! It is at 73C now, whereas it used to fail at 57C.
It looks like a big improvement, at least.
My question is: why does this make a difference?
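As background on the two PLL outputs mentioned here, the sketch below checks that they describe the same data throughput. It assumes the standard 10G Ethernet values of 156.25 MHz and 312.5 MHz, clocking a 64-bit and a 32-bit datapath respectively; the datapath widths are not stated in the thread and are an assumption. It does not explain the temperature behaviour, it only shows the relationship the two clocks must keep.

```python
def parallel_throughput_bps(clock_hz: int, width_bits: int) -> int:
    """Bits per second carried by a parallel bus of width_bits at clock_hz."""
    return clock_hz * width_bits


# Assumed clock/width pairs (not confirmed by the thread).
SLOW_CLOCK_HZ, SLOW_WIDTH_BITS = 156_250_000, 64
FAST_CLOCK_HZ, FAST_WIDTH_BITS = 312_500_000, 32

if __name__ == "__main__":
    slow = parallel_throughput_bps(SLOW_CLOCK_HZ, SLOW_WIDTH_BITS)
    fast = parallel_throughput_bps(FAST_CLOCK_HZ, FAST_WIDTH_BITS)
    print("slow domain bps:", slow)
    print("fast domain bps:", fast)
    print("clock ratio:", FAST_CLOCK_HZ // SLOW_CLOCK_HZ)
```

Because the fast clock is exactly twice the slow one, both outputs are normally derived from the same PLL so that the two domains stay frequency-locked.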
