YYang79
Beginner
163 Views

Cyclone10GX Temperature Issue

Hi,

I use a Cyclone 10 GX FPGA (10CX150YF780C5G) on my PCIe card. One transceiver is used as a 10 Gbps fiber link and four other transceivers are used as PCIe Gen1 x4 lanes. There is no external memory interface. The total power consumption is about 2 W.

The problem is that when the FPGA's die temperature rises above about 55 °C, the transceiver connected to the fiber link (SFP) shows missing-word errors on the transmit side. I use the transceiver loopback function to monitor the data at both the remote receiver output and the local loopback receiver output. They show exactly the same fault, which suggests that something is wrong in the PCS stage on the transmitter side. The reference clock to the transceiver ATX PLL is 644.53125 MHz. Other functions in the FPGA work well.
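
As a side check on the clocking, the 644.53125 MHz reference clock is consistent with a 10GBASE-R style line rate. The sketch below assumes (neither is stated above) that the link uses 64b/66b encoding and that the reference clock is the line rate divided by 16; both are assumptions for illustration only.

```python
from fractions import Fraction

# Assumption: 10 Gbps payload carried with 64b/66b line encoding.
payload_rate_bps = Fraction(10_000_000_000)
line_rate_bps = payload_rate_bps * Fraction(66, 64)

# Assumption: reference clock = line rate / 16 (hypothetical divide ratio).
refclk_hz = line_rate_bps / 16

print(f"line rate: {float(line_rate_bps) / 1e9} Gbps")
print(f"refclk:    {float(refclk_hz) / 1e6} MHz")
```

Under these assumptions the computed reference clock matches the 644.53125 MHz value quoted above, so the clock frequency itself does not look like the source of the fault.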

When I lower the temperature by adding a fan to blow on it, it works normally. Does anyone know how to fix it?

Thanks a lot!

 

14 Replies
SengKok_L_Intel
Moderator
136 Views

Hi,


To better understand the problem: do you mean that with the internal serial loopback enabled (no cable used), the same problem is observed when the temperature rises above 55 °C, while PCIe and the other functions still work well?


How many devices have a similar problem? Could you please try to test on another channel to determine if there is any channel dependency on this particular device?


Regards -SK


YYang79
Beginner
114 Views

Hi,

You are right. Using the internal loopback (from the transmitter's PCS output back to the receiver's PCS input), we see wrong data at the XGMII interface of the transceiver (mainly a missing 64-bit word in a packet) when the operating temperature rises to 55 °C. In loopback mode the data is still sent to the remote receiver over the fiber link, so we can see that the remote receiver output shows the same data loss as the loopback output.

The design gets a clean Timing Analyzer result without any timing violations.

We have five boards in total. They fail at different temperatures: the lowest is 55 °C and the highest is 72 °C, both well short of the specified 100 °C. The board design uses only one transceiver as a 10 Gbps fiber link, so it may not be easy to switch to another transceiver in the FPGA to test other channels.

Thanks. Looking forward to your further help.

SengKok_L_Intel
Moderator
108 Views

Hi,


If the problem can be seen with serial loopback enabled, you should be able to test other channels as well, since loopback does not use the physical channel (fiber cable). You only need to change the pin assignment.


Do you have a Signal Tap capture that shows the pass and fail conditions?


Regards -SK


YYang79
Beginner
105 Views

Yes I have.

See attached. It shows the data communication at the XGMII interface.

data_tx_rx.png

SengKok_L_Intel
Moderator
94 Views

Hi,


Can you provide a simplified design, consisting of only one 10G channel, that replicates the issue on your hardware? That would help me understand the settings used in the transceiver and the 10G MAC IP.


Also, please make sure your board design meets the Pin Connection Guidelines, especially for the power supply and transceiver pins.

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/dp/cyclone-10/pcg-01022.pdf


YYang79
Beginner
86 Views

Hi,

The board works well below 55 °C, so I think the PCB design should be OK, and the pin assignments do not appear to have any problem so far.

For your further review, I can archive my FPGA design and email it to you instead of posting it here. Please give me an email address. Would that work?

Thanks a lot for your help.

SengKok_L_Intel
Moderator
79 Views

Please refer to the following link (Table 1) and check the GXB power supplies (e.g. Vcct_GXB, Vccr_GXB, and Vcch_GXB) to determine whether there is any difference between the pass and fail cases. The FPGA is supposed to work above 55 °C, and since you see the failure on multiple boards, a board issue is suspected.


https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/cyclone-10/c10gx-51002.p...


Also, you can send me a private message with the simplified design so that I can do a sanity check on it. Thanks.


YYang79
Beginner
77 Views

How do I send you a private message?

SengKok_L_Intel
Moderator
43 Views

Click the Message icon (top right), then click Compose to create a new message. You can find my name in the “Send to” field.

 

Regards -SK


YYang79
Beginner
34 Views

I have checked the GXB power supplies. They are all within the ranges in Table 1 and do not change with temperature.

SengKok_L_Intel
Moderator
27 Views

I found a setup timing violation in your design. I suggest closing timing first and then checking whether the problem persists.


YYang79
Beginner
24 Views

There is no improvement with a clean timing result.

SengKok_L_Intel
Moderator
17 Views

Hi,

 

Since this problem can be replicated with internal loopback, can you please change the pin assignments as below and determine whether there is any channel dependency?

 

set_location_assignment PIN_AF25 -to "SFP_RXD0(n)"

set_location_assignment PIN_AG27 -to "SFP_TXD0(n)"

set_location_assignment PIN_AF26 -to SFP_RXD0

set_location_assignment PIN_AG28 -to SFP_TXD0

 

Also, please refer to the attached screenshot: add the Native PHY interface signals to Signal Tap, then compare "tx_parallel_data" and "rx_parallel_data" to determine whether there is a mismatch. The data drop may happen before this module.
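
One possible way to do this comparison offline is sketched below. It assumes the two buses have been exported from the capture as lists of integer words covering the same packet, with idle cycles already trimmed; the function name and the sample values are hypothetical and are not taken from the design. It uses Python's standard `difflib` to report words present on the TX side that never show up on the RX side.

```python
from difflib import SequenceMatcher

def find_missing_words(tx_words, rx_words):
    """Return (tx_index, word) for each TX word with no counterpart in RX."""
    matcher = SequenceMatcher(a=tx_words, b=rx_words, autojunk=False)
    missing = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "delete" means a run of TX words is absent from RX (dropped words).
        # A "replace" tag would instead indicate corrupted words; not reported here.
        if tag == "delete":
            missing.extend((i, tx_words[i]) for i in range(i1, i2))
    return missing

# Hypothetical capture: the third TX word never arrives on the RX side.
tx_capture = [0x10, 0x11, 0x12, 0x13, 0x14]
rx_capture = [0x10, 0x11, 0x13, 0x14]

for index, word in find_missing_words(tx_capture, rx_capture):
    print(f"TX word {index} (0x{word:016x}) is missing on RX")
```

If the two lists match at this point but the XGMII-side data does not, the drop is more likely happening in the logic before the Native PHY.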

 

 

 

 

YYang79
Beginner
10 Views

Hi,

We always test the data at these points; we usually call them the XGMII interface. The only difference is a logical conversion of words from big endian to little endian between the Native PHY parallel interface and XGMII. The data at XGMII looks OK.

I will add "tx_parallel_data" and "rx_parallel_data" to my Signal Tap capture to see if there is any difference.
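
For comparing captures taken on either side of the endianness conversion mentioned above, a minimal sketch of the byte-order swap on one 64-bit data word follows. It assumes the conversion is a plain byte reversal of the 64-bit data word and ignores any control bits; the function name and example value are hypothetical.

```python
def swap_word_endianness(word: int) -> int:
    """Reverse the byte order of one 64-bit data word."""
    if not 0 <= word < 1 << 64:
        raise ValueError("word must fit in 64 bits")
    return int.from_bytes(word.to_bytes(8, "big"), "little")

xgmii_word = 0x0102030405060708  # hypothetical example value
phy_word = swap_word_endianness(xgmii_word)
print(hex(phy_word))
```

Applying the swap to one capture before comparing it with the other keeps the conversion itself from being reported as a mismatch.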