I211 Reliability Issue During Cold Reset

AMo00
Beginner

Hi,

 

Approximately 10% of our boards (6 of 63) have issues with Ethernet initialization (i211, part number WGI211ATSLJXZ).

About 90% of our PCBs have no issues. Each of the remaining six boards has its own cold-reset failure rate (at room temperature: 55%, 33%, 32%, 31%, 9%, and 7%).
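
The per-board failure rates above also set how many cold resets a test run needs before a board can be called good. A minimal sketch, assuming each cold reset is an independent trial with a constant failure probability (an assumption, not something established in this thread); `cycles_needed` is a hypothetical helper name:

```python
import math

def cycles_needed(failure_rate, confidence=0.95):
    """Cold resets needed to see at least one failure with the given confidence.

    Assumes independent resets with a constant failure probability,
    so P(no failure in n resets) = (1 - failure_rate) ** n.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))

# Room-temperature failure rates reported for the six affected boards.
reported_rates = [0.55, 0.33, 0.32, 0.31, 0.09, 0.07]

for rate in reported_rates:
    print(f"{rate:.0%} failure rate: {cycles_needed(rate)} cold resets")
```

Under these assumptions the 7% board needs 42 consecutive clean cold resets before a change can be trusted at the 95% level, so a short test run can easily make a marginal board look healthy.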

 

If the initialization goes well, Linux can be rebooted and Ethernet communication always works. If the i211 initialization fails, however, the Linux boot hangs on the next reboot. This has been the case for three different Linux kernel versions.

 

We have tested several different Linux kernel versions and made "hundreds" of software modifications, none of which helped. We have not been able to fix this in software. When we see a failure, the chip "seems" completely dead.
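
One way to log whether the chip is "dead" on a given boot is to check whether it enumerated on the PCIe bus at all. A minimal sketch that scans the Linux sysfs PCI tree; the device ID 0x1539 is an assumption for a programmed I211 and should be confirmed on a known-good board, and `find_pci_devices` is a hypothetical helper name:

```python
from pathlib import Path

INTEL_VENDOR_ID = 0x8086
# Assumption: 0x1539 is the device ID of a programmed I211; confirm it on a
# known-good board before relying on this check.
I211_DEVICE_ID = 0x1539

def find_pci_devices(vendor_id, device_id, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses under sysfs_root that match vendor/device ID."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    matches = []
    for dev in sorted(root.iterdir()):
        try:
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries without readable ID files
        if vendor == vendor_id and device == device_id:
            matches.append(dev.name)
    return matches

if __name__ == "__main__":
    found = find_pci_devices(INTEL_VENDOR_ID, I211_DEVICE_ID)
    print("i211 enumerated at:", found if found else "not found")
```

Run once per power cycle and logged, this separates "the chip never appeared on the bus" from "the chip appeared but the driver failed", which are different failure modes to chase.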

 

See attached document for more details.

 

Do you have any suggestions as to the root cause of this failure?

 

Could you help us to debug the issue with JTAG?

 

Have you experienced any similar issues before (and found the solution)?

 

Thank you in advance!

 

Best regards,

Alexander.

7 Replies
Mike_Intel
Moderator

Hello AMo00,

Thank you for posting in Intel Ethernet Communities. Before we proceed, let me clarify the setup and the testing that you are doing with the network card. Please provide the following details:

  1. Are you designing a new board and using the network card as an on-board network card?
  2. Do you need help implementing the network card on your PCB or mainboard?

If you have questions, please let us know.

Best regards,

Michael L.
Intel Customer Support
Under Contract to Intel Corporation

AMo00
Beginner

Hi Michael,

 

  1. Yes. We have designed a board with two Ethernet ports. For one of them we use your i211 chip. It is connected over PCIe to our Freescale i.MX6Q CPU. I have compared our design to your schematic and layout design guidelines. I haven't found any noticeable differences, and I have already tested most of the things I did find, without any luck.
  2. I need help to find and solve the issue. As I wrote, more than 90% of our boards have no issues whatsoever, but the remaining 10% have i211 initialization issues. Boards with issues fail 7-55% of the time during power cycles at room temperature. When we swapped the i211 chip between a "known-good" PCB and an "issue PCB", the failure followed the i211 chip. We have tried many software and hardware changes, but none have helped so far.

 

Best regards,

Alexander.

Mike_Intel
Moderator

Hello AMo00,

Thank you for providing the information that I requested. Let me check this and get back to you.

If you have questions, please let us know.

Best regards,

Michael L.
Intel Customer Support
Under Contract to Intel Corporation

AMo00
Beginner

Hi,

 

It has now been 12 days.

 

We have now made two swaps of the i211 between known-good and issue boards. In both swaps the issue followed the i211 chip: the good boards turned bad and the bad boards turned good.

We still haven't found the actual root cause of this issue.

 

On one board that had a failure rate of 31%, we got this down to 8.7% by replacing the xtal. The replacement xtal has a specified load of 16pF. When we used 10pF load capacitors (!) instead of 27pF with this new xtal, the failure rate dropped all the way to 1%.

The only explanation I can think of is a higher voltage swing at the input?

 

  1. Do the input swing values for external oscillators (200mV max low and 1400mV min high) also apply to xtals?
  2. If yes to 1: does this value apply before or after 50 ms? After 50 ms the swing is greatly reduced.

On one issue board we measured the following with a 3.9pF probe:

10 ms to 50 ms: 10 mV to 1600 mV

After 50 ms: 320 mV to 1080 mV

 

Scope pic (Undersampled, just for voltage levels)

 

I used a spectrum analyzer with SPF = 2 kHz and RBW = 30 Hz to measure the frequencies on another issue board:

Xtal: 25 000 650 Hz

PCIe clk: 99 998 900 Hz
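
To put the two measured frequencies in perspective, they can be converted to an offset from nominal in parts per million. A minimal sketch; the nominal values of 25 MHz and 100 MHz are implied by the measurements, while the ±300 ppm limit used for the PCIe reference clock is an assumption based on the commonly cited PCIe figure and is not taken from this thread:

```python
def ppm_offset(measured_hz, nominal_hz):
    """Frequency error of a measurement relative to nominal, in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6

# Values measured in the post; nominal frequencies are the usual 25 MHz
# crystal and 100 MHz PCIe reference clock.
xtal_ppm = ppm_offset(25_000_650, 25_000_000)
refclk_ppm = ppm_offset(99_998_900, 100_000_000)

# Assumption: +/-300 ppm is the commonly cited PCIe reference clock limit.
PCIE_REFCLK_LIMIT_PPM = 300.0

print(f"Xtal offset:     {xtal_ppm:+.1f} ppm")
print(f"PCIe clk offset: {refclk_ppm:+.1f} ppm")
print("PCIe clk within limit:", abs(refclk_ppm) <= PCIE_REFCLK_LIMIT_PPM)
```

This works out to roughly +26 ppm for the xtal and -11 ppm for the PCIe clock, so both frequencies are close to nominal. The result is only as accurate as the spectrum analyzer's own frequency reference.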

 

I'm now replacing the xtal once again, trying one with 10pF Cload to get a higher swing at the input.

 

In total we have so far tried 3 different xtals, with 20pF, 18pF and 16pF Cloads. All are within spec (you recommend 16pF).

The failure rate seems to drop with lower Cloads, but we haven't found any combination that removes the issue altogether.
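
For reference, the load a crystal actually sees from two external capacitors is their series combination plus stray capacitance. A minimal sketch of that standard relation; the 3 pF stray value is purely an assumption for illustration (this board's real figure is not given in the thread), and both function names are hypothetical:

```python
def effective_load_pf(c1_pf, c2_pf, stray_pf):
    """Load seen by the crystal: series combination of C1 and C2 plus stray."""
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + stray_pf

def caps_for_target_load_pf(target_load_pf, stray_pf):
    """Equal-value C1 = C2 needed to present target_load_pf to the crystal."""
    return 2.0 * (target_load_pf - stray_pf)

# Assumption: 3 pF of stray capacitance (pins plus traces); the thread does
# not give a figure for this board.
STRAY_PF = 3.0

for caps in (27.0, 10.0):
    load = effective_load_pf(caps, caps, STRAY_PF)
    print(f"{caps:.0f} pF caps -> about {load:.1f} pF load")

print(f"Caps for a 16 pF crystal: {caps_for_target_load_pf(16.0, STRAY_PF):.0f} pF")
```

With that stray estimate, 27pF capacitors present about 16.5pF, close to a 16pF crystal's specified load, while 10pF capacitors present only about 8pF. Under-loading a crystal like this typically pulls its frequency slightly high, so the 10pF experiment changes more than just the swing.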

 

I have been through the design guidelines, both schematic and layout. I have tried practically all of the small differences I found against the schematic guidelines, without any luck.

 

Best regards,

Alexander.

 

CarlosAM_INTEL
Moderator

Hello, @AMo00​:

 

Thank you for contacting Intel Embedded Community.

 

To help you, we will contact you via email.

 

Best regards,

@Mæcenas_INTEL​.  ​

AMo00
Beginner

Hi,

 

This thread can be locked since we finally found the solution.

 

The PCIe reference clock was out of spec. Adding 470 Ohm pull-up and 56 Ohm pull-down resistors to CLKp & CLKn solved the issue.

 

To anyone making a design based on the i.MX6: add PU and PD resistors on the reference clock in addition to the 0.1uF capacitors, no matter what the reference design does. They are needed to get the correct common-mode voltage and to make it work with "any" PCIe peripheral.
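
As a rough check of what the 470 Ohm / 56 Ohm network does, each clock line can be treated as a simple resistive divider behind the AC-coupling capacitor. A minimal sketch; the 3.3 V pull-up rail is an assumption (the rail is not stated above), as is the placement of the resistors on the receiver side of the 0.1uF capacitors:

```python
def parallel(r1_ohm, r2_ohm):
    """Equivalent resistance of two resistors in parallel."""
    return r1_ohm * r2_ohm / (r1_ohm + r2_ohm)

def divider_voltage(v_supply, r_pullup_ohm, r_pulldown_ohm):
    """DC voltage at the midpoint of a pull-up/pull-down divider."""
    return v_supply * r_pulldown_ohm / (r_pullup_ohm + r_pulldown_ohm)

R_PULLUP_OHM = 470.0    # from the post
R_PULLDOWN_OHM = 56.0   # from the post
V_SUPPLY = 3.3          # assumption: pull-ups tied to a 3.3 V rail

common_mode_v = divider_voltage(V_SUPPLY, R_PULLUP_OHM, R_PULLDOWN_OHM)
thevenin_ohm = parallel(R_PULLUP_OHM, R_PULLDOWN_OHM)

print(f"DC bias per clock line:  {common_mode_v:.3f} V")
print(f"Equivalent termination:  {thevenin_ohm:.1f} Ohm")
```

Under those assumptions each line is biased to about 0.35 V and sees roughly 50 Ohm to AC ground, which is consistent with the resistors setting the common-mode level that the AC-coupling capacitors alone would leave undefined.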

 

Best regards,

Alexander.

CarlosAM_INTEL
Moderator

​Hello, @AMo00​:

 

Thanks for sharing this useful information.

 

Best regards,

@Mæcenas_INTEL​. 
