
I350 and I210-IT: "Detected Tx Unit Hang", more common as bandwidth and the number of Ethernet ports in use go up

JohnHentges
Beginner
709 Views

A partner makes SFF PCs.

Our mutual customer needs 4× 1000Base-T Ethernet ports running at max bandwidth.

Our partner specified a 4-port Intel I350 LAN card to satisfy the customer requirement and asked us to validate bandwidth.

Using our partner's SFF PC running the customer's required CentOS7 (fully patched):

We connected three Atom computers running Ubuntu and a test PC running Linux Mint to the four network interfaces. Individually, the connections are mostly stable: each will run at 50 or 500 Mbit/s for several minutes without a problem. At full speed, a connection stalls for ten to fifteen seconds and then recovers. During the stall the log shows "Detected Tx Unit Hang" messages, eventually followed by a reset of the interface.

These stalls and resets occur quickly when all four interfaces run at the same time, even at only 50 or 100 Mbit/s.

Example test command: ./run-iperf-clients.sh -b 50M -i 5 -t 60
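The contents of run-iperf-clients.sh are not shown in this thread. As a rough sketch only, assuming the script passes -b, -i, and -t through to one iperf client per port under test, the equivalent command lines could be built like this. The server addresses are hypothetical, and how -b behaves for TCP depends on the iperf version in use.

```python
import shlex
import subprocess


def build_iperf_commands(servers, bandwidth="50M", interval=5, duration=60):
    """Return one iperf client argv list per server address."""
    return [
        ["iperf", "-c", server,
         "-b", bandwidth, "-i", str(interval), "-t", str(duration)]
        for server in servers
    ]


def run_all(commands):
    """Start every client at once, then wait for all of them to finish."""
    procs = [subprocess.Popen(argv) for argv in commands]
    return [proc.wait() for proc in procs]


def describe(commands):
    """Render the commands as shell-quoted strings for logging."""
    return [shlex.join(argv) for argv in commands]
```

Calling run_all(build_iperf_commands([...])) with one address per port loads all four links at the same time, which is the condition under which the hangs appear fastest.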

This is what I can find for the driver right now.
configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=3.2.6-k duplex=full firmware=0.8-4 latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s

We make PCI Express Mini Card Ethernet cards using the Intel I210-IT chip. We connected four of these via a 104-Express carrier board to the SFF PC in lieu of the 4-port Intel I350 LAN card to see if our cards would operate at anything closer to max speed successfully.

Same problems: hangs and slow performance, both getting worse as more cards are in use.

We also found AER errors filling up dmesg, identical to many reports online, such as this one: https://askubuntu.com/questions/771899/pcie-bus-error-severity-corrected

We also booted from a Mint live USB stick and saw the poor performance, hangs, and dmesg AER errors appear even faster than on CentOS7.
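To compare runs (CentOS7 versus the live USB, one port versus four), it may help to count how often each symptom shows up in a saved kernel log. The sketch below simply counts lines containing two marker strings. "Detected Tx Unit Hang" is the message quoted above. "severity=Corrected" is an assumption based on the usual wording of corrected AER reports, as in the linked Ask Ubuntu question, and may need adjusting to match the actual log.

```python
# Marker substrings to look for in kernel log lines.
# The AER marker is an assumption; adjust it to match the real dmesg output.
MARKERS = {
    "tx_hang": "Detected Tx Unit Hang",
    "aer_corrected": "severity=Corrected",
}


def count_markers(log_lines):
    """Count how many log lines contain each marker substring."""
    counts = {name: 0 for name in MARKERS}
    for line in log_lines:
        for name, needle in MARKERS.items():
            if needle in line:
                counts[name] += 1
    return counts


def count_markers_in_file(path):
    """Convenience wrapper for a log saved with e.g. `dmesg > run1.log`."""
    with open(path, errors="replace") as handle:
        return count_markers(handle)
```

Running count_markers_in_file on a log saved after each test gives a simple per-run tally that can be set against the bandwidth and port count used.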

Booting with pci=nomsi didn't help.

Booting with pci=noaer just hides the problem, which is not acceptable in a design intended for enterprise-scale production deployment, so it is not a fix. The "Detected Tx Unit Hang" messages and the eventual interface resets would undoubtedly still occur, killing our bandwidth.
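When testing boot options like these, it is worth confirming that the running kernel actually received them. The kernel documents these options in lowercase (pci=nomsi, pci=noaer) and accepts several of them comma-separated after a single pci=. A minimal sketch, assuming the command line is read from /proc/cmdline:

```python
def pci_options(cmdline):
    """Return the set of options given in every pci= argument."""
    options = set()
    for token in cmdline.split():
        # Case-sensitive match on purpose: "PCI=NOMSI" is not treated as pci=nomsi.
        if token.startswith("pci="):
            options.update(opt for opt in token[len("pci="):].split(",") if opt)
    return options


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

For example, "nomsi" in pci_options(read_cmdline()) is true only if the option reached the kernel in the documented lowercase form.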

Open to suggestions.

If an engineer could contact me privately, I can provide many log files, part numbers, model numbers, company names, and the quantities at risk from this problem.

Thank you

CarlosAM_INTEL
Moderator
693 Views

Hello, @JohnHentges:

Thank you for contacting Intel Embedded Community.

We want to address the following questions to understand this situation:

Could you please let us know if the affected designs are third-party ones or yours?

Could you please inform us of the part numbers and where we can find all the information if they are third-party devices?

Could you please list the sources that you have used to develop the designs associated with this situation and if they were verified by Intel?

Could you please list the part numbers of the Intel processors associated with this situation?

We are waiting for your answer to these questions.

Best regards,

@CarlosAM_INTEL.

JohnHentges
Beginner
672 Views

The PCI Express Mini Card "mPCIe-LAN-Gb" cards are our designs using the Intel I210 chip.

The computer involved is our partner's using the Intel® E3800 Series SoC Processors; DC/Quad

The 4-port I350-equipped LAN card is a 3rd party's.

I can provide all requested / desired information on any private channel.

JohnHentges
Beginner
665 Views

I've received permission from our partner to provide the 4xLAN card page: https://www.adl-usa.com/product/adllan41000e/

and regarding the CPU they state:

The CPU does not have a landing page, but is an i3-6102E COM-express carrier with a custom PCIe/104 carrier that the 4xLAN card plugs into.  The chipset is the Intel Q170. 

and, ... The closest ADL CPU is the ADLQM87PC:

https://www.adl-usa.com/product/adlqm87pc/

The i210 mPCIe card is here: https://accesio.com/?p=/mpcie/mpcie-lan-gbe.html

and the 104 Express to 4x mPCIe slots is here: https://accesio.com/?p=/104e/104e-mpcie-4.html

 

CarlosAM_INTEL
Moderator
661 Views

Hello, @JohnHentges:

Thanks for your clarifications.

Based on your previous messages, could you please let us know the sources that you have used to develop your implementations related to this situation?

On the other hand, could you please confirm the name of the manufacturer of the third-party devices related to the reported condition?

We are waiting for your answer.

Best regards,

@CarlosAM_INTEL.

JohnHentges
Beginner
658 Views

Advanced Digital Logic (adl-usa.com) manufactures the PC involved, and the 4x Port i350-based LAN card.

ACCES I/O Products, Inc. (accesio.com) manufactures the mPCIe-LAN-Gb card using I210 chip, and the 104 Express 4x mPCIe carrier card.

I previously stated, incorrectly, that the I350 4-Port LAN card was 3rd-party; it is made by ADL.

I can provide ACCES schematics and layout files privately if useful.  I may be able to provide the same from ADL if useful, also privately.

 

CarlosAM_INTEL
Moderator
654 Views

Hello, @JohnHentges:

Thanks for your reply.

We suggest sending the affected layout and schematics to be verified by Intel following the procedure stated on the following website:

https://edc.intel.com/Tools/Design-Review/Default.aspx

Best regards,

@CarlosAM_INTEL.

JohnHentges
Beginner
646 Views

All of the designs are mature.

An additional piece of information: everything works great, full bandwidth no reported issues, when we run the iperf and other tests from a Windows installation on the same system/hardware.

We strongly suspect an issue in the Linux driver; whether the issue is in the Intel I210/I350 driver (igb) or in the Linux PCIe bus driver remains to be seen.

Any advice you can provide would be appreciated.

 

CarlosAM_INTEL
Moderator
643 Views

Hello, @JohnHentges:

Thanks for your reply.

Based on your last communication, could you please clarify where you obtained the mentioned driver? Also, could you please let us know the Linux version and flavor related to this situation?

We are waiting for your answer to these questions.

Best regards,

@CarlosAM_INTEL

JohnHentges
Beginner
640 Views

We've tested Mint 20.04, CentOS7, CentOS8, and Windows 10 x64.

The driver tested (igb) was the one provided with each of the Linux distributions in question. On CentOS7 we also tried installing the driver from the Intel driver download portal, only to discover that the same version was already installed.
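The configuration line quoted in the first post reports driver=e1000e, while this post names igb, so it may be worth recording which driver and version are bound to each port under test. A minimal sketch, assuming the ethtool utility is installed and that `ethtool -i <interface>` prints its usual "key: value" lines. The interface name passed in is whatever the system under test uses.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse 'key: value' lines, as printed by `ethtool -i`, into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_info(interface):
    """Run `ethtool -i` for one interface and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)
```

The "driver" and "version" fields of the returned dict can then be logged next to each test result.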

CarlosAM_INTEL
Moderator
636 Views

Hello, @JohnHentges:

Thanks for your update.

Based on the provided information, could you please let us know the driver versions related to the reported situation and where you downloaded the one from Intel?

We are waiting for your reply.

Best regards,

@CarlosAM_INTEL.

JohnHentges
Beginner
632 Views

Version 5.5.2. I went to the Intel Design and Resource Center and found an I350-specific Linux driver.

CarlosAM_INTEL
Moderator
629 Views

Hello, @JohnHentges:

Thanks for your update.

The readme file of the cited driver states that questions about the driver should be sent to the email address e1000-devel@lists.sf.net. You can find the readme on the following website:

https://downloadmirror.intel.com/15817/eng/readme.txt

Best regards,

@CarlosAM_INTEL.
