
XL710 NVM v5.0.5 - TX driver issue detected, PF reset issued & I40E_ERR_ADMIN_QUEUE_ERROR

DHeki

I've been seeing "TX driver issue detected, PF reset issued" and "fail to add cloud filter" errors for quite some time across 12 VMware ESXi v6.0 hosts. About once a week the result is a purple screen of death (PSOD).

I recently upgraded the XL710 to NVM firmware v5.0.5 and the VMware ESXi XL710 driver to the latest v2.0.6 on 4 of the 12 hosts, and the issues persist.

# ethtool -i vmnic2

driver: i40e

version: 2.0.6

firmware-version: 5.05 0x800028a6 1.1568.0

bus-info: 0000:03:00.0
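When comparing the upgraded and non-upgraded hosts, the fields above can be condensed to one line per NIC. This is only a minimal sketch, assuming a POSIX shell and awk are available in the ESXi shell; `summarize_nic_info` is a hypothetical helper name, not part of ethtool or the driver.

```shell
# Condense `ethtool -i` style output (read from stdin) to one line:
#   <driver> <version> fw=<firmware-version>
# Field names are taken from the output quoted above.
summarize_nic_info() {
    awk -F': ' '
        $1 == "driver"           { drv = $2 }
        $1 == "version"          { ver = $2 }
        $1 == "firmware-version" { fw  = $2 }
        END { print drv, ver, "fw=" fw }
    '
}
```

Possible usage on a host: `ethtool -i vmnic2 | summarize_nic_info`, repeated for each vmnic backed by the XL710.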

Q. In trying to identify the culprit of the "fail to add cloud filter" errors, how do I identify the VM by filter_id?

Q. What is causing the "TX driver issue detected, PF reset issued"?

Q. How can I further troubleshoot to resolve the issue?

Here is just a snippet of /var/log/vmkernel.log. The logs are filled with the same repeating error messages:

Note the frequency of the error messages (~50 per minute!)

2017-05-26T16:01:04.659Z cpu26:32886)<3>i40e 0000:05:00.1: fail to add cloud filter, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EEXIST

2017-05-26T16:01:04.659Z cpu26:32886)<6>i40e 0000:05:00.1: Failed to add cloud filter, err_code = -53, last status = 13, filter_id = 33312, queue = 2

2017-05-26T16:01:04.660Z cpu26:32886)<3>i40e 0000:05:00.1: fail to add cloud filter, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EEXIST

2017-05-26T16:01:04.660Z cpu26:32886)<6>i40e 0000:05:00.1: Failed to add cloud filter, err_code = -53, last status = 13, filter_id = 33312, queue = 2

2017-05-26T16:01:05.347Z cpu11:33354)<6>i40e 0000:05:00.2: TX driver issue detected, PF reset issued

2017-05-26T16:01:05.538Z cpu38:33367)<6>i40e 0000:05:00.2: i40e_open: Registering netqueue ops

2017-05-26T16:01:05.547Z cpu38:33367)IntrCookie: 1915: cookie 0x38 moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.556Z cpu38:33367)IntrCookie: 1915: cookie 0x39 moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.566Z cpu38:33367)IntrCookie: 1915: cookie 0x3a moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.575Z cpu38:33367)IntrCookie: 1915: cookie 0x3b moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.585Z cpu38:33367)IntrCookie: 1915: cookie 0x3c moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.594Z cpu38:33367)IntrCookie: 1915: cookie 0x3d moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.604Z cpu38:33367)IntrCookie: 1915: cookie 0x3e moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.613Z cpu38:33367)IntrCookie: 1915: cookie 0x3f moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 1 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 2 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 3 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 4 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 5 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 6 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 7 not allocated

2017-05-26T16:01:05.660Z cpu26:32886)<6>i40e 0000:05:00.2: Netqueue features supported: QueuePair Latency Dynamic Pre-Emptible

2017-05-26T16:01:05.660Z cpu26:32886)<6>i40e 0000:05:00.2: Supporting next generation VLANMACADDR filter

2017-05-26T16:01:09.659Z cpu21:32886)<3>i40e 0000:05:00.1: fail to add cloud filter, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EEXIST

2017-05-26T16:01:09.659Z cpu21:32886)<6>i40e 0000:05:00.1: Failed to add cloud filter, err_code = -53, last status = 13, filter_id = 33056, queue = 1

2017-05-26T16:01:09.660Z cpu21:32886)<3>i40e 0000:05:00.1: fail to add cloud filter, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EEXIST

2017-05-26T16:01:09.660Z cpu21:32886)<6>i40e 0000:05:00.1: Failed to add cloud filter, err_code = -53, last status = 13, filter_id = 33568, queue = 3

2017-05-26T16:01:14.659Z cpu21:32886)<3>i40e 0000:05:00.1: fail to add cloud filter, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EEXIST

2017-05-26T16:01:14.659Z cpu21:32886)<6>i40e 0000:05:00.1: Failed to add cloud filter, err_code = -53, last status = 13, filter_id = 33312, queue = 2

2017-05-26T16:01:14.660Z cpu21:32886)<3>i40e 0000:0...
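The repetition in the excerpt above can be summarized rather than read line by line. The following is a rough sketch, assuming the log lines keep exactly the layout quoted above (timestamp, CPU/world, PCI function, message); both function names are hypothetical. It does not map a filter_id to a VM, it only shows which filter_id/queue pairs fail most often and how many cloud filter messages are logged per minute.

```shell
# Count "Failed to add cloud filter" messages per PCI function, filter_id and queue.
# Reads vmkernel log text on stdin; prints "<count> <pci> filter_id=<id> queue=<n>",
# most frequent first.
tally_cloud_filter_failures() {
    awk '
        /Failed to add cloud filter/ {
            dev = $3; fid = ""; q = ""            # $3 is e.g. "0000:05:00.1:"
            for (i = 1; i <= NF; i++) {
                if ($i == "filter_id") fid = $(i + 2)   # skip the "=" token
                if ($i == "queue")     q   = $(i + 2)
            }
            sub(/:$/, "", dev); sub(/,$/, "", fid)
            count[dev " filter_id=" fid " queue=" q]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}

# Count log lines mentioning a cloud filter, grouped by minute.
# The first 16 characters of the timestamp are YYYY-MM-DDTHH:MM.
cloud_filter_lines_per_minute() {
    awk '/cloud filter/ { n[substr($1, 1, 16)]++ }
         END { for (m in n) print m, n[m] }' | sort
}
```

Possible usage: `tally_cloud_filter_failures < /var/log/vmkernel.log` and `cloud_filter_lines_per_minute < /var/log/vmkernel.log`.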

DHeki1

Woke up at 3:30AM with a PSOD. XL710 NVM Firmware v5.0.5 and VMware ESXi driver v2.0.6.

Guess I'll try to open a support ticket with Intel Support and see if that gets me anywhere.

DHeki1

For others who are experiencing this issue, support suggested the following:

Following up on your case, we'd like you to try disabling TSO and LRO, as we have seen this help other users with driver errors reported by the system. Here are the steps:

To disable TSO:

1. Run this command to determine if the hardware TSO is enabled on the host:

esxcli system settings advanced list -o /Net/UseHwTSO

2. Run this command to disable TSO at the host level:

esxcli system settings advanced set -o /Net/UseHwTSO -i 0

3. Run this command to disable TSO6 at the host level:

esxcli system settings advanced set -o /Net/UseHwTSO6 -i 0

To disable LRO:

1. Run this command to determine if LRO is enabled for the VMkernel adapters on the host:

esxcli system settings advanced list -o /Net/TcpipDefLROEnabled

2. Run this command to disable LRO for all VMkernel adapters on a host:

esxcli system settings advanced set -o /Net/TcpipDefLROEnabled -i 0

Note: The preceding command takes effect only after a reboot.

For additional information, please refer to the following VMware document:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2055140

DHeki1

Disabling TSO & LRO did not resolve the issue.

Steps taken:

1. Put host in Maintenance Mode

2. I ran the following commands:

esxcli system settings advanced set -o /Net/UseHwTSO -i 0

esxcli system settings advanced set -o /Net/UseHwTSO6 -i 0

esxcli system settings advanced set -o /Net/TcpipDefLROEnabled -i 0

3. Rebooted

4. Checked the settings and verified that TSO/LRO are disabled:

esxcli system settings advanced list -o /Net/UseHwTSO

Path: /Net/UseHwTSO

Type: integer

Int Value: 0

Default Int Value: 1

Min Value: 0

Max Value: 1

String Value:

Default String Value:

Valid Characters:

Description: When non-zero, use pNIC HW TSO offload if available

esxcli system settings advanced list -o /Net/TcpipDefLROEnabled

Path: /Net/TcpipDefLROEnabled

Type: integer

Int Value: 0

Default Int Value: 1

Min Value: 0

Max Value: 1

String Value:

Default String Value:

Valid Characters:

Description: LRO enabled for TCP/IP

5. Exited Maintenance Mode

6. vMotioned a CentOS 7 VM onto the host and continued to see the error messages.
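For anyone repeating the check in step 4 on several hosts, the verification could be scripted roughly as follows. This is a sketch only, assuming the `esxcli system settings advanced list` output has the same field labels as shown above; `int_value` and `report_offload_settings` are hypothetical helper names.

```shell
# Extract the current "Int Value" from `esxcli system settings advanced list`
# output on stdin. Matching on the first two words skips "Default Int Value".
int_value() {
    awk '$1 == "Int" && $2 == "Value:" { print $3 }'
}

# Print the current value of the three options changed in step 2.
# A value of 0 means the corresponding offload is disabled.
report_offload_settings() {
    for opt in /Net/UseHwTSO /Net/UseHwTSO6 /Net/TcpipDefLROEnabled; do
        val=$(esxcli system settings advanced list -o "$opt" | int_value)
        echo "$opt = $val"
    done
}
```

Running `report_offload_settings` after the reboot should print one line per option.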

CSmit29

Malicious Driver Detection (MDD) Event - Resolved - New 1.7.11 i40en driver

How is this issue being addressed?

The Malicious Driver Detection issue that we are aware of is addressed in the 1.7.11 i40en driver release for ESXi 6.0, ESXi 6.5 and ESXi 6.7.

1.7.11 i40en driver download links for Intel Ethernet 700 Series Network Adapters (X710, XL710, XXV710, and X722):

ESXi 6.0 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details%3FdownloadGroup%3DDT-ESXI60-INTEL-I40EN-1711%26productId%3D564

ESXi 6.5 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details%3FdownloadGroup%3DDT-ESXI65-INTEL-I40EN-1711%26productId%3D614

ESXi 6.7 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details%3FdownloadGroup%3DDT-ESXI67-INTEL-I40EN-1711%26productId%3D742

For more information, please see the blog post "Malicious Driver Detection (MDD) Event – Resolved": /community/tech/wired/blog/2018/05/23/malicious-driver-detection-mdd-event-resolved
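To see whether a host already runs an i40en driver at or above 1.7.11, something like the sketch below may help. It rests on assumptions: that the driver VIB is listed under the name `i40en`, and that `esxcli software vib list` prints the name in the first column and the version in the second. All three function names are hypothetical. Hosts still using the older i40e driver (as in the original post) would report no i40en VIB at all.

```shell
# Succeeds when dotted version $1 is greater than or equal to $2.
# Components are compared numerically, so 1.10.0 ranks above 1.7.11.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = x[i] + 0; yi = y[i] + 0      # missing components count as 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Print the version column for VIB name $1 from `esxcli software vib list`
# output on stdin.
installed_vib_version() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Report whether the installed i40en VIB is at least 1.7.11.
check_i40en_driver() {
    v=$(esxcli software vib list | installed_vib_version i40en)
    if [ -z "$v" ]; then
        echo "i40en VIB not found"
        return 1
    fi
    # Strip any build suffix after the first "-" before comparing.
    if version_ge "${v%%-*}" 1.7.11; then
        echo "i40en $v is at or above 1.7.11"
    else
        echo "i40en $v is older than 1.7.11"
    fi
}
```

Calling `check_i40en_driver` on each host gives a quick before/after comparison when rolling out the driver.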
