
Intel X710-DA4 / VMware ESXi 6.5u1 - Malicious Driver Detection Event Occurred - can't even get a VM to boot

DHeki1
Novice

I can't even get a VM to boot when I use the i40en driver v1.3.1 under ESX v6.0u2. As soon as I power on a VM the system crashes with Malicious Driver Detection and all traffic stops.

I've had to fall back to using the i40e v2.0.6 driver (https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI60-INTEL-I40E-206&productId=491).

Just as you said, with v1.3.1 any decent amount of network traffic can trigger this issue which stops ALL network traffic and requires a reboot.

2017-08-11T23:59:52.735Z cpu38:33417)i40en: i40en_HandleMddEvent:6484: Malicious Driver Detection event 0x01 on TX queue 1 PF number 0x02 VF number 0x1e

2017-08-11T23:59:52.735Z cpu38:33417)i40en: i40en_HandleMddEvent:6510: TX driver issue detected, PF reset issued

2017-08-12T00:00:00.235Z cpu38:33417)i40en: i40en_HandleMddEvent:6484: Malicious Driver Detection event 0x02 on TX queue 0 PF number 0x02 VF number 0x00

2017-08-12T00:00:00.235Z cpu38:33417)i40en: i40en_HandleMddEvent:6510: TX driver issue detected, PF reset issued
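For anyone who wants to tally these events from a saved vmkernel log, here is a minimal Python sketch. The regular expression is modelled only on the two i40en MDD lines quoted above; the function name `parse_mdd_events` is made up for illustration, and the `RX` alternative is an assumption, since only TX events appear in this thread. Other driver versions may word the message differently.

```python
import re

# Pattern derived from the i40en lines quoted in this post; the RX branch is
# an assumption, as only TX events are shown here.
MDD_RE = re.compile(
    r"^(?P<ts>\S+) cpu\d+:\d+\)i40en: i40en_HandleMddEvent:\d+: "
    r"Malicious Driver Detection event (?P<event>0x[0-9a-fA-F]+) "
    r"on (?P<direction>TX|RX) queue (?P<queue>\d+) "
    r"PF number (?P<pf>0x[0-9a-fA-F]+) VF number (?P<vf>0x[0-9a-fA-F]+)"
)


def parse_mdd_events(lines):
    """Return one dict per MDD event line; non-matching lines are skipped."""
    events = []
    for line in lines:
        match = MDD_RE.match(line.strip())
        if match is None:
            continue
        events.append({
            "timestamp": match.group("ts"),
            "event": int(match.group("event"), 16),
            "direction": match.group("direction"),
            "queue": int(match.group("queue")),
            "pf": int(match.group("pf"), 16),
            "vf": int(match.group("vf"), 16),
        })
    return events


if __name__ == "__main__":
    sample = [
        "2017-08-11T23:59:52.735Z cpu38:33417)i40en: i40en_HandleMddEvent:6484: "
        "Malicious Driver Detection event 0x01 on TX queue 1 PF number 0x02 VF number 0x1e",
        "2017-08-11T23:59:52.735Z cpu38:33417)i40en: i40en_HandleMddEvent:6510: "
        "TX driver issue detected, PF reset issued",
    ]
    for event in parse_mdd_events(sample):
        print(event)
```

Grouping the result by `pf` and `queue` would show whether the events cluster on one port or one queue.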

With v2.0.6, traffic hiccups but resumes as soon as the driver resets (>1 sec), which usually doesn't cause an issue. This occurs about 100 times a day across my 8-node VMware cluster.

On some occasions, though, the "TX driver issue detected, PF reset issued" event recurs continuously and ends up causing an outage.

2017-05-26T16:01:05.347Z cpu11:33354)<6>i40e 0000:05:00.2: TX driver issue detected, PF reset issued

2017-05-26T16:01:05.538Z cpu38:33367)<6>i40e 0000:05:00.2: i40e_open: Registering netqueue ops

2017-05-26T16:01:05.547Z cpu38:33367)IntrCookie: 1915: cookie 0x38 moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.556Z cpu38:33367)IntrCookie: 1915: cookie 0x39 moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.566Z cpu38:33367)IntrCookie: 1915: cookie 0x3a moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.575Z cpu38:33367)IntrCookie: 1915: cookie 0x3b moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.585Z cpu38:33367)IntrCookie: 1915: cookie 0x3c moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.594Z cpu38:33367)IntrCookie: 1915: cookie 0x3d moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.604Z cpu38:33367)IntrCookie: 1915: cookie 0x3e moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.613Z cpu38:33367)IntrCookie: 1915: cookie 0x3f moduleID 4111 exclusive, flags 0x25

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 1 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 2 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 3 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 4 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 5 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 6 not allocated

2017-05-26T16:01:05.659Z cpu26:32886)<6>i40e 0000:05:00.2: Tx netqueue 7 not allocated

2017-05-26T16:01:05.660Z cpu26:32886)<6>i40e 0000:05:00.2: Netqueue features supported: QueuePair Latency Dynamic Pre-Emptible

2017-05-26T16:01:05.660Z cpu26:32886)<6>i40e 0000:05:00.2: Supporting next generation VLANMACADDR filter
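To tell the occasional reset apart from the continuous ones that lead to an outage, one option is to count how many resets land inside a short time window. The sketch below is only an illustration: it assumes every line starts with the ISO-style timestamp shown in the logs above, and the function names and the one-minute default window are arbitrary choices of mine, not anything defined by the driver.

```python
from datetime import datetime, timedelta

# Message text taken from the log lines quoted in this post.
RESET_MARKER = "TX driver issue detected, PF reset issued"


def reset_times(lines):
    """Extract the timestamp of every PF reset line."""
    times = []
    for line in lines:
        if RESET_MARKER in line:
            stamp = line.split()[0]  # first field, e.g. 2017-05-26T16:01:05.347Z
            times.append(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ"))
    return times


def max_resets_in_window(times, window=timedelta(minutes=1)):
    """Largest number of resets that fall within any single window."""
    ordered = sorted(times)
    best = 0
    start = 0
    for end, current in enumerate(ordered):
        # Slide the left edge until the span fits inside the window.
        while current - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


if __name__ == "__main__":
    sample = [
        "2017-08-11T23:59:52.735Z cpu38:33417)i40en: i40en_HandleMddEvent:6510: "
        "TX driver issue detected, PF reset issued",
        "2017-08-12T00:00:00.235Z cpu38:33417)i40en: i40en_HandleMddEvent:6510: "
        "TX driver issue detected, PF reset issued",
        "2017-05-26T16:01:05.347Z cpu11:33354)<6>i40e 0000:05:00.2: "
        "TX driver issue detected, PF reset issued",
    ]
    print(max_resets_in_window(reset_times(sample)))
```

A high count in a short window would correspond to the continuous case; what threshold counts as "too many" depends on the environment.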

Intel Support has not been helpful in resolving these issues. They suggested disabling TSO/LRO but that didn't make a noticeable difference.

Maybe one day Intel will take the VMware i40e/i40en driver issues seriously and attempt to fix them. I've been dealing with this for 2+ years with no end in sight.

1 Solution
CSmit29
New Contributor I

Malicious Driver Detection (MDD) Event - Resolved - New 1.7.11 i40en driver

How is this issue being addressed?

The Malicious Driver Detection issue that we are aware of is addressed in the 1.7.11 i40en driver release for ESXi 6.0, ESXi 6.5 and ESXi 6.7.

Available 1.7.11 i40en driver download links for Intel Ethernet 700 Series Network Adapters (X710, XL710, XXV710, and X722):

ESXi 6.0 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI60-INTEL-I40EN-1711&productId=564

ESXi 6.5 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI65-INTEL-I40EN-1711&productId=614

ESXi 6.7 i40en 1.7.11 Driver: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI67-INTEL-I40EN-1711&productId=742

For more information, please see the blog post "Malicious Driver Detection (MDD) Event – Resolved" (/community/tech/wired/blog/2018/05/23/malicious-driver-detection-mdd-event-resolved).
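A quick way to check whether a host is already on a driver at or above the fixed release is to compare version strings numerically. This is a rough Python sketch: the helper names are hypothetical, and it assumes that i40en releases newer than 1.7.11 keep the fix, which this post does not state. The legacy i40e driver uses a separate version line, so it is treated as not covered.

```python
def version_tuple(version):
    """Turn a dotted version such as '1.7.11' into (1, 7, 11)."""
    return tuple(int(part) for part in version.split("."))


def has_mdd_fix(driver, version, fixed="1.7.11"):
    """True if this is the i40en driver at or above the fixed release.

    Assumption: later i40en releases retain the fix.
    """
    return driver == "i40en" and version_tuple(version) >= version_tuple(fixed)


if __name__ == "__main__":
    for driver, version in [("i40en", "1.3.1"), ("i40en", "1.7.11"), ("i40e", "2.0.6")]:
        print(driver, version, has_mdd_fix(driver, version))
```

Comparing tuples rather than raw strings matters here, because as plain text "1.7.11" would sort before "1.7.9".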

7 Replies
idata
Employee

Hi DHekimian,

Thank you for posting at Wired Communities. Can you share the exact model of your network adapter? Is it the same X710-DA4? What is the firmware version of the network adapter? Malicious Driver Detection is a feature supported by the NIC; please refer to the information at http://www.intel.com/content/dam/www/public/us/en/documents/release-notes/xl710-ethernet-controller-feature-matrix.pdf. This feature monitors queues and VFs for malformed descriptors that might indicate a malicious or buggy driver.

 

 

Thanks,

sharon

DHeki1
Novice

I have both the Intel(R) Ethernet Converged Network Adapter XL710-Q1 & Intel(R) Ethernet Converged Network Adapter XL710-Q2.

If "This feature monitors queues and VFs for malformed descriptors that might indicate a malicious or buggy driver," then are you confirming that it's a malicious or buggy driver?

Segm:Bu:De.F Vend:Dvid Subv:Subd

0000:03:00.0 8086:1583 8086:0002

esxcli network nic get -n vmnic2

Advertised Auto Negotiation: false

Advertised Link Modes: 10000baseT/Full

Auto Negotiation: false

Cable Type: DA

Current Message Level: -1

Driver Info:

Bus Info: 0000:03:00:0

Driver: i40en

Firmware Version: 5.05 0x800028a6 1.1568.0

Version: 1.3.1

Link Detected: true

Link Status: Up

Name: vmnic2

PHYAddress: 0

Pause Autonegotiate: false

Pause RX: false

Pause TX: false

Supported Ports: DA

Supports Auto Negotiation: false

Supports Pause: true

Supports Wakeon: false

Transceiver:

Wakeon: None
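When collecting this output from several hosts, it can help to pull the driver and firmware fields out programmatically. The Python sketch below treats each line as `key: value` and splits on the first colon; `parse_nic_info` is a name I made up. It flattens the nesting under "Driver Info:", which is an assumption that happens to work for the output above because no key repeats.

```python
def parse_nic_info(text):
    """Parse 'esxcli network nic get' style output into a flat dict.

    Splits each line on its first colon, so values that contain colons
    (such as the bus address) stay intact. Nested sections are flattened.
    """
    info = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


if __name__ == "__main__":
    sample = """\
Driver Info:
   Bus Info: 0000:03:00:0
   Driver: i40en
   Firmware Version: 5.05 0x800028a6 1.1568.0
   Version: 1.3.1
Name: vmnic2
"""
    nic = parse_nic_info(sample)
    nvm_version = nic["Firmware Version"].split()[0]
    print(nic["Name"], nic["Driver"], nic["Version"], nvm_version)
```

The first token of the firmware string is what the NVM version list later in this thread refers to (5.05 here).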

CKalb
Novice

Thank you for your comment DHekimian. I'm currently trying to escalate with VMware support, who have been pointing me in various directions, just as long as it's not their own engineering team. We have used the 2.0.6 driver before with ESX 6.5, but that resulted in frequent PSODs. Even without the PSODs, we cannot really afford any traffic hiccups in our environment. Have you had this issue with a firmware version other than 5.05?

DHeki1
Novice

I've had PSODs and NIC PF reset issues with all the NVM firmware versions and drivers I've tried over the past 2 years.

NVM / i40e Driver Versions I've tried.

4.42 / 1.2.48

4.53 / 1.3.38 & 1.3.45

5.02 / 1.4.26

5.04 / 1.4.28

5.05 / 2.0.6

5.05 / 1.3.1 (i40en)

At first Intel Engineering said many of my issues were known and kept delaying me until NVM 5.02 / 1.4.26, which they expected would resolve them. That release at least made the cards somewhat stable, but the PSODs and NIC PF resets still happen too frequently (PSODs occur at least once a week across one of my 12 hosts).

Most of the people I'd been in contact with at Intel Engineering during this process no longer work there or respond to email. Lately Intel Support keeps sending me in a circle by telling me to contact my Hardware Vendor and/or VMware for support. I bought the cards via retail so there's no Hardware Vendor support and VMware has sent me to Intel.

idata
Employee

Hi DHekimian,

Thank you for sharing the information, and I am sorry to hear what happened. We will have to check further. By the way, I also sent a private message to you. Please reply from there.

Thanks,

sharon

idata
Employee

Hi DHekimian,

I sent a private message to you. Thanks.

Regards,

sharon
