
i40EN driver spewing Rx errors on VMware ESXi 7.0.2 hosts

Slesiak
Beginner

ProLiant DL380 Gen10

VMware ESXi 7.0.2

vmnic10  0000:b0:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:90  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic11  0000:b0:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:98  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic7   0000:37:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:16:7d:68  9000  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic8   0000:13:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:60  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic9   0000:13:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:68  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
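
For anyone who wants to check a NIC listing like the one above against the 25Gb precondition in a KB article, here is a minimal Python sketch. It assumes the columns are name, PCI address, driver, admin status, link status, speed in Mbps, duplex, MAC, MTU and description; the function name `parse_nic_rows` is made up for illustration.

```python
# Rows copied from the listing above. The column meanings are an assumption:
# name, PCI address, driver, admin status, link status, speed (Mbps),
# duplex, MAC address, MTU, description.
NIC_ROWS = """\
vmnic10  0000:b0:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:90  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic11  0000:b0:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:98  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic7   0000:37:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:16:7d:68  9000  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic8   0000:13:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:60  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic9   0000:13:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:68  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
"""


def parse_nic_rows(text):
    """Split each row into at most 10 fields; the description keeps its spaces."""
    nics = []
    for line in text.splitlines():
        fields = line.split(None, 9)
        if len(fields) < 10:
            continue  # skip blank or malformed rows
        name, _pci, driver, _admin, _link, speed, _duplex, _mac, mtu, _desc = fields
        nics.append({"name": name, "driver": driver,
                     "speed_mbps": int(speed), "mtu": int(mtu)})
    return nics


nics = parse_nic_rows(NIC_ROWS)
speeds = {nic["speed_mbps"] for nic in nics}
jumbo = [nic["name"] for nic in nics if nic["mtu"] > 1500]

print("link speeds (Mbps):", sorted(speeds))  # only 10000 here, no 25Gb ports
print("NICs with MTU above 1500:", jumbo)     # only vmnic7 uses a 9000 MTU
```

On this listing every port reports 10000 Mbps, and vmnic7 is the only one configured with a 9000 MTU.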

 

esxcli network nic get -n vmnic9
   Advertised Auto Negotiation: true
   Advertised Link Modes: Auto, 1000BaseSR/Full, 10000BaseSR/Full
   Auto Negotiation: true
   Cable Type: FIBRE
   Current Message Level: 0
   Driver Info:
         Bus Info: 0000:13:00:1
         Driver: i40en
         Firmware Version: 10.51.5
         Version: 1.10.9.0
   Link Detected: true
   Link Status: Up
   Name: vmnic9
   PHYAddress: 0
   Pause Autonegotiate: false
   Pause RX: false
   Pause TX: false
   Supported Ports: FIBRE
   Supports Auto Negotiation: true
   Supports Pause: true
   Supports Wakeon: false
   Transceiver:
   Virtual Address: 00:50:56:51:ab:53
   Wakeon: None

 

One of the hosts just suddenly went off the network and VMware is blaming it on these errors spewing out of the i40en driver:

vmkernel.log:2021-06-15T16:16:21.805Z cpu54:2097508)i40en: indrv_AllocMultiQueue:165: Failed to allocate Rx

 

I'm seeing hundreds, if not thousands, of these daily. Has anyone else seen them? I have 8 ProLiant servers, and all 8 of them are spewing the same error into the vmkernel log.
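
To put a number on "hundreds if not thousands daily", something like the following Python sketch could tally the error per day. It assumes the log lines look like the one quoted above; the helper name `count_rx_alloc_failures` and the second sample line are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the one quoted above: an ISO timestamp, then the
# i40en driver tag, then the "Failed to allocate Rx" message.
PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2})T[\d:.]+Z .*i40en: .*Failed to allocate Rx"
)


def count_rx_alloc_failures(lines):
    """Return a Counter mapping date (YYYY-MM-DD) to number of matching lines."""
    per_day = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            per_day[match.group(1)] += 1
    return per_day


sample = [
    # Line copied from the post.
    "vmkernel.log:2021-06-15T16:16:21.805Z cpu54:2097508)i40en: "
    "indrv_AllocMultiQueue:165: Failed to allocate Rx",
    # Hypothetical unrelated line, included only to show it is ignored.
    "2021-06-15T16:16:22.001Z cpu54:2097508)some other message",
]

for day, count in sorted(count_rx_alloc_failures(sample).items()):
    print(day, count)
```

To run it against a real log, pass an open file object instead of `sample`; the file location on your host is something to confirm locally.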

Slesiak
Beginner

HPE says that this is the fix:
https://kb.vmware.com/s/article/83243

However, all of our interfaces are 10Gb, not 25Gb as described in the article. We're also on 7.0.2, and the KB states the issue was "fixed" in 7.0U1.

netPagePoolLimitPerGB -v 15360 (from the article) versus the current default of 5120
netPagePoolLimitCap -v 1375920 (from the article) versus the current default of 1048576
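
To see how large the proposed change is relative to the defaults, here is a small Python sketch of the arithmetic. The numbers are the ones quoted above; the dictionary layout and the `growth_factor` helper are illustrative only and say nothing about how the settings are actually applied on a host.

```python
# Values as quoted in the post: current default versus the KB's suggestion.
SETTINGS = {
    "netPagePoolLimitPerGB": {"default": 5120, "kb": 15360},
    "netPagePoolLimitCap": {"default": 1048576, "kb": 1375920},
}


def growth_factor(default, proposed):
    """Ratio of the proposed value to the default value."""
    return proposed / default


for name, values in SETTINGS.items():
    factor = growth_factor(values["default"], values["kb"])
    print(f"{name}: {values['default']} -> {values['kb']} ({factor:.2f}x)")
```

The per-GB limit triples (3.00x), while the overall cap grows by roughly 1.31x.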

If I can schedule some time to do a test, I'll post the results.

Mike_Intel
Moderator

Hello Slesiak,


Thank you for posting in Intel Ethernet Communities. 


We understand that you have also found a possible fix for the issue. If you have questions, please let us know.

If we do not hear from you, we will follow up after 3 working days.


Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Slesiak
Beginner

Thanks for the response, Michael. We're still working through this; the prerequisites noted for the problem do not match what we have in our environment, so we're not sure the provided answer will fix the issue.

Mike_Intel
Moderator

Hello Slesiak,


Thank you for the update. While waiting for your reply, can you provide the following details for me to check the issue as well?


  1. Can you share the link of your latest driver?
  2. Are you using onboard/embedded X710?
  3. What is the brand and model of your system?
  4. Other troubleshooting steps that you tried so far?


If we do not hear from you, we will follow up after 3 working days.


Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Slesiak
Beginner
  1. Can you share the link of your latest driver? https://support.hpe.com/hpesc/public/swd/detail?swItemId=MTX_4d2addac81bf4876b47f54925d&#tab4 <-- latest OEM driver from HPE, it's the one we have installed.
  2. Are you using onboard/embedded X710? Yes, there's both.
  3. What is the brand and model of your system? HPE ProLiant DL380 Gen10
  4. Other troubleshooting steps that you tried so far? Unfortunately none. These errors were found while trying to track another issue with flapping MACs on Cisco CSRv devices. Even with the CSRs moved to another device and then shut off, these errors persist. Also I am hesitant to do too much because the devices are in production and I cannot take them offline to perform any work/testing without a solid cause. While the errors are there, the management team does not feel there is enough of an issue to warrant taking any of the ESXi hosts offline.
Mike_Intel
Moderator

Hello Slesiak,


Thank you for the quick reply. After checking the driver and the details that you provided, let me ask whether you have already raised this issue with HP. The network card is embedded on the board, and the system builder is the one who validates the OS.


If we do not hear from you, we will follow up after 3 working days.

Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Slesiak
Beginner

As mentioned earlier, we did bring this up with HPE and they are still looking at the issue. They did give us a response, but the symptoms and circumstances do not match.

https://kb.vmware.com/s/article/83243

All of our interfaces are 10Gb, not 25Gb as described in the article. We're also on 7.0.2, and the KB states the issue was "fixed" in 7.0U1.

netPagePoolLimitPerGB -v 15360 (from the article) versus the current default of 5120
netPagePoolLimitCap -v 1375920 (from the article) versus the current default of 1048576

Because our circumstances differ from those described in the article, our management is hesitant to make any changes.

Mike_Intel
Moderator

Hello Slesiak,


Thank you for the quick response. We do understand your situation; however, the network card is embedded in the HP system, so they have customized the network card. My suggestion is that HP also investigate this as a new or different issue.


If we do not hear from you, we will follow up after 3 working days.

Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Mike_Intel
Moderator

Hello Slesiak,


I hope you enjoyed your weekend. I am just checking if you are now talking to HP for further assistance regarding this issue.


If we do not hear from you, we will follow up after 3 working days.

Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Slesiak
Beginner

Hello,

 

We've made the changes as suggested in https://kb.vmware.com/s/article/83243.  We're going to monitor the ESXi host for a few days to make sure the issue doesn't come back. My only fear is that management did not want to move all of the services back to the host in case there was an issue during monitoring, so the host isn't necessarily under the same load as it was before.

I'll continue to monitor and let you know if they decide this was enough to allow the fix to go onto all of the systems.

Mike_Intel
Moderator

Hello Slesiak,


Thank you for the update. I hope everything will get better after trying the recommendations. By the way, since you are now talking to HP for further assistance, would you like us to keep this thread open?


If we do not hear from you, we will follow up after 3 working days.

Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Mike_Intel
Moderator

Hello Slesiak,


I hope this message finds you well. I am just checking if you are now talking with HP regarding the issue and I hope that the system is working fine now.


If we do not hear from you, we will follow up after 3 working days.

Thank you.


Best regards,

Michael L.

Intel® Customer Support Technician


Slesiak
Beginner

It looks like the response from HPE is working as intended.

Mike_Intel
Moderator

Hello Slesiak,


Thank you so much for the update. We are glad that the recommendation is working. Please continue to coordinate with HP. Since you are now working with HP on this, we will close this inquiry.


If you need further assistance again, please post a new question. 


Thank you and stay safe.


Best regards,

Michael L.

Intel® Customer Support Technician

