ProLiant DL380 Gen10
VMware ESXi 7.0.2
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
vmnic10  0000:b0:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:90  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic11  0000:b0:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:19:28:98  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic7   0000:37:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:16:7d:68  9000  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic8   0000:13:00.0  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:60  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic9   0000:13:00.1  i40en   Up            Up           10000  Full    d4:f5:ef:18:bd:68  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
esxcli network nic get -n vmnic9
Advertised Auto Negotiation: true
Advertised Link Modes: Auto, 1000BaseSR/Full, 10000BaseSR/Full
Auto Negotiation: true
Cable Type: FIBRE
Current Message Level: 0
Driver Info:
   Bus Info: 0000:13:00:1
   Driver: i40en
   Firmware Version: 10.51.5
   Version: 1.10.9.0
Link Detected: true
Link Status: Up
Name: vmnic9
PHYAddress: 0
Pause Autonegotiate: false
Pause RX: false
Pause TX: false
Supported Ports: FIBRE
Supports Auto Negotiation: true
Supports Pause: true
Supports Wakeon: false
Transceiver:
Virtual Address: 00:50:56:51:ab:53
Wakeon: None
One of the hosts just suddenly went off the network and VMware is blaming it on these errors spewing out of the i40en driver:
vmkernel.log:2021-06-15T16:16:21.805Z cpu54:2097508)i40en: indrv_AllocMultiQueue:165: Failed to allocate Rx
I'm seeing hundreds, if not thousands, of these daily. Has anyone else seen them? I have 8 ProLiant servers, and all 8 of them are spewing the same error into the vmkernel log.
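To put a number on "hundreds if not thousands daily", a rough way to tally the message per day is sketched below in Python. It is only an illustration: the regular expression is built from the single log line quoted above, the second and third sample lines are hypothetical, and the function name `count_errors_per_day` is made up for this example.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in the post; other i40en
# messages or log formats may need a different expression.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})T\S+\s.*"
    r"i40en: indrv_AllocMultiQueue:\d+: Failed to allocate Rx"
)


def count_errors_per_day(lines):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of matching lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("date")] += 1
    return counts


sample = [
    # Line quoted in the post.
    "vmkernel.log:2021-06-15T16:16:21.805Z cpu54:2097508)i40en: "
    "indrv_AllocMultiQueue:165: Failed to allocate Rx",
    # Hypothetical second occurrence on the same day.
    "2021-06-15T16:16:22.101Z cpu54:2097508)i40en: "
    "indrv_AllocMultiQueue:165: Failed to allocate Rx",
    # Hypothetical unrelated line that should be ignored.
    "2021-06-16T01:02:03.000Z cpu12:2097300)some unrelated message",
]

daily = count_errors_per_day(sample)
for day, n in sorted(daily.items()):
    print(day, n)
```

To use it on real data, pass an open file object for a copy of the host's vmkernel.log instead of `sample`. Comparing the per-day totals across all 8 hosts would show whether the rate is similar everywhere or concentrated on the host that dropped off the network.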
HPE is stating that this is the fix:
https://kb.vmware.com/s/article/83243
However, all of our interfaces are 10Gb, not 25Gb as stated in the article. We're also on 7.0.2, and the KB states the issue was "fixed" in 7.0U1. The values in question are:
netPagePoolLimitPerGB -v 15360 (from the article) versus the current default of 5120
netPagePoolLimitCap -v 1375920 (from the article) versus the current default of 1048576
If I can schedule some time to do a test, I'll post the results.
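For a sense of how large the proposed change is, the short Python sketch below compares the values from the article with the defaults quoted in this post. It is plain arithmetic on the four numbers above and does not read or change anything on a host; the helper name `scale_factor` is made up for this example.

```python
# (value from the KB article, current default), as quoted in the post above.
settings = {
    "netPagePoolLimitPerGB": (15360, 5120),
    "netPagePoolLimitCap": (1375920, 1048576),
}


def scale_factor(proposed, default):
    """Return how many times larger the proposed value is than the default."""
    return proposed / default


for name, (proposed, default) in settings.items():
    factor = scale_factor(proposed, default)
    print(f"{name}: {default} -> {proposed} ({factor:.2f}x)")
```

By this arithmetic the per-GB limit triples, while the cap rises by roughly 31 percent.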
Hello Slesiak,
Thank you for posting in Intel Ethernet Communities.
We understand that you have also found a possible fix for the issue. If you have questions, please let us know.
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
Thanks for the response, Michael. We're still working through this. The prerequisites noted for the problem do not match what we have in our environment, so we're not sure the provided answer will fix the issue.
Hello Slesiak,
Thank you for the update. While waiting for your reply, can you provide the following details for me to check the issue as well?
- Can you share the link to your latest driver?
- Are you using onboard/embedded X710?
- What is the brand and model of your system?
- What other troubleshooting steps have you tried so far?
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
- Can you share the link to your latest driver? https://support.hpe.com/hpesc/public/swd/detail?swItemId=MTX_4d2addac81bf4876b47f54925d&#tab4 (this is the latest OEM driver from HPE, and it's the one we have installed)
- Are you using onboard/embedded X710? Yes, there are both.
- What is the brand and model of your system? HPE ProLiant DL380 Gen10
- Other troubleshooting steps that you tried so far? Unfortunately, none. These errors were found while trying to track down another issue with flapping MACs on Cisco CSRv devices. Even with the CSRs moved to another device and then shut off, these errors persist. I am also hesitant to do too much, because the devices are in production and I cannot take them offline for work or testing without a solid cause. Although the errors are there, the management team does not feel there is enough of an issue to warrant taking any of the ESXi hosts offline.
Hello Slesiak,
Thank you for the quick reply. After checking the driver and update details that you provided, let me ask whether you have already raised this issue with HP. The network card is embedded on the board, and the system builder is the one who validates the OS.
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
As mentioned earlier, we did bring this up with HPE and they are still looking at the issue. They did give us a response, but the symptoms and circumstances do not match:
https://kb.vmware.com/s/article/83243
However, all of our interfaces are 10Gb, not 25Gb as stated in the article. We're also on 7.0.2, and the KB states the issue was "fixed" in 7.0U1. The values in question are:
netPagePoolLimitPerGB -v 15360 (from the article) versus the current default of 5120
netPagePoolLimitCap -v 1375920 (from the article) versus the current default of 1048576
Because our circumstances differ from those described in the article, our management is hesitant to make any changes.
Hello Slesiak,
Thank you for the quick response. We do understand your situation; however, the network card is embedded in the HP system, so HP has altered the network card. My suggestion is that they also investigate this as a new or different issue.
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
Hello Slesiak,
I hope you enjoyed your weekend. I am just checking if you are now talking to HP for further assistance regarding this issue.
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
Hello,
We've made the changes as suggested in https://kb.vmware.com/s/article/83243. We're going to monitor the ESXi host for a few days to make sure the issue doesn't come back. My only fear is that management did not want to move all of the services back to the host in case there was an issue during monitoring, so the host isn't necessarily under the same load as it was before.
I'll continue to monitor and let you know if they decide this was enough to allow the fix to go onto all of the systems.
Hello Slesiak,
Thank you for the update. I hope everything will get better after trying the recommendations. By the way, since you are now talking to HP for further assistance, would you like us to keep this thread open?
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
Hello Slesiak,
I hope this message finds you well. I am just checking if you are now talking with HP regarding the issue and I hope that the system is working fine now.
In case we do not hear from you, we will follow up after 3 working days.
Thank you.
Best regards,
Michael L.
Intel® Customer Support Technician
It looks like the fix recommended by HPE is working as intended.
Hello Slesiak,
Thank you so much for the update. We are glad that the recommendation is working. Please continue to coordinate with HP. Since you are now working with HP, we will close this inquiry.
If you need further assistance again, please post a new question.
Thank you and stay safe.
Best regards,
Michael L.
Intel® Customer Support Technician