Ethernet Products

Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop

as20
Beginner

Hello, when you were running into this issue, did the ESXi logs suddenly stop, or did they provide any information? I may have a similar setup, but my ESXi logs go dark immediately with no helpful information.

idata
Employee

Hi Browniecake,

Thank you for posting in Wired Communities. Can you share more information about your setup, such as the OS version, NIC model, driver version, and any other relevant details?

Thanks,

Sharon

as20
Beginner

Nothing in the logs indicates why my host keeps rebooting.

1x Dell R640 running ESXi 6.5, with 2x Gold 6138 CPUs and 768 GB RAM. The host seems to reboot during any View Horizon Suite linked-clone activity (vMotion, provisioning, cloning, rebooting).

**Firmware Inventory**

ESXi version 5969303

| Component | FW Version |
|---|---|
| Power Supply.Slot.1 | 00.24.7D |
| Power Supply.Slot.2 | 00.24.7D |
| Integrated Remote Access Controller | 3.00.00.00 |
| Lifecycle Controller | 3.00.00.00 |
| Dell 64 Bit uEFI Diagnostics, version 4301, 4301X09, 4301.10 | 4301X09 |
| Dell OS Driver Pack, 17.05.21, A00 | 17.05.21 |
| OS COLLECTOR, 3.0, A00 | 3.0 |
| iDRAC Service Module Installer, 3.0.1, A00 | 3.0.1 |
| System CPLD | 1.0.1 |
| Identity Module | 1.02 |
| Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:29:24:C2 | 18.0.16 |
| Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:29:24:C0 | 18.0.16 |
| Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:27:A7:C2 | 18.0.16 |
| Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:22 | 18.0.16 |
| Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:26 | 18.0.16 |
| Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:24 | 18.0.16 |
| Intel(R) Ethernet 10G 4P X710 SFP+ rNDC - 24:6E:96:76:30:20 | 18.0.16 |
| Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:27:A7:C0 | 18.0.16 |
| BIOS | 1.1.7 |
| Dell HBA330 Mini | 13.17.03.00 |

This is all the ESXi logs show, and the last entries are different for every reboot. The first thing the kernel logs during a boot appears to be "VMB: 112: mbMagic: 2badb002, mbInfo 0x101688", so I am pasting the last vmkernel log entries before each reboot.

1st reboot:

2017-09-29T09:07:47.463Z cpu25:67940 opID=cb5d515b)FDS: 586: Enabling IO coalescing on driver 'deltadisks' device '143a14-rer0970_2-checkpoint-digest-sesparse.vmdk'

VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

2nd reboot:

2017-09-28T20:17:06.483Z cpu14:79323)BC: 5028: Failed to flush 1 buffers of size 8192 each for object 'vmware.log' f530 28 3 59ca35aa 20549898 6e246686 60187696 2cc04984 652 0 0 0 0 0: No connection

2017-09-28T20:17:06.484Z cpu14:79323)WARNING: BC: 6285: failed to flush buffer cache pool 3

2017-09-28T20:17:06.484Z cpu14:79323)WARNING: UserFile: 1856: Error forcing buffered writes to disk: No connection

VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

3rd reboot:

2017-09-28T18:29:32.302Z cpu34:101276)Deactivating Daemon ESXShell.

2017-09-28T18:29:32.705Z cpu9:101276)Daemon ESXShell deactivated.

VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

4th reboot:

2017-09-28T10:50:46.059Z cpu24:105395)Swap: vm 105376: 5175: Finish swapping in migration swap file. (faulted 0 pages, pshared 0 pages). Success.

2017-09-28T10:50:46.209Z cpu78:68721)Config: 706: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)

2017-09-28T11:57:56.702Z cpu29:66139)BC: 3571: Pool 2: Blocking due to no free buffers. nDirty = 26 nWaiters = 1

VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

Happy to provide any additional info. Checked iDRAC but the only entry about the reboot says "System CPU Resetting." There are no cooling, thermal, or health warnings.
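
For anyone who wants to repeat the approach above (taking the last vmkernel entries before each boot marker), here is a minimal Python sketch. It assumes the "VMB: 112: mbMagic" line quoted above marks the start of every boot on your host, which may not hold for other builds. The function name `last_entries_before_boot` and the `context` parameter are made up for illustration, and the sample lines are copied from the 3rd and 1st reboots above.

```python
from collections import deque
from typing import Iterable, List

# Assumption: this line (quoted in the post) is the first thing the kernel
# logs on boot. Other ESXi builds may log a different marker.
BOOT_MARKER = "VMB: 112: mbMagic"


def last_entries_before_boot(lines: Iterable[str], context: int = 3) -> List[List[str]]:
    """For each boot marker, return up to `context` non-empty lines logged just before it."""
    results: List[List[str]] = []
    recent = deque(maxlen=context)  # keeps only the newest `context` lines
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        if BOOT_MARKER in line:
            if recent:  # a marker with nothing before it yields no entry
                results.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return results


# Sample input copied from the log excerpts in this post.
sample = [
    "2017-09-28T18:29:32.302Z cpu34:101276)Deactivating Daemon ESXShell.",
    "2017-09-28T18:29:32.705Z cpu9:101276)Daemon ESXShell deactivated.",
    "VMB: 112: mbMagic: 2badb002, mbInfo 0x101688",
    "2017-09-29T09:07:47.463Z cpu25:67940 opID=cb5d515b)FDS: 586: Enabling IO coalescing on driver 'deltadisks' device '143a14-rer0970_2-checkpoint-digest-sesparse.vmdk'",
    "VMB: 112: mbMagic: 2badb002, mbInfo 0x101688",
]

for number, tail in enumerate(last_entries_before_boot(sample, context=2), start=1):
    print(f"Last entries before boot {number}:")
    for entry in tail:
        print("  " + entry)
```

To run it against a real log, pass an open file handle instead of `sample`. On ESXi the vmkernel log is normally at /var/log/vmkernel.log, but check the path and any rotated copies on your own host.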

idata
Employee

Hi Browniecake,

Thank you for the information. Just to clarify, do you mean that the host reboots when the X710 is installed? Is there any chance you could remove the NIC to isolate the issue?

Thanks,

Sharon


as20
Beginner

This has been confirmed to be a power issue. The UPS was overloaded, which caused this host to reboot unexpectedly.
