i350-T4 NIC
Windows Server 2012 R2
All 4 i350 ports configured as a Windows LBFO team (switch independent / dynamic load balancing)
Converged networking: Hyper-V vSwitch bound to the LBFO team, with vNICs configured on the vSwitch for host OS operations (Management, Cluster/CSV, and Live Migration)
VLAN tagging in use on VMs and vNICs, except the vNIC used for management, which is 'native' (untagged)
VMQ enabled on all i350 ports
SR-IOV disabled on all i350 ports
Server 2012 R2 Hyper-V cluster
Fully patched with the update rollups and hotfixes currently available
Driver version 19.3 (latest from the Intel website)
In the above configuration, the destination server blue screens during live migration. I can sometimes get one live migration to work, but a second attempt to live migrate a different VM to the same destination host will cause the host to blue screen.
I can reproduce this issue very easily on any host in the cluster. They all show the same behaviour.
If I disable VMQ, the issue stops.
Also, we don't see this issue with the same hardware and the same configuration on Server 2012 (non-R2), though I note that the NIC driver is different there (e1r63x64.sys on 2012 as opposed to e1r64x64.sys on 2012 R2).
Crash dump analysis always shows the faulting driver as e1r64x64.sys:
BugCheck 1E, {ffffffffc0000005, fffff802be6a2550, ffffd000575b3b58, ffffd000575b3360}
*** ERROR: Module load completed but symbols could not be loaded for e1r64x64.sys
Probably caused by : e1r64x64.sys ( e1r64x64+280e7 )
Followup: MachineOwner
---------
18: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff802be6a2550, The address that the exception occurred at
Arg3: ffffd000575b3b58, Parameter 0 of the exception
Arg4: ffffd000575b3360, Parameter 1 of the exception
Debugging Details:
------------------
WRITE_ADDRESS: unable to get nt!MmNonPagedPoolStart
unable to get nt!MmSizeOfNonPagedPoolInBytes
ffffd000575b3360
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.
FAULTING_IP:
nt!ExQueryDepthSList+0
fffff802`be6a2550 8b01 mov eax,dword ptr [rcx]
EXCEPTION_PARAMETER1: ffffd000575b3b58
EXCEPTION_PARAMETER2: ffffd000575b3360
BUGCHECK_STR: 0x1E_c0000005
DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT
PROCESS_NAME: System
CURRENT_IRQL: 0
ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) amd64fre
EXCEPTION_RECORD: 0000000000000001 -- (.exr 0x1)
Cannot read Exception record @ 0000000000000001
TRAP_FRAME: ffffe800b6200000 -- (.trap 0xffffe800b6200000)
Unable to read trap frame at ffffe800`b6200000
LAST_CONTROL_TRANSFER: from fffff802be7efefb to fffff802be768ca0
STACK_TEXT:
ffffd000`575b2b38 fffff802`be7efefb : 00000000`0000001e ffffffff`c0000005 fffff802`be6a2550 ffffd000`575b3b58 : nt!KeBugCheckEx
ffffd000`575b2b40 fffff802`be779846 : 00000000`00000000 fffff800`35d0c991 ffffe800`b1172d02 ffffd000`575b2e29 : nt!KiFatalFilter+0x1f
ffffd000`575b2b80 fffff802`be757d56 : 00000000`00000000 fffff802`be6e19a6 ffffe000`516d3f90 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x696
ffffd000`575b2bc0 fffff802`be7701ed : 00000000`00000000 ffffd000`575b2d60 ffffd000`575b3b58 ffffd000`575b2d60 : nt!_C_specific_handler+0x86
ffffd000`575b2c30 fffff802`be6fd3a5 : 00000000`00000001 fffff802`be615000 ffffd000`575b3b00 fffff800`00000000 : nt!RtlpExecuteHandlerForException+0xd
ffffd000`575b2c60 fffff802`be6fc25f : ffffd000`575b3b58 ffffd000`575b3860 ffffd000`575b3b58 ffffe800`b12ee480 : nt!RtlDispatchException+0x1a5
ffffd000`575b3330 fffff802`be7748c2 : 00000000`00000001 fffffa80`1b6de000 ffffe800`b6200000 00000000`00000000 : nt!KiDispatchException+0x61f
ffffd000`575b3a20 fffff802`be772dfe : 00000000`00000011 00000000`00000002 00000000`00000001 fffff802`be8a929a : nt!KiExceptionDispatch+0xc2
ffffd000`575b3c00 fffff802`be6a2550 : fffff800`35d04875 ffffe800`b0f3c870 ffffd000`575b3e00 ffffe000`517cd000 : nt!KiGeneralProtectionFault+0xfe
ffffd000`575b3d98 fffff800`35d04875 : ffffe800`b0f3c870 ffffd000`575b3e00 ffffe000`517cd000 00000000`00000000 : nt!ExQueryDepthSList
ffffd000`575b3da0 fffff800`372520e7 : ffffe000`517ce540 ffffe000`517cd000 ffffe800`b1496c60 00000000`00000000 : NDIS!NdisFreeNetBufferList+0xb5
ffffd000`575b3e20 fffff800`372528a9 : ffffe000`517ce540 ffffe000`517cd000 00000000`00000001 00000000`00000000 : e1r64x64+0x280e7
ffffd000`575b3e50 fffff800`37252c00 : ffffe000`517ce540 00000000`00000001 00000000`00000000 ffffe000`517cd000 : e1r64x64+0x288a9
ffffd000`575b3e90 fffff800`37264a9d : ffffe000`517cd000 ffffe000`00000001 ffffe000`00000001 ffff0001`00000001 : e1r64x64+0x28c00
ffffd000`575b3ec0 fffff800`37261c7b : 00000000`00000000 ffffd000`575469a0 ffffe000`517cd000 00000000`00000000 : e1r64x64+0x3aa9d
ffffd000`575b3f00 fffff800`3725a909 : 00000000`00000002 00000000`00000000 ffffe000`517cd000 ffffd000`575469a0 : e1r64x64+0x37c7b
ffffd000`575b3f50 fffff800`3725b02b : ffffe800`b528cde0 fffff800`35d04671 ffffd000`575b40f0 ffffe000`51105ad0 : e1r64x64+0x30909
ffffd000`575b3fc0 fffff800`35d8f0fa : ffffe800`b5b87868 ffffe800`b5b87858 ffffe800`b5b87854 ffffe800`b0d501a0 : e1r64x64+0x3102b
ffffd000`575b4030 fffff800`35d033a3 : ffffe800`b0d501a0 ffffd000`575b40e9 ffffe800`b5b87820 00000000`00000011 : NDIS!ndisMInvokeOidRequest+0x4e
ffffd000`575b4070 fffff800`35d04324 : 00000000`00000000 ffffe800`b0d501a0 ffffe800`b5b87868 00000000`00000000 : NDIS!ndisMDoOidRequest+0x39b
ffffd000`575b4150 fffff800`35d0475e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : NDIS!ndisQueueOidRequest+0x4c4
ffffd000`575b42f0 fffff800`3679719e : ffffe800`b147b8c0 00000000`00010224 ffffe800`b147b8c0 ffffe000`52bf4010 : NDIS!NdisFOidRequest+0xc2
ffffd000`575b43b0 fffff800`35d038de : ffffe800`b5b87820 ffffe000`51105ad0 00000000`00000000 ffffe000`52bea010 : wfplwfs!LwfLowerOidRequest+0x6e
ffffd000`575b43e0 fffff802`be6e19a6 : ffffd000`575b46d0 ffffd000`575af000 00000000`00000000 00000000`00000000 : NDIS!ndisFDoOidRequestInternal+0x2ee
ffffd000`57...
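For anyone comparing several dumps like the one above, here is a minimal Python sketch of the triage done by hand in this post: pull the code and arguments out of the BugCheck line, reduce Arg1 to its 32-bit NTSTATUS, and walk STACK_TEXT for the first frame outside the OS modules. The function names and the OS_MODULES set are hypothetical choices for illustration, not part of WinDbg or any Intel tool; the sample lines are copied from the output above.

```python
import re

# Matches e.g. "BugCheck 1E, {arg1, arg2, arg3, arg4}" as printed by the debugger.
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

# Assumption: modules treated as "OS" for this dump; adjust for other stacks.
OS_MODULES = {"nt", "NDIS", "wfplwfs"}

def parse_bugcheck(line):
    """Return (bugcheck_code, [arg1..arg4]) as integers."""
    m = BUGCHECK_RE.search(line)
    if m is None:
        raise ValueError("not a BugCheck line")
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args

def ntstatus(arg):
    """Arg1 is printed sign-extended to 64 bits; keep the low 32 bits."""
    return arg & 0xFFFFFFFF

def frame_module(line):
    """Module name of one STACK_TEXT line (text before '!' or '+')."""
    symbol = line.rsplit(" : ", 1)[-1].strip()
    return re.split(r"[!+]", symbol, maxsplit=1)[0]

def first_third_party(frames):
    """First module on the stack (top down) that is not in OS_MODULES."""
    for line in frames:
        module = frame_module(line)
        if module not in OS_MODULES:
            return module
    return None

SAMPLE_BUGCHECK = ("BugCheck 1E, {ffffffffc0000005, fffff802be6a2550, "
                   "ffffd000575b3b58, ffffd000575b3360}")

SAMPLE_FRAMES = [
    "ffffd000`575b3d98 fffff800`35d04875 : ffffe800`b0f3c870 ffffd000`575b3e00 "
    "ffffe000`517cd000 00000000`00000000 : nt!ExQueryDepthSList",
    "ffffd000`575b3da0 fffff800`372520e7 : ffffe000`517ce540 ffffe000`517cd000 "
    "ffffe800`b1496c60 00000000`00000000 : NDIS!NdisFreeNetBufferList+0xb5",
    "ffffd000`575b3e20 fffff800`372528a9 : ffffe000`517ce540 ffffe000`517cd000 "
    "00000000`00000001 00000000`00000000 : e1r64x64+0x280e7",
]

if __name__ == "__main__":
    code, args = parse_bugcheck(SAMPLE_BUGCHECK)
    print(hex(code), hex(ntstatus(args[0])), first_third_party(SAMPLE_FRAMES))
```

On the three sample frames this lands on e1r64x64, the same module the debugger names in its "Probably caused by" line. It only reads text the debugger has already printed, so it is no substitute for !analyze -v.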
Thanks for posting to our blog site.
I appreciate your frustration. I asked our virtualization guru, and he indicated that this issue has been fixed. The fix will be available in the next release of the drivers, which I believe is scheduled for Q4 of this year.
Thanks,
Patrick
Hi,
Thanks for the information.
Could I ask:
Is it possible for you to post any technical detail about the issue, specifically the cause and why the driver is faulting? (A private message is fine. I don't intend to republish this information; it's just for my own knowledge.)
Is it possible for me to get hold of a release candidate prior to the official release? This issue is serious, and it's preventing us from going into production on this cluster and on another one we are about to build as part of our transition to Server 2012 R2.
I would be happy to sign an NDA or do anything else you might need in that regard. I'm also happy to feed back my testing results for your use.
Failing that, can you be more specific about the release date? We're in Q4 now, so this could mean any time between now and January. That's a very wide time window indeed.
Many thanks for your help.
Unfortunately, since this is not an open source OS, the details of issues are not available to the public, as they are for our open source drivers.
The next release is going to happen in the very near future. They are doing the absolute final regression testing as I type this. I can't give an exact date, but if I were a betting man (which I'm not), I'd guess in the next week or two.
Hope that helps a bit.
- Patrick
Can I double-check which driver version this fix went into?
Thanks.
Hi. This is serving as a bump. I am currently talking to Microsoft support and the kernel debugging team, but I would really like to know whether this fix has made it into a driver version and, if so, which was the first one. Thanks.
Dear mrrorschach,
Thanks for writing back. I will further check on this.
Sincerely,
Sandy
Dear mrrorschach,
This issue is fixed in version 20.0 of the network card's driver.
You can download it from the Intel® Download Center: https://downloadcenter.intel.com/product/59062/Intel-Ethernet-Server-Adapter-I350-T2
Have a great day!
Sincerely,
Sandy