sone0
New Contributor I
698 Views

Host system hangs and becomes unresponsive when guest VM tries to initialize SR-IOV driver with 82599ES NIC


I am running Arch Linux with kernel 5.0.5 and the in-box Intel ixgbe kernel driver, version 5.1.0.

 

SR-IOV is enabled in the BIOS, amd_iommu is enabled in the kernel, and max_vfs is set to 1 (one VF per port, two in total). The ixgbevf driver is blacklisted, and the VFs are bound to the vfio-pci driver instead.
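A setup like this can be read back from sysfs before a VM is started. The following is only a minimal sketch, not something taken from the original post: it assumes the standard Linux sysfs layout (`sriov_numvfs`, `virtfnN` links and a `driver` link under `/sys/bus/pci/devices`), and the function name `vf_driver_report` and its `sysfs_root` parameter are hypothetical.

```python
import os


def vf_driver_report(sysfs_root="/sys/bus/pci/devices"):
    """Map each SR-IOV physical function to its VFs and their bound drivers.

    Returns {pf_address: {vf_address: driver_name_or_None}} for every PCI
    device under sysfs_root that currently has at least one VF enabled.
    """
    report = {}
    if not os.path.isdir(sysfs_root):
        return report  # no PCI sysfs tree on this system
    for pf in sorted(os.listdir(sysfs_root)):
        pf_dir = os.path.join(sysfs_root, pf)
        numvfs_path = os.path.join(pf_dir, "sriov_numvfs")
        if not os.path.isfile(numvfs_path):
            continue  # device does not expose SR-IOV
        with open(numvfs_path) as fh:
            if int(fh.read().strip() or 0) == 0:
                continue  # SR-IOV capable, but no VFs enabled
        vfs = {}
        for entry in sorted(os.listdir(pf_dir)):
            if not entry.startswith("virtfn"):
                continue
            # virtfnN is a symlink to the VF's own PCI device directory.
            vf_dir = os.path.realpath(os.path.join(pf_dir, entry))
            driver_link = os.path.join(vf_dir, "driver")
            driver = None  # None means no driver is bound to the VF
            if os.path.islink(driver_link):
                driver = os.path.basename(os.path.realpath(driver_link))
            vfs[os.path.basename(vf_dir)] = driver
        report[pf] = vfs
    return report


if __name__ == "__main__":
    for pf, vfs in vf_driver_report().items():
        for vf, driver in vfs.items():
            print(f"PF {pf} -> VF {vf}: driver={driver or 'none'}")
```

With the configuration described above, each VF would be expected to show `vfio-pci` as its driver. A VF showing `ixgbevf` would mean the blacklist did not take effect.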

 

Every time I try to boot an OS in a KVM VM with a VF passed through, the host system hangs, stops responding to local input and SSH, and has to be reset. I have tried Ubuntu 18.04, Fedora 29 and Windows 10 1809 as guest OSes. It looks like an ixgbe driver problem to me.

 

Any thoughts?


Accepted Solutions
sone0
New Contributor I
111 Views

Hello Mike! It appears that some Linux update has fixed my problems!

 

I tried and verified with:

 

Host: Arch Linux kernel version 5.1.4, ixgbe Inbox version 5.1.0-k

 

Guest.1: Fedora 29 VM kernel version 5.0.17, ixgbevf Inbox version 4.1.0

Guest.2: Fedora 29 VM kernel version 4.18.xx, ixgbevf Inbox version 5.1.0

Guest.3: Windows 10 VM, Intel driver version 23.5.2 for Windows Server 2016/2019

 

Everything seems to be working properly although Windows performance might be lower than expected. I will investigate further but I have had no lock-ups or other weird behavior.


33 Replies
Michael_L_Intel2
Moderator
96 Views
Hello sone0, Thank you for posting in Intel Ethernet Communities. To understand the issue better, I need the following information:

1. Are you using a desktop or server system? Please provide the model of your board or system.
2. Since you tried different guest operating systems, have you also tried a different NIC, and does it show the same issue?
3. Please provide the markings on your network card so I can check whether it is an OEM or retail card.

If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

Hello Mike, thank you for the rapid reply!

I am using a custom system with an AsRock X399 Taichi motherboard and an AMD Threadripper 2920x CPU, both of which support SR-IOV.

I do not have a second Intel NIC, but I have tried a Mellanox ConnectX-3 10Gb NIC and was able to pass a VF through to a Linux guest and verify connectivity. So SR-IOV is functional on my platform, at least for the Mellanox card.

My card is a Fujitsu D2755 (Intel 82599ES based) NIC.

I found other users reporting the same problem on Linux hosts with kernels >= 4.20. They have reported that it used to work with kernels up to 4.19 but hangs in the same way with 4.20 or newer. One particular user also compiled the 5.5.5 ixgbe out-of-tree driver from Intel and had the same problem with kernels 4.20 or newer.

Michael_L_Intel2
Moderator
96 Views
Hello sone0, By the way, since your network card is from Fujitsu, have you tried contacting them regarding the issue? Even if the card is Intel based, they may have altered some specifications of the card. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

I haven't contacted Fujitsu as this behavior has been exhibited by differently branded NICs and models, including the Intel X552 onboard NIC. Everything seems to point to a driver issue.

XFei01
Beginner
96 Views

I can confirm this. I have two hosts, one with an 82599ES and another with an X552, and both use the ixgbe driver. VMs with a VF passed through cannot boot. On one host, when starting the VM, there is no OVMF logo, only a black screen. On the other host, the host locks up as soon as the virtual machine is started.

 

The problem started with the 4.20 kernel update. If I revert to any pre-4.20 kernel, SR-IOV works very well.

 

My NICs are not the same brand as sone0's, so I think there is something wrong with the ixgbe driver.

Michael_L_Intel2
Moderator
96 Views
Hello sone0, The reason we asked whether you had contacted Fujitsu is that, if they altered some specifications on this card, they may have an updated driver with fixes. Intel provides generic drivers. You may try them, but for cards that have been altered or modified, generic drivers may not be enough to fix some issues. By the way, have you tried driver versions other than 5.1.0? If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

Hello Mike, I will get in contact with Fujitsu support, but as XFei01 mentioned above, this happens with other brands of NICs as well. I have tried the 5.5.5 out-of-tree driver and still experience the same problem. Please take the time to try to replicate it and see what happens. Thanks again

Michael_L_Intel2
Moderator
96 Views
Hello sone0, Don't worry, we are also looking into this issue. I am also checking our database for related issues with retail Intel network cards. Please update us as well after talking to Fujitsu. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

Hi again Mike,

 

I contacted Fujitsu and there is no specific driver for the card. They just pointed me to the Administrative Tools for Intel® Network Adapters software download page.

 

Michael_L_Intel2
Moderator
96 Views
Hello sone0, Thank you for the update. By the way, was it working before? Have you also tested it with a different host OS? If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
Michael_L_Intel2
Moderator
96 Views
Hello sone0, I just want to check whether you have tried a different host OS to isolate the issue. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
Michael_L_Intel2
Moderator
96 Views
Hello sone0, I just want to check whether you have tested it with a different host OS. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

Hello Mike, sorry for my lack of replies, I was very busy.

 

I am starting testing right now with a Fedora 29 host. Later on I will move to Ubuntu, and I will report my findings in 1 - 2 hours.

sone0
New Contributor I
96 Views

Hey again Mike, I managed to do some testing with Fedora 29 as host OS and kernels 4.18, 4.19, 4.20 and 5.0.7.

 

Kernels 4.18 and 4.19 worked as expected: the Ubuntu 18.04 VM had a virtual function NIC and full access to the internet.

 

Kernels 4.20 and 5.0.7 exhibited the problem, with the host completely locking up when the VM started to boot.

 

I didn't have time to test with an Ubuntu host but I suspect that it would behave the same way.

 

This is now clearly a driver problem to my eyes.
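The host kernel results reported across this thread can be condensed into a small version check. This is only an illustrative sketch: the function names and labels are hypothetical, and the boundaries (4.20 as the first failing kernel, 5.0.8 as the newest failing one, 5.1.4 as working again per the accepted solution) come purely from the tests reported here, not from a confirmed root cause.

```python
import re

# Boundaries taken from the host kernels reported in this thread.
FIRST_HANGING = (4, 20, 0)   # oldest host kernel reported to hang
LAST_HANGING = (5, 0, 8)     # newest host kernel reported to hang
WORKING_AGAIN = (5, 1, 4)    # host kernel from the accepted solution


def parse_kernel_version(release):
    """Turn a release string such as '5.0.7-200.fc29.x86_64' into (5, 0, 7)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '4.19') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def classify_host_kernel(release):
    """Label a host kernel according to the reports in this thread."""
    version = parse_kernel_version(release)
    if version < FIRST_HANGING:
        return "reported working"
    if version <= LAST_HANGING:
        return "reported hanging"
    if version < WORKING_AGAIN:
        return "untested in this thread"
    return "reported working"
```

On a running host, `classify_host_kernel(platform.release())` (after `import platform`) would label the current kernel. The "untested" band is kept separate because nobody in the thread reported results for kernels between 5.0.8 and 5.1.4.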

Michael_L_Intel2
Moderator
96 Views
Hello sone0, Thank you for providing your isolation test of the issue. We need to check this further and will get back to you as soon as we have an update. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
Michael_L_Intel2
Moderator
96 Views
Hello sone0, Can you try ixgbe VF driver 4.5.3? The download link is below: https://downloadcenter.intel.com/download/18700/Intel-Network-Adapter-Virtual-Function-Driver-for-In... If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
sone0
New Contributor I
96 Views

Hello Mike,

 

I tried with Arch as the host OS (kernel 5.0.8) and a Fedora 29 guest VM:

 

Fedora 29 with kernel 4.20.16 compiled and installed the ixgbevf driver successfully but had the exact same problem: a complete host lock-up on VM boot.

 

Fedora 29 with kernel 5.0.7 is unable to compile the driver:

 

[sriovtest@localhost src]$ sudo make install
make[1]: Entering directory '/usr/src/kernels/5.0.7-200.fc29.x86_64'
  CC [M]  /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_main.o
  CC [M]  /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.o
/home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c: In function ‘ixgbevf_diag_test’:
/home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c:1028:4: error: too few arguments to function ‘dev_open’
    dev_open(netdev);
    ^~~~~~~~
In file included from /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c:9:
./include/linux/netdevice.h:2623:5: note: declared here
 int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
     ^~~~~~~~
make[2]: *** [scripts/Makefile.build:277: /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.o] Error 1
make[1]: *** [Makefile:1580: _module_/home/sriovtest/Downloads/ixgbevf-4.5.3/src] Error 2
make[1]: Leaving directory '/usr/src/kernels/5.0.7-200.fc29.x86_64'
make: *** [Makefile:49: default] Error 2

 

Michael_L_Intel2
Moderator
96 Views
Hello sone0, Thank you for the update. Let me check this and get back to you. If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation
Michael_L_Intel2
Moderator
96 Views
Hello sone0, Upon further checking, here are the tested host OSes for driver version 5.5.5, which also adds support for kernel 4.20:

- RHEL* 6.10
- RHEL 7.6
- SLES* 12SP4
- SLES 15
- Ubuntu* 18.04

https://downloadcenter.intel.com/download/14687/Ethernet-Intel-Network-Adapter-Driver-for-PCIe-Intel...

Have you tried any of the OSes listed above with kernel 4.20? If you have questions, please let us know. Best regards, Michael L. Intel Customer Support Under Contract to Intel Corporation