I am running Arch Linux with kernel 5.0.5 and the Intel ixgbe inbox kernel driver, version 5.1.0.
SR-IOV is enabled in the BIOS, amd_iommu is enabled in the kernel, and max_vfs is set to 1 (one VF per port, two in total). The ixgbevf driver is blacklisted, and the vfio-pci driver is assigned to handle the VFs instead.
Every time I try to boot an OS in a KVM VM with a VF passed through, the host system hangs, does not respond to any input or SSH, and needs to be reset. I have tried Ubuntu 18.04, Fedora 29 and Windows 10 1809 as guest OSes. It seems like an ixgbe driver problem to me.
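For anyone trying to reproduce this, the setup described above (one VF per port, ixgbevf blacklisted, vfio-pci handling the VFs) can be double-checked from sysfs before starting the VM. The following is only a minimal sketch, not something from the original post: the function name vf_drivers and the sysfs_root parameter are made up for illustration, and it assumes the usual Linux sysfs layout (sriov_numvfs and virtfnN links under the PF's device directory, and a driver symlink in each VF's directory).

```python
import os
from pathlib import Path


def vf_drivers(iface, sysfs_root="/sys"):
    """Return (numvfs, {vf_name: driver_or_None}) for a physical interface.

    iface is the PF's network interface name. sysfs_root is a hypothetical
    parameter that exists only so the function can be pointed at a test tree.
    """
    dev = Path(sysfs_root) / "class" / "net" / iface / "device"
    # Number of VFs currently enabled on this PF.
    numvfs = int((dev / "sriov_numvfs").read_text().strip())
    drivers = {}
    # Each enabled VF appears as a virtfnN entry under the PF device.
    for link in sorted(dev.glob("virtfn*")):
        drv = link / "driver"
        # The driver symlink points at the bound driver's directory;
        # it is absent when no driver is bound to the VF.
        if drv.is_symlink():
            drivers[link.name] = os.path.basename(os.readlink(drv))
        else:
            drivers[link.name] = None
    return numvfs, drivers
```

Calling vf_drivers("enp1s0f0") (the interface name is a placeholder) on a host configured as described should report one VF per port with vfio-pci as its driver. If ixgbevf shows up instead, the blacklist did not take effect.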
Hello Mike! It appears that some Linux update has fixed my problems!
I tried and verified with:
Host: Arch Linux kernel version 5.1.4, ixgbe Inbox version 5.1.0-k
Guest.1: Fedora 29 VM kernel version 5.0.17, ixgbevf Inbox version 4.1.0
Guest.2: Fedora 29 VM kernel version 4.18.xx, ixgbevf Inbox version 5.1.0
Guest.3: Windows 10 VM, Intel driver version 23.5.2 for Windows Server 2016/2019
Everything seems to be working properly although Windows performance might be lower than expected. I will investigate further but I have had no lock-ups or other weird behavior.
Hello Mike, thank you for the rapid reply!
I am using a custom system with an ASRock X399 Taichi motherboard and an AMD Threadripper 2920X CPU, both of which support SR-IOV.
I do not have a second Intel NIC, but I have tried a Mellanox ConnectX-3 10Gb NIC and was able to pass a VF through to a Linux guest and verify connectivity. So SR-IOV is functional on my platform, at least for the Mellanox card.
My card is a Fujitsu D2755 (Intel 82599ES based) NIC.
I found other users reporting the same problem on Linux hosts with kernels >= 4.20. They report that it used to work with kernels up to 4.19 but fails in the same way with 4.20 or newer. One user also compiled the 5.5.5 ixgbe out-of-tree driver from Intel and had the same problem with kernels 4.20 or newer.
I can confirm this. I have two hosts, one with an 82599ES and another with an X552; both use the ixgbe driver. VMs with a VF passed through cannot boot. On one host, when starting the VM, there is no OVMF logo, only a black screen. On the other, the host locks up as soon as the virtual machine is started.
The problem started with the 4.20 kernel update. If I revert to any pre-4.20 kernel, SR-IOV works very well.
My NICs are not the same brand as sone0's, so I think there is something wrong with the ixgbe driver.
Hello Mike, I will get in contact with Fujitsu support, but as XFei01 mentioned above, this happens with other brands of NICs as well. I have tried the 5.5.5 out-of-tree driver and still experience the same problem. Please take the time to try to replicate it and see what happens. Thanks again.
Hey again Mike, I managed to do some testing with Fedora 29 as host OS and kernels 4.18, 4.19, 4.20 and 5.0.7.
Kernels 4.18 and 4.19 worked as expected and the Ubuntu 18.04 VM had a Virtual Interface NIC and full access to the internet.
Kernels 4.20 and 5.0.7 exhibited the problem, with the host completely locking up when the VM started to boot.
I didn't have time to test with an Ubuntu host but I suspect that it would behave the same way.
In my eyes, this is now clearly a driver problem.
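The Fedora host results above can be written down as a small lookup, which makes it easy to compare a given kernel release string against what was actually tested. This is just a sketch: the names FEDORA_HOST_RESULTS and fedora_host_result are invented for illustration, and the table only records the four kernels from these tests, keyed by major.minor, so it says nothing about kernels outside that list.

```python
import re

# Outcomes from the Fedora 29 host tests, keyed by (major, minor).
# Only 5.0.7 was tested in the 5.0 series; treating all of 5.0.x the
# same way is an assumption made for this sketch.
FEDORA_HOST_RESULTS = {
    (4, 18): "worked",
    (4, 19): "worked",
    (4, 20): "locked up",
    (5, 0): "locked up",
}


def fedora_host_result(release):
    """Map a kernel release string (as printed by uname -r) to the outcome."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    key = (int(m.group(1)), int(m.group(2)))
    return FEDORA_HOST_RESULTS.get(key, "not tested")
```

For example, fedora_host_result("5.0.7-200.fc29.x86_64") falls in the 5.0 series and maps to the lock-up result, while a release outside the table is reported as not tested.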
I tried with host OS Arch kernel 5.0.8 and guest Fedora 29 VM:
Fedora 29 with kernel 4.20.16 compiled and installed the ixgbevf driver successfully and had the exact same problem, complete host lock-up on VM boot.
Fedora 29 with kernel 5.0.7 is unable to compile the driver:
[sriovtest@localhost src]$ sudo make install
make: Entering directory '/usr/src/kernels/5.0.7-200.fc29.x86_64'
  CC [M]  /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_main.o
  CC [M]  /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.o
/home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c: In function ‘ixgbevf_diag_test’:
/home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c:1028:4: error: too few arguments to function ‘dev_open’
    dev_open(netdev);
    ^~~~~~~~
In file included from /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.c:9:
./include/linux/netdevice.h:2623:5: note: declared here
 int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
     ^~~~~~~~
make: *** [scripts/Makefile.build:277: /home/sriovtest/Downloads/ixgbevf-4.5.3/src/ixgbevf_ethtool.o] Error 1
make: *** [Makefile:1580: _module_/home/sriovtest/Downloads/ixgbevf-4.5.3/src] Error 2
make: Leaving directory '/usr/src/kernels/5.0.7-200.fc29.x86_64'
make: *** [Makefile:49: default] Error 2