Ethernet Products
Determine ramifications of Intel® Ethernet products and technologies

Linux KVM SRIOV: Spoofed packets, dropped frames

JD8
Beginner

The primary issue is that, after anywhere from several hours to a couple of weeks, a single VF gets into a bad state for a guest, and we see the following errors on the parent and child.

Versions:

CentOS = 7.5.1804
Kernel = 4.4.121-1.el7.centos.x86_64 (Current); Tried 3.10.0, 4.4.75, 4.9.52, 4.14.68
IXGBE = 5.3.7 (Current); Tried 5.3.5, 4.2.1-k, ......
IXGBEVF = 4.3.5 (Current); Tried 2.12.1-k, ....
QEMU = 1.5.3 (Current); Tried 2.0.0
Libvirt = 3.9.0 (Current)

On the parent we will see this error:

ixgbe 0000:05:00.0 ethx: 193 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 45 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 3 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 126 Spoofed packets detected
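For anyone tallying these messages across many parents, here is a minimal Python sketch that sums the spoofed-packet counts per PF. It assumes the kernel log lines have exactly the format shown above; the function name `total_spoofed` and the regular expression are illustrative and not part of any driver tooling. Note that the message identifies only the PF, not which VF sent the frames.

```python
import re
from collections import defaultdict

# Assumed message format (taken from the log lines in this post):
#   ixgbe <pci-address> <ifname>: <count> Spoofed packets detected
SPOOF_RE = re.compile(
    r"ixgbe (?P<pci>[0-9a-fA-F:.]+) (?P<ifname>\S+): "
    r"(?P<count>\d+) Spoofed packets detected"
)

def total_spoofed(log_lines):
    """Return {(pci_address, ifname): total spoofed packet count}."""
    totals = defaultdict(int)
    for line in log_lines:
        match = SPOOF_RE.search(line)
        if match:
            key = (match["pci"], match["ifname"])
            totals[key] += int(match["count"])
    return dict(totals)

# The four log lines quoted in this post.
sample = [
    "ixgbe 0000:05:00.0 ethx: 193 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 45 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 3 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 126 Spoofed packets detected",
]

print(total_spoofed(sample))
```

Feeding it the output of the kernel log over a longer window would show whether the counts cluster around the time a guest goes bad.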

On the child you will see an increase in dropped packets.

2: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:5e:a9:f8 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    455429589913 520093667 0       375674  0       375680
    TX: bytes  packets  errors  dropped carrier collsns
    463147231075 514071570 0       0       0       0
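To track the drop counter over time rather than eyeballing it, something like the following could be used. This is only a sketch: it assumes the statistics block is laid out as in the output above (an `RX:` header line followed by a line of values in the same column order), and `rx_dropped` is a hypothetical helper name.

```python
def rx_dropped(ip_s_link_output):
    """Return the RX 'dropped' counter from `ip -s link` style text.

    Assumes an 'RX:' header line followed by one line of values
    in the same column order, as in the output quoted in this post.
    """
    lines = [ln.strip() for ln in ip_s_link_output.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        if line.startswith("RX:") and i + 1 < len(lines):
            headers = line.split()[1:]        # drop the 'RX:' label
            values = lines[i + 1].split()
            return int(values[headers.index("dropped")])
    raise ValueError("no RX statistics block found")

# Statistics block copied from the guest output above.
snapshot = """
RX: bytes packets errors dropped overrun mcast
455429589913 520093667 0 375674 0 375680
TX: bytes packets errors dropped carrier collsns
463147231075 514071570 0 0 0 0
"""

print(rx_dropped(snapshot))
```

Calling it on two snapshots taken some time apart and subtracting the results gives the number of frames dropped in that interval.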

I don't have a way to view the spoofed packets going out, but I can see the incoming packets getting corrupted and dropped by the guest. The best example is an ARP request, since the broadcast reaches every parent and child. (IPs censored)

Parent capture:

10:36:26.492879 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has ZZZ.ZZZ.ZZZ.ZZZ tell XXX.XXX.XXX.XXX, length 46
10:36:26.540880 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has BBB.BBB.BBB.BBB tell XXX.XXX.XXX.XXX, length 46
10:36:26.553161 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has AAA.AAA.AAA.AAA tell XXX.XXX.XXX.XXX, length 46
10:36:26.559508 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has YYY.YYY.YYY.YYY tell XXX.XXX.XXX.XXX, length 46

Child capture:

10:36:26.501491 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has ZZZ.ZZZ.ZZZ.ZZZ tell XXX.XXX.XXX.XXX, length 46
10:36:26.549499 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 46
    0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............
10:36:26.561776 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 46
    0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............
10:36:26.568122 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has YYY.YYY.YYY.YYY tell XXX.XXX.XXX.XXX, length 46

While this one VF is in a bad state, all other guests see the same packets as the parent. The only current fix is to reboot the guest; sometimes we have to destroy the guest and start it back up.
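The corrupted frames in the child capture are recognizable because every payload byte is zero. As a rough sketch, assuming tcpdump hex-dump lines in the `0xNNNN: hhhh hhhh ...` layout shown above, the following Python flags such frames. The helper names are hypothetical, and the parser assumes the trailing ASCII column does not itself begin with hex digits, which holds for the all-dots lines seen here.

```python
import re

# One hex-dump line: an offset such as '0x0010:' followed by groups of
# 2 or 4 hex digits. Parsing stops at the first non-hex group, which is
# where the ASCII column starts in the captures quoted above.
HEX_LINE_RE = re.compile(
    r"^\s*0x[0-9a-fA-F]{4}:\s+((?:[0-9a-fA-F]{2,4}\s?)+)"
)

def payload_bytes(hex_dump_lines):
    """Concatenate the bytes from tcpdump-style hex-dump lines."""
    data = bytearray()
    for line in hex_dump_lines:
        match = HEX_LINE_RE.match(line)
        if match:
            data += bytes.fromhex("".join(match.group(1).split()))
    return bytes(data)

def is_zeroed_frame(hex_dump_lines):
    """True if the dump contains at least one byte and all bytes are zero."""
    data = payload_bytes(hex_dump_lines)
    return len(data) > 0 and not any(data)

# Hex dump of one corrupted frame from the child capture.
corrupted = [
    "0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................",
    "0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................",
    "0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............",
]

print(len(payload_bytes(corrupted)), is_zeroed_frame(corrupted))
```

The 46 bytes recovered match the `length 46` reported by tcpdump for these frames, so the whole ARP payload arrives zeroed rather than truncated.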

idata
Employee

Hello Jamesdblea,

Thank you for posting in Intel Wired Ethernet Communities. What Ethernet adapter or controller are you currently using? Please post the output of lspci | grep Ethernet, ethtool -i, and ethtool -k for the affected interface. If you have any other questions, please do not hesitate to ask.

Best regards,
Daniel D
Intel Customer Support

JD8
Beginner

Here are the last two examples I have.

# lspci | grep Ethernet
81:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Network Connection (rev 01)
81:00.1 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Network Connection (rev 01)
81:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:10.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:10.7 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:11.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:11.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:11.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:11.7 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:12.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:12.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:12.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:12.7 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:13.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:13.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
81:13.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

# lspci -v -s 81:00.0
81:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Network Connection (rev 01)
Subsystem: Intel Corporation Ethernet Server Adapter X520-2

# ethtool -i ethx
driver: ixgbe
version: 5.3.7
firmware-version: 0x61c10001
expansion-rom-version:
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

 

# ethtool -k ethx
Features for ethx:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: on [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on


idata
Employee

Hello Jamesdblea,

Thank you for the outputs. We will investigate the issue and update you as soon as possible. Please let us know if you have any other questions.

Best regards,
Daniel D
Intel Customer Support

idata
Employee

Hello Jamesdblea,

Thank you for your patience while we investigate this issue. Libvirt assigns a valid MAC address to the VF prior to use. Due to a security feature in the i40e driver, the VF assigned to the VM is not allowed to change the VF MAC address from within the VM. A duplicate MAC will cause loss of communication. Manually assign a MAC address from the host using the following command:

ip link set <pf-interface> vf 0 mac aa:bb:cc:dd:ee:ff

See page 15 of the SR-IOV configuration guide (https://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/xl710-sr-iov-config-guide-gbe-linux-brief.pdf) for more information. Please verify that the MAC address is not duplicated on the VF. Provide the steps you take to create the VFs if you need further assistance. If you have any questions, please let us know.

Best regards,
Daniel D
Intel Customer Support

JD8
Beginner

Hello Daniel,

Thanks for getting back to me. I'm waiting for another server to encounter the problem so I can get you the requested output.

In the meantime I just want to point out the tcpdump from my first post is incoming packets to the guest being corrupted.

Example:

From this:

10:36:26.540880 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has BBB.BBB.BBB.BBB tell XXX.XXX.XXX.XXX, length 46

To this:

10:36:26.549499 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 46
    0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............

Traffic will still get through, but with a lot of retransmits.

Spoofchk can be turned off on the parent-side VF to stop the alerts, but packets still get corrupted and retransmits increase.

Trust can also be turned on, but we do not see the MAC address change.

When it happens again I should be able to get the output requested.

Thanks again for looking into the issue.

JD8
Beginner

Hello Daniel,

Here is the output for packets being sent from the guest. I've also done a tcpdump to match any packets exiting that don't match the current MAC. Guest with the issue is VF 7.

Parent:

# ip link show dev ethx

6: ethx: mtu 1500 qdisc mq master br0 state UP mode DEFAULT group default qlen 1000

link/ether 90:e2:ba:0f:cb:90 brd ff:ff:ff:ff:ff:ff

vf 0 MAC 52:54:00:f7:95:ed, spoof checking on, link-state auto, trust on, query_rss off

vf 1 MAC 52:54:00:30:1b:67, spoof checking on, link-state auto, trust on, query_rss off

vf 2 MAC 52:54:00:a7:7a:45, spoof checking on, link-state auto, trust on, query_rss off

vf 3 MAC 52:54:00:4a:d3:cb, spoof checking on, link-state auto, trust on, query_rss off

vf 4 MAC 52:54:00:c8:9e:57, spoof checking on, link-state auto, trust on, query_rss off

vf 5 MAC 52:54:00:17:44:c3, spoof checking on, link-state auto, trust on, query_rss off

vf 6 MAC 52:54:00:45:b3:61, spoof checking on, link-state auto, trust on, query_rss off

vf 7 MAC 52:54:00:fa:f7:dc, spoof checking on, link-state auto, trust on, query_rss off
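Since a duplicated VF MAC was raised as a possible cause, the listing above can be checked mechanically. The Python below is a minimal sketch under stated assumptions: it expects `ip link show` VF lines in the `vf N MAC xx:xx:...` form shown here (it also accepts `link/ether` in place of `MAC`, which newer iproute2 releases print), and `duplicate_vf_macs` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches e.g. 'vf 7 MAC 52:54:00:fa:f7:dc' as printed above, and the
# 'vf 7 link/ether ...' variant printed by newer iproute2 releases.
VF_RE = re.compile(
    r"vf (?P<vf>\d+)\s+(?:MAC|link/ether) "
    r"(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)

def duplicate_vf_macs(ip_link_output):
    """Return {mac: [vf numbers]} for every MAC assigned to more than one VF."""
    by_mac = defaultdict(list)
    for match in VF_RE.finditer(ip_link_output):
        by_mac[match["mac"].lower()].append(int(match["vf"]))
    return {mac: vfs for mac, vfs in by_mac.items() if len(vfs) > 1}

# VF lines copied from the parent output above.
parent_output = """
vf 0 MAC 52:54:00:f7:95:ed, spoof checking on, link-state auto, trust on, query_rss off
vf 1 MAC 52:54:00:30:1b:67, spoof checking on, link-state auto, trust on, query_rss off
vf 2 MAC 52:54:00:a7:7a:45, spoof checking on, link-state auto, trust on, query_rss off
vf 3 MAC 52:54:00:4a:d3:cb, spoof checking on, link-state auto, trust on, query_rss off
vf 4 MAC 52:54:00:c8:9e:57, spoof checking on, link-state auto, trust on, query_rss off
vf 5 MAC 52:54:00:17:44:c3, spoof checking on, link-state auto, trust on, query_rss off
vf 6 MAC 52:54:00:45:b3:61, spoof checking on, link-state auto, trust on, query_rss off
vf 7 MAC 52:54:00:fa:f7:dc, spoof checking on, link-state auto, trust on, query_rss off
"""

print(duplicate_vf_macs(parent_output))
```

An empty result means no two VFs on this PF share a MAC. It does not rule out a duplicate elsewhere on the network.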

Child:

# ip link show dev eth0

2: eth0: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000

link/ether 52:54:00:fa:f7:dc brd ff:ff:ff:ff:ff:ff

# ip link show dev bond0

3: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

link/ether 52:54:00:fa:f7:dc brd ff:ff:ff:ff:ff:ff

# tcpdump -i eth0 -Q out not ether host 52:54:00:fa:f7:dc

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

0 packets captured

5837 packets received by filter

4331 packets dropped by kernel

# tcpdump -i bond0 -Q out not ether host 52:54:00:fa:f7:dc

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes

0 packets captured

2480 packets received by filter

595 packets dropped by kernel

idata
Employee

Hello Jamesdblea,

Thank you for the outputs. We will investigate the issue further and get back to you soon. If you have any other questions, please do not hesitate to contact us.

Best regards,
Daniel D
Intel Customer Support

idata
Employee

Hello Jamesdblea,

Thank you for waiting while we investigate this issue. Have you tried moving the child to a VF other than VF 7 to see whether the issue still occurs on another VF? Would you be able to provide some of the configurations you used, specifically the QEMU and KVM settings and the configurations for the Ethernet interfaces? We also noticed that bond0 is set; what is it being used for? Let us know if you have any questions.

Best regards,
Daniel D
Intel Customer Support

JD8
Beginner

Hello Daniel,

The VF affected this time was 7; next time it will be another, and then another. I'm troubleshooting this issue as it pops up across several dozen servers, and there is no correlation with the VF number.

The bond is used in a few cases where guests are on parents with a bonded network and SR-IOV VFs from two NICs are passed to the guest. In this case there is only one network connection, so the bond is not backed by two interfaces. Again, this error can happen with or without the bond.

Here are some of the configs we use for passthrough. If you need something more specific, please let me know.

Guest passthrough libvirt:

QEMU process:

qemu 31168 246 12.7 8947996 8404636 ? SLl Oct04 4024:19 /usr/libexec/qemu-kvm -name GUESTNAME -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+osxsave,+arat,+xsaveopt,+pdpe1gb,-spec-ctrl -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid fc9c3820-a1ab-4cab-8671-717d4e2560c0 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-29-GUESTNAME/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x3.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x3 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x3.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x3.0x2 -device ahci,id=sata0,bus=pci.0,addr=0x4 -drive file=/imgs/GUESTNAME/guest.disk0,format=raw,if=none,id=drive-sata0-0-0,discard=unmap -device ide-hd,bus=sata0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device vfio-pci,host=05:11.2,id=hostdev0,bus=pci.0,addr=0x6 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

idata
Employee

Hello Jamesdblea,

Thank you for the reply. We will take the Libvirt and QEMU configurations into consideration and will update you soon. Let us know if you have any questions.

Best regards,
Daniel D
Intel Customer Support

idata
Employee

Hello Jamesdblea,

Thank you for your patience. Using an 82599 2-port Ethernet controller on a machine running CentOS 7.5 (parent) and Ubuntu 15 (child), we could not duplicate the issue. We followed the steps in the SR-IOV configuration guide (https://www.intel.com/content/www/us/en/embedded/products/networking/xl710-sr-iov-config-guide-gbe-linux-brief.html) and enabled 2x VF on port 0. No frames were dropped from the child to another host or from the parent to the child. Is there something in your configuration that deviates from the guide? Let us know if you have any questions.

Best regards,
Daniel D
Intel Customer Support

idata
Employee

Hello Jamesdblea,

Please let us know if you were able to solve this issue using the guide as reference. If you have any questions, please do not hesitate to ask.

Best regards,
Daniel D
Intel Customer Support

JD8
Beginner

Hello Daniel,

The issue continues. Unfortunately I have no way to trigger the problem, so the only way for me to continue troubleshooting is to wait for one of the KVM parent/guest pairs to get into that defunct state. This can take anywhere from days to weeks.

If you have any suggestions on troubleshooting steps I could perform, or commands to run when I encounter the issue, I could get you more information.

idata
Employee

Hello Jamesdblea,

Thank you for the reply. We will check if there is anything else we can do to diagnose the issue and update you soon. Let us know if you have any questions.

Best regards,
Daniel D
Intel Customer Support

idata
Employee

Hi jamesdblea,

Please let us know how many children are running simultaneously on the parent, and whether you have tried replacing the Intel® Ethernet Adapter X520-T2 with another network adapter.

Best Regards,
Vince T.
Intel Customer Support

JD8
Beginner

Hello Vince,

The number of guests varies with CPU/RAM; deployments with 4, 8, and 12 guests have all experienced this issue.

I'm working on getting an expansion card installed or swapped out on a number of KVM parents. Once this is complete I'll report back any findings.

idata
Employee

Hi jamesdblea,

Thanks for the update. We'll wait for your post.

Best Regards,
Vince T.
Intel Customer Support

idata
Employee

Hello Jamesdblea,

Any updates on the status using another adapter? Please let us know if you have any questions.

Best regards,
Daniel D
Intel Customer Support

JD8
Beginner

Hello Daniel,

Sorry, no new updates. The hardware replacement has not been completed. I have a few parents with updated software, and I'm still waiting to see whether they error out.

idata
Employee

Hello Jamesdblea,

Thank you for the update. Please inform us when you have new information. Let us know if any questions come up.

Best regards,
Daniel D
Intel Customer Support