
Linux KVM SRIOV: Spoofed packets, dropped frames

JD8
Beginner
6,340 Views

The primary issue is that, after anywhere from several hours to a couple of weeks, a single VF gets into a bad state for a guest, and we see the following errors on the parent and the child.

Versions:

CentOS = 7.5.1804

Kernel = 4.4.121-1.el7.centos.x86_64 (Current); Tried 3.10.0, 4.4.75, 4.9.52, 4.14.68

IXGBE = 5.3.7 (Current); Tried 5.3.5, 4.2.1-k, ......

IXGBEVF = 4.3.5 (Current); Tried 2.12.1-k, ....

QEMU = 1.5.3 (Current); Tried 2.0.0

Libvirt = 3.9.0 (Current)

On the parent we will see this error:

ixgbe 0000:05:00.0 ethx: 193 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 45 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 3 Spoofed packets detected
ixgbe 0000:05:00.0 ethx: 126 Spoofed packets detected
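To track how often the parent reports this over time, the messages can be totaled per device. The following is a minimal sketch, assuming the kernel log lines have exactly the format shown above; the function name `total_spoofed` and the regular expression are illustrative and not part of any ixgbe tooling.

```python
import re
from collections import defaultdict

# Assumption: the log format matches the lines quoted in this post.
SPOOF_RE = re.compile(r"ixgbe (\S+) (\S+): (\d+) Spoofed packets detected")


def total_spoofed(log_lines):
    """Return {(pci_address, ifname): total spoofed packets reported}."""
    totals = defaultdict(int)
    for line in log_lines:
        match = SPOOF_RE.search(line)
        if match:
            pci, ifname, count = match.groups()
            totals[(pci, ifname)] += int(count)
    return dict(totals)


# The four messages quoted above.
sample = [
    "ixgbe 0000:05:00.0 ethx: 193 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 45 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 3 Spoofed packets detected",
    "ixgbe 0000:05:00.0 ethx: 126 Spoofed packets detected",
]
print(total_spoofed(sample))
```

In practice the lines would come from the kernel log on the parent rather than a hard-coded list.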

On the child you will see an increase in dropped packets.

2: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:5e:a9:f8 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets    errors  dropped  overrun  mcast
    455429589913  520093667  0       375674   0        375680
    TX: bytes     packets    errors  dropped  carrier  collsns
    463147231075  514071570  0       0        0        0
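To put the dropped counter in proportion, the RX header and value rows can be paired up and a drop ratio computed. This is a rough sketch that assumes the two-row header/value layout shown above; `parse_rx_counters` is a hypothetical helper, not an existing tool.

```python
def parse_rx_counters(header, values):
    """Pair the RX header names with their integer values."""
    names = header.split()[1:]  # skip the leading "RX:" token
    return dict(zip(names, (int(v) for v in values.split())))


# Header and value rows copied from the guest output above.
rx = parse_rx_counters(
    "RX: bytes packets errors dropped overrun mcast",
    "455429589913 520093667 0 375674 0 375680",
)
drop_ratio = rx["dropped"] / rx["packets"]
print(f"dropped={rx['dropped']} packets={rx['packets']} ratio={drop_ratio:.4%}")
```

Sampling this periodically and comparing successive values would show whether the dropped counter is still climbing while the VF is in the bad state.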

I don't have a way to view the spoofed packets going out, but I can see the incoming packets getting corrupted and dropped by the guest. The best example is an ARP request, since the broadcast reaches every parent and child. (IPs censored)

Parent capture:

10:36:26.492879 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has ZZZ.ZZZ.ZZZ.ZZZ tell XXX.XXX.XXX.XXX, length 46
10:36:26.540880 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has BBB.BBB.BBB.BBB tell XXX.XXX.XXX.XXX, length 46
10:36:26.553161 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has AAA.AAA.AAA.AAA tell XXX.XXX.XXX.XXX, length 46
10:36:26.559508 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has YYY.YYY.YYY.YYY tell XXX.XXX.XXX.XXX, length 46

Child Capture:

10:36:26.501491 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has ZZZ.ZZZ.ZZZ.ZZZ tell XXX.XXX.XXX.XXX, length 46
10:36:26.549499 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 46
    0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............
10:36:26.561776 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 46
    0x0000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............
10:36:26.568122 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has YYY.YYY.YYY.YYY tell XXX.XXX.XXX.XXX, length 46
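The corrupted frames in the child capture are easy to spot mechanically, because both MAC addresses arrive as all zeros. Here is a small sketch that flags them in capture text; it assumes summary lines of the form shown above (timestamp, source MAC, `>`, destination MAC), the sample lines are truncated after the destination MAC for brevity, and `is_zeroed_frame` is a hypothetical helper name.

```python
ZERO_MAC = "00:00:00:00:00:00"


def is_zeroed_frame(summary_line):
    """True if a capture summary line has all-zero source and destination MACs."""
    parts = summary_line.split()
    # Expected layout: "<time> <src-mac> > <dst-mac>, ..."
    return (
        len(parts) >= 4
        and parts[1] == ZERO_MAC
        and parts[2] == ">"
        and parts[3].rstrip(",") == ZERO_MAC
    )


# Summary lines from the child capture above, truncated after the MACs.
child_capture = [
    "10:36:26.501491 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60",
    "10:36:26.549499 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0",
    "10:36:26.561776 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0",
    "10:36:26.568122 02:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60",
]
zeroed = [line for line in child_capture if is_zeroed_frame(line)]
print(f"{len(zeroed)} of {len(child_capture)} frames arrived zeroed")
```

Hex dump lines are ignored by the check, since their second field is not a MAC address.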

While this one VF is in a bad state, all other guests see the same packets as the parent. The only current solution is to reboot the guest; sometimes we have to destroy the guest and start it back up.

idata
Employee
543 Views

Hello Jamesdblea,

Please let us know if you have installed the new expansion card. Any errors seen on the parents? If you have any questions please let us know.

Best regards,
Daniel D
Intel Customer Support
idata
Employee
543 Views

Hello Jamesdblea,

Checking if there are any updates; please let us know. If you have any questions please do not hesitate to ask.

Best regards,
Daniel D
Intel Customer Support
JD8
Beginner
543 Views

Upgrading to QEMU 2.10.0 seems to have fixed our problem. I have had it running solidly on 12 KVM hosts for 2+ weeks.

No more assistance is needed.

idata
Employee
543 Views

Hello Jamesdblea,

Thank you for following up with this. Glad you were able to solve the issue. Let us know if anything else comes up.

Best regards,
Daniel D
Intel Customer Support
zsun5
Beginner
543 Views

Is there an analysis of the root cause?

I am now running into the same problem: Red Hat 7.2 + Intel 82599 + 20 VMs.

When a VM sends a packet with a fake MAC address, packets for the other VMs on the same host are corrupted and lost. The NFS service on the host also has network problems, and ssh is disconnected frequently.

Because NFS on the host also sees frequent interruptions when reading and writing files, I think it is very unlikely that this is related to QEMU.
