BPF cannot match VLAN ID, unique to ixgbe (82599ES) on RHEL/CentOS 6.8

CChor
Beginner

I have an extremely odd situation with BPF expressions and VLAN tagging when using ixgbe devices on CentOS 6 (this is for Network Intrusion Detection software).

I have many other CentOS 6 boxes with igb devices and I'm successfully using complex BPF expressions to match VLAN IDs and port numbers, for example:

vlan and ( ether[14:2] == 0x0005 and port 25 || ether[14:2] == 0x0006 and port 25 )

Which means: look for packets that are VLAN tagged, where the VLAN ID is 5 and the transport protocol port number is 25, or the VLAN ID is 6 and the port is 25. This is necessary because of the way BPF works: each occurrence of "vlan" in the expression shifts all subsequent byte offsets by 4. Hence you cannot construct an expression like (vlan 5 and port 25 || vlan 6 and port 25), because the second half of the expression would only match if traffic for VLAN 6 were encapsulated inside VLAN 5.

Using ether[a:b] statically always looks a bytes deep into the packet (starting at 0) and reads b bytes to match. This is supposed to work regardless of whether "vlan" has occurred in the expression, which makes it a reliable way to find the outer VLAN ID.
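For anyone generating filters like this for several VLANs, here is a minimal bash sketch of the same pattern. The function name build_vlan_port_filter is made up for illustration. The "& 0x0fff" mask is an addition that is not in the filter above: bytes 14-15 of a tagged frame also carry the 3 priority bits and the DEI bit, so masking compares only the 12-bit VLAN ID. Whether the resulting filter matches on a given NIC is a separate question.

```bash
#!/bin/bash
# Hypothetical helper: build a BPF expression that matches one port on any of
# several outer VLAN IDs, using a fixed offset instead of repeating "vlan".
# Usage: build_vlan_port_filter PORT VLAN_ID [VLAN_ID...]
build_vlan_port_filter() {
    local port=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo 'need at least one VLAN ID' >&2
        return 1
    fi
    local clauses=() vid joined
    for vid in "$@"; do
        # ether[14:2] is the tag control field; the mask keeps the VLAN ID bits.
        clauses+=("(ether[14:2] & 0x0fff == ${vid} and port ${port})")
    done
    # Join the clauses with " or ", then strip the leading separator.
    joined=$(printf ' or %s' "${clauses[@]}")
    printf 'vlan and ( %s )\n' "${joined# or }"
}

build_vlan_port_filter 25 5 6
```

The output is a single string, so it needs to be quoted when handed to tcpdump, for example: tcpdump -ni eth1 "$(build_vlan_port_filter 25 5 6)".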

So here's the problem: it doesn't work on ixgbe (at least not in promiscuous mode). When I try to match on ether[15] (the low-order byte of the VLAN ID), it actually matches on byte 19 (the 20th byte), which means it's shifted 4 bytes over. If I try to match on ether[11], the expression returns true only when *both* byte 11 (the 12th) AND byte 15 (the 16th) equal the value, which is totally bizarre. I cannot seem to make a 2-byte pattern match at all (but maybe I just didn't run tcpdump for long enough for that amazing coincidence).

By the way, I can match any other bytes normally with ether[a:b] expressions; it's only bytes 12-15 (the VLAN ethertype and VLAN ID) that behave bizarrely.
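To make those offsets concrete: in an 802.1Q-tagged frame as it appears on the wire, bytes 12-13 hold the tag ethertype 0x8100 and bytes 14-15 hold the tag control field, whose low 12 bits are the VLAN ID. The sketch below decodes those fields from a hex dump. decode_vlan_header is a hypothetical helper, the input is assumed to be the first 18 bytes of a frame (for example, copied from tcpdump -XX output), and only the 0x8100 tag type is recognized.

```bash
#!/bin/bash
# Hypothetical helper: decode the 802.1Q fields from the first 18 bytes of
# an Ethernet frame given as hex (spaces and colons are ignored).
decode_vlan_header() {
    local hex=${1//[: ]/}
    if [ "${#hex}" -lt 36 ]; then
        echo 'need at least 18 bytes of frame header' >&2
        return 1
    fi
    local tpid=${hex:24:4}   # bytes 12-13: ethertype, 8100 when tagged
    local tci=${hex:28:4}    # bytes 14-15: PCP (3 bits), DEI (1 bit), VLAN ID (12 bits)
    if [ "${tpid,,}" = "8100" ]; then
        printf 'tagged vlan_id=%d pcp=%d inner_ethertype=0x%s\n' \
            "$(( 0x$tci & 0x0fff ))" "$(( 0x$tci >> 13 ))" "${hex:32:4}"
    else
        printf 'untagged ethertype=0x%s\n' "$tpid"
    fi
}

decode_vlan_header 'ffffffffffff 001122334455 8100 0005 0800'
# prints: tagged vlan_id=5 pcp=0 inner_ethertype=0x0800
```

If the capture shows the real ethertype (such as 0800) at bytes 12-13 for traffic that is known to be tagged, the tag was removed from the frame data before the dump was taken.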

I strongly suspect this is due to rx_vlan_offload being enabled, but when I try to disable it with ethtool I get:

$ sudo ethtool -K eth1 rx-vlan-offload off
ethtool: bad command line argument(s)
For more information run ethtool -h

Edit: I found that 'ethtool -K eth1 rxvlan off' is the correct command, but disabling the offload didn't change the behavior.
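One way to confirm whether the setting actually took effect is to read back the feature list from ethtool -k (lowercase k). The sketch below filters that list for a single feature. offload_state is a hypothetical helper, and it assumes the "name: on" / "name: off" line format that ethtool prints; eth1 is the interface from the command above.

```bash
#!/bin/bash
# Hypothetical helper: read "ethtool -k" output on stdin and print the state
# (on/off) of one named feature. Returns non-zero if the feature is not listed.
# Intended use: ethtool -k eth1 | offload_state rx-vlan-offload
offload_state() {
    awk -F': *' -v feat="$1" '
        {
            name = $1
            sub(/^[ \t]+/, "", name)    # tolerate indented feature lines
        }
        name == feat {
            split($2, state, " ")       # drop a trailing "[fixed]" marker
            print state[1]
            found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Reading from stdin keeps the helper independent of the hardware, so it can be tried against saved ethtool output as well as a live interface.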

This happens with both ixgbe driver version: 4.2.1-k (shipped with the CentOS kernel package), and also with version: 4.4.6 (built from source).

I found a reference to an extremely similar-sounding bug here: https://sourceforge.net/p/e1000/bugs/375/, but that seems to be for a much earlier version of the driver. This one appears to check for RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(6,1) and enable 802.1P support accordingly:

#if (RHEL_RELEASE_CODE && RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(6,1))
#define HAVE_8021P_SUPPORT
#endif

So two questions:

1. How can I resolve this issue of not being able to use ether[14:2] to match VLAN ID in BPF?

2. Why can't I disable rx_vlan_offload with ethtool? (it does not say [fixed])

Message was edited by: Chort Zero

Someone on Twitter showed me 'rxvlan' is ethtool's shorthand for rx-vlan-offload, but it doesn't make any difference after disabling it.

idata
Employee

Hi Chort,

Which network adapter are you using here? I need to check further on this.

rgds,

wb

CChor
Beginner

Here are the chips:

01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

07:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

07:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

Dell lists it on the BOM as "Intel X520 DP 10Gb DA/SFP+, + I350 DP 1Gb Ethernet, Network Daughter Card (540-BBBB)." I think it is the "Intel® Ethernet Network Daughter Card X520-DA2/I350-T2," or an earlier model of it.

CChor
Beginner

Actually, I think this might be a libpcap bug on CentOS 6.

Edit: rolling back libpcap did not fix it for the 82599, nor did moving forward to the very latest libpcap.

I noticed that on the machine that wasn't working my libpcap was this:

libpcap-1.4.0-4.20130826git2dbcaa1.el6.x86_64

libpcap-devel-1.4.0-4.20130826git2dbcaa1.el6.x86_64

So I tried updating an IDS sensor that was working correctly (one with an igb, I350 card) to those package versions and then I was unable to match VLAN tags correctly.

When I rolled back to these packages:

libpcap-devel-1.4.0-1.20130826git2dbcaa1.el6.x86_64.rpm

libpcap-1.4.0-1.20130826git2dbcaa1.el6.x86_64.rpm

It started working again (for the sensor capturing with I350).

I see in the changelogs for libpcap a bug fix that seems to match the description of this problem:

http://www.tcpdump.org/libpcap-changes.txt

Wednesday Nov. 12, 2014 guy@alum.mit.edu/mcr@sandelman.ca

Summary for 1.7.0 libpcap release

Fix handling of zones for BPF on Solaris

new DLT for ZWAVE

clarifications for read timeouts.

Use BPF extensions in compiled filters, fixing VLAN filters <---

some fixes to compilation without stdint.h

EBUSY can now be returned by SNFv3 code.

Fix the range checks in BPF loads

Various DAG fixes.

Various Linux fixes.

CChor
Beginner

Sadly, installing the older libpcap packages (and restarting associated processes) did not resolve the issue with BPF expressions not working on the ixgbe interface.

I'm kind of at a loss.

CChor
Beginner

Built and installed the latest versions of libpcap and tcpdump; matching with "vlan and ether[14:2]" expressions still doesn't work.

http://www.tcpdump.org/#latest-release

CChor
Beginner

Any update here? We're completely unable to use the 82599 for Network Intrusion Detection as it stands, because it cannot support complex expressions with multiple VLAN ID conditions.

Things that did not work:

1. Disabling rx-vlan-offload

2. Downgrading or upgrading libpcap and pointing tcpdump to use these specific versions

3. Disabling vxlan (using 4.4.6 driver)

idata
Employee

Hi Chort,

Thank you for the additional information. To clarify: are you referring to double (or nested) VLANs? If so, the 82599 doesn't support nested VLANs.

rgds,

wb

CChor
Beginner

We aren't using nested VLANs (assuming you mean encapsulating VLANs inside VLANS), no.

We have a flat set of VLANs that are passing through passive network taps. We're duplicating the aggregated tap output (without any encapsulation or rewriting) to servers with 82599 cards in order to perform traffic inspection. Since this is a whole lot of traffic, we're filtering out a bunch of things that we know we don't have to look at, using BPF expressions (which is the most efficient way for our IDS technology to do it). BPF, as you may be aware, is the same thing tcpdump uses to capture/exclude specific traffic.

Due to a quirk of BPF, you cannot have multiple conditions based on the 'vlan' keyword, because invoking it causes the byte pointer to be shifted by 4 bytes. At that point, if you invoke 'vlan' again it will only match VLAN-in-VLAN traffic (encapsulated VLANs). The way to get around this in BPF and construct complex expressions (such as, in plain language, "match VLAN 5 port 80, OR match VLAN 15 port 8080") is to explicitly read at an offset in the Ethernet frame with 'ether[:]'. This technique is what is not working on the 82599. It works great on the I350, and on every other card I've ever used. We literally copied the same filter rules we're using on I350s and they don't work on 82599s. We get the same behavior when using tcpdump or the IDS software, so it's not a bug in either of those. It seemed possible that it was a bug in libpcap, but I seem to have ruled that out by both downgrading and upgrading libpcap on the box with the 82599s, which made no difference.

idata
Employee

 

Hi Chort,

Thank you for the clarification. I will investigate further.

Rgds,

wb

idata
Employee

Hi Chort,

On further checking, this could be a mistake in the rules you are using. The BPF expression might be wrong, so please check with the BPF maintainer.

thanks,

wb

CChor
Beginner

It's not a mistake with the BPF. The exact same BPF is working perfectly on I350 cards on the same networks. I stated that several times in this thread already. I have done literally hours of troubleshooting on this and documented nearly all of it in this thread. It's frankly insulting to tell me it's a BPF error given the information I've already provided.

This appears to me to be a bug with either the driver, the firmware, or the chip.

idata
Employee

Hi Chort,

Thank you for the clarification.

rgds

wb

CChor
Beginner

Any update? If the 82599 cards cannot do BPF matching on VLAN tags we will have to rip them out and replace with a different vendor.

CChor
Beginner

By the way, I've tried a second server with the same chipset on a different link and I have the same results, i.e. libpcap applications cannot access bytes 12-15 for matching BPF expressions. That rules out a faulty card.

This appears to be an incorrect interaction between this chip/firmware/driver and Linux (I have not tried a different OS, nor am I in a position to do so).

This cannot solely be a Linux / libpcap issue, because this technique works on igb (I350) cards.

CChor
Beginner

It now seems to be working as of kernel version: 2.6.32-642.6.2.el6.centos.plus.x86_64 #1 SMP Wed Oct 26 06:49:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux, after updating /sbin/ifup-local to this:

#!/bin/sh

# ifup-local is called with the interface name as its first argument.
if [ -z "$1" ]; then
    echo 'No device given' >&2
    exit 1
fi

DEVICE=$1

# For Intel 2x10Gb + 2x1Gb cards, we capture from eth{0,1}
if [ "${DEVICE}" = "eth0" ] || [ "${DEVICE}" = "eth1" ]; then
    echo 'Applying settings from /sbin/ifup-local...'
    # Turn off checksum, segmentation/receive, and VLAN offload features.
    for i in rx tx gso lro tso gro rxvlan txvlan rx-vlan-filter; do
        /usr/sbin/ethtool -K "${DEVICE}" "$i" off
    done
    # Hash UDP/IPv4 flows on source/destination IP and port.
    /usr/sbin/ethtool -N "${DEVICE}" rx-flow-hash udp4 sdfn
    /sbin/ifconfig "${DEVICE}" promisc
fi

idata
Employee

Hi Chort,

Thank you for the update; glad it seems to work. We still recommend contacting the libpcap maintainers to verify this further and assist.

rgds,

wb
