Hello,
I am running throughput tests with iperf3 between OpenStack virtual machines with SR-IOV interfaces, and I am getting highly variable results, seemingly at random, every time I start a new iperf3 client. Sometimes I get the maximum throughput I would theoretically expect (compared with native and VirtIO throughput), but other times the results drop to as low as around 1 Gbps. The tests run between the SR-IOV interfaces of two virtual machines over a 10 Gbps link; throughput with VirtIO interfaces over the same link is ~9.39 Gbps. I have six virtual machines, each with one SR-IOV interface, spread across four compute nodes with the same NIC model (Intel Corporation Ethernet Converged Network Adapter X710), and I see the same behavior on all of them.
Here are some excerpts of iperf3 tests, all executed within ~2 minutes (throughput can easily go from 9.39 Gbps to 1 Gbps between consecutive tests, and vice versa):
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 382 MBytes 3.20 Gbits/sec 238 489 KBytes
[ 4] 1.00-2.00 sec 1.05 GBytes 9.04 Gbits/sec 12 629 KBytes
[ 4] 2.00-3.00 sec 1.01 GBytes 8.69 Gbits/sec 0 669 KBytes
[ 4] 3.00-4.00 sec 992 MBytes 8.32 Gbits/sec 2 529 KBytes
[ 4] 4.00-5.00 sec 1.03 GBytes 8.86 Gbits/sec 0 539 KBytes
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 765 MBytes 6.42 Gbits/sec 201 496 KBytes
[ 4] 1.00-2.00 sec 179 MBytes 1.50 Gbits/sec 0 609 KBytes
[ 4] 2.00-3.00 sec 149 MBytes 1.25 Gbits/sec 0 660 KBytes
[ 4] 3.00-4.00 sec 150 MBytes 1.25 Gbits/sec 9 588 KBytes
[ 4] 4.00-5.00 sec 118 MBytes 988 Mbits/sec 8 550 KBytes
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 4.00-5.00 sec 1.09 GBytes 9.39 Gbits/sec 0 747 KBytes
[ 4] 5.00-6.00 sec 1.09 GBytes 9.39 Gbits/sec 0 747 KBytes
[ 4] 6.00-7.00 sec 1.09 GBytes 9.39 Gbits/sec 0 747 KBytes
[ 4] 7.00-8.00 sec 1.09 GBytes 9.38 Gbits/sec 0 1.08 MBytes
[ 4] 8.00-9.00 sec 1.09 GBytes 9.39 Gbits/sec 0 1.08 MBytes
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 400 MBytes 3.36 Gbits/sec 1200 625 KBytes
[ 4] 1.00-2.00 sec 232 MBytes 1.95 Gbits/sec 0 673 KBytes
[ 4] 2.00-3.00 sec 298 MBytes 2.50 Gbits/sec 1 1.58 MBytes
[ 4] 3.00-4.00 sec 449 MBytes 3.76 Gbits/sec 1 3.09 MBytes
[ 4] 4.00-5.00 sec 487 MBytes 4.09 Gbits/sec 6 3.09 MBytes
Compute nodes run Ubuntu 16.04.6 LTS with Linux kernel 4.4.0-170-generic; VMs run Ubuntu 18.04.3 LTS with Linux kernel 4.15.0-72-generic. I use the i40e and i40evf drivers, which according to the man page here (https://docs.oracle.com/cd/E86824_01/html/E54777/i40evf-7d.html) support this NIC at 10G or 40G. I have SR-IOV, IOMMU, and NUMA topology enabled in the BIOS/grub of all compute nodes. The compute nodes are:
Manufacturer: Supermicro
Product Name: SYS-1029P-WTR
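For reference, the grub side of that configuration can be confirmed from the kernel command line; a minimal check (the exact parameters may vary per deployment):
$ cat /proc/cmdline
# should include intel_iommu=on (and typically iommu=pt)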
I provide additional information in the post below.
Here are some things I've tried:
- Using other tools to measure throughput: I got similar behavior with iperf2 and nuttcp.
- Verifying the traffic arriving at the switch during the tests (in case the 10 Gbps switch was somehow the bottleneck): it matches the throughput reported by iperf3.
- `sudo ethtool -s ens5 speed 10000` in the VM: "Cannot set new settings: Operation not supported".
- `sudo ip link set enp94s0f1 vf 4 rate 10000` on the compute node, then sending through that VF: same behavior.
- `sudo tc qdisc add dev ens5 root fq maxrate 10gbit` in the VM: same behavior.
- Disabling TSO in the VM (see the sketch after this list): same behavior. The best throughput results are worse now because I am CPU-bound above ~3 Gbps, but I still get runs of around 1 Gbps.
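For the last item, the offload state was toggled and confirmed roughly as follows (interface name as above):
$ sudo ethtool -K ens5 tso off                       # disable TCP segmentation offload
$ ethtool -k ens5 | grep tcp-segmentation-offload    # should now report "off"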
I would appreciate any suggestions that might solve this issue or help me continue troubleshooting it.
Thanks in advance,
Jorge
1. `modinfo i40e` output in the compute node
filename: /lib/modules/4.4.0-170-generic/kernel/drivers/net/ethernet/intel/i40e/i40e.ko
version: 1.4.25-k
license: GPL
description: Intel(R) Ethernet Connection XL710 Network Driver
author: Intel Corporation, <e1000-devel@lists.sourceforge.net>
srcversion: 208E31F868A400B5E539130
alias: pci:v00008086d00001588sv*sd*bc*sc*i*
alias: pci:v00008086d00001587sv*sd*bc*sc*i*
alias: pci:v00008086d000037D2sv*sd*bc*sc*i*
alias: pci:v00008086d000037D1sv*sd*bc*sc*i*
alias: pci:v00008086d000037D0sv*sd*bc*sc*i*
alias: pci:v00008086d000037CFsv*sd*bc*sc*i*
alias: pci:v00008086d000037CEsv*sd*bc*sc*i*
alias: pci:v00008086d00001587sv*sd*bc*sc*i*
alias: pci:v00008086d00001589sv*sd*bc*sc*i*
alias: pci:v00008086d00001586sv*sd*bc*sc*i*
alias: pci:v00008086d00001585sv*sd*bc*sc*i*
alias: pci:v00008086d00001584sv*sd*bc*sc*i*
alias: pci:v00008086d00001583sv*sd*bc*sc*i*
alias: pci:v00008086d00001581sv*sd*bc*sc*i*
alias: pci:v00008086d00001580sv*sd*bc*sc*i*
alias: pci:v00008086d00001574sv*sd*bc*sc*i*
alias: pci:v00008086d00001572sv*sd*bc*sc*i*
depends: ptp,vxlan
retpoline: Y
intree: Y
vermagic: 4.4.0-170-generic SMP mod_unload modversions
parm: debug:Debug level (0=none,...,16=all) (int)
2. `lspci -vv` output in the compute node
2.1. PF
5e:00.2 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
Subsystem: Intel Corporation Ethernet Converged Network Adapter X710
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 265
Region 0: Memory at c3800000 (64-bit, prefetchable) [size=8M]
Region 3: Memory at c5808000 (64-bit, prefetchable) [size=32K]
Expansion ROM at c5d80000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: i40e
Kernel modules: i40e
2.2. VF (one of them)
5e:06.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 01)
Subsystem: Intel Corporation XL710/X710 Virtual Function
Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: [virtual] Memory at c5400000 (64-bit, prefetchable) [size=64K]
Region 3: [virtual] Memory at c5920000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: i40evf
Kernel modules: i40evf
3. `ip link` output in the compute node
5: enp94s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 3cfdfebe7561 state UP mode DEFAULT group default qlen 10000
link/ether 3c:fd:fe:be:75:61 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 4 MAC fa:16:3e:f7:48:fb, vlan 892, spoof checking on, link-state enable
vf 5 MAC fa:16:3e:7b:a0:11, vlan 891, spoof checking on, link-state enable
vf 6 MAC fa:16:3e:fd:20:a9, vlan 896, spoof checking on, link-state enable
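As an aside on the `vf 4 rate 10000` attempt above: on newer iproute2 releases the per-VF transmit cap is usually expressed as `max_tx_rate` (in Mbps); a sketch of the equivalent command, assuming the same PF and VF numbering:
$ sudo ip link set enp94s0f1 vf 4 max_tx_rate 10000   # cap VF 4 at 10000 Mbps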
4. `modinfo i40evf` output in a virtual machine
filename: /lib/modules/4.15.0-72-generic/kernel/drivers/net/ethernet/intel/i40evf/i40evf.ko
version: 3.0.1-k
license: GPL
description: Intel(R) XL710 X710 Virtual Function Network Driver
author: Intel Corporation, <linux.nics@intel.com>
srcversion: 18F6112020282F659B005E2
alias: pci:v00008086d00001889sv*sd*bc*sc*i*
alias: pci:v00008086d000037CDsv*sd*bc*sc*i*
alias: pci:v00008086d00001571sv*sd*bc*sc*i*
alias: pci:v00008086d0000154Csv*sd*bc*sc*i*
depends:
retpoline: Y
intree: Y
name: i40evf
vermagic: 4.15.0-72-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
5. `lspci -vv` output of SR-IOV interface in a virtual machine
00:05.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 01)
Subsystem: Intel Corporation Ethernet Virtual Function 700 Series
Physical Slot: 5
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: Memory at fe000000 (64-bit, prefetchable) [size=64K]
Region 3: Memory at fe018000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: i40evf
Kernel modules: i40evf
6. `ethtool` output of the SR-IOV interface in a virtual machine
Settings for ens5:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 40000Mb/s
Duplex: Full
Port: None
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
Hi JSasi1,
Thank you for posting on the Intel® Ethernet Communities page.
Please allow us some time to check on this. We will give you an update in 2 to 3 business days.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hi JSasi1,
Thank you for waiting for our update.
After checking your concern, we would like to ask whether you are experiencing the same issue with our latest PF and VF drivers, which can be downloaded here:
PF (physical function) - https://downloadcenter.intel.com/download/24411/Intel-Network-Adapter-Driver-for-PCIe-40-Gigabit-Ethernet-Network-Connections-Under-Linux-?product=36773
VF (virtual function) - https://downloadcenter.intel.com/download/24693/Intel-Network-Adapter-Virtual-Function-Driver-for-Intel-Ethernet-Controller-700-Series?product=36773
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hello AlfredS,
I appreciate the quick response. Unfortunately, I still experience the same behavior. I installed the new PF driver on the compute nodes and loaded it, and did the same in the VMs with the VF driver, removing the i40evf driver and loading the new one. I tried both changing the drivers in existing VMs and spawning two new VMs. However, after running a few iperf3 tests, nothing seems to have changed: throughput still swings within seconds between less than 1 Gbps and 9 Gbps.
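For reference, the VF driver swap on each VM looked roughly like this (a sketch; the out-of-tree package builds the VF driver as the iavf module, as the ethtool output further below shows, and <version> is a placeholder):
$ tar xf iavf-<version>.tar.gz && cd iavf-<version>/src
$ make && sudo make install    # build and install the out-of-tree VF driver
$ sudo rmmod i40evf            # remove the in-tree VF driver
$ sudo modprobe iavf           # load the new driver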
Thanks,
Jorge
Hi JSasi1,
I deeply appreciate your swift reply.
Please allow us some time to further check on this. We will provide you with an update no later than 3 business days from now.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hi JSasi1,
Thank you for patiently waiting for an update.
We are still investigating your concern.
We need additional information that can aid us in further checking your concern. Please provide the following information:
a. Please share the `ethtool -i` output of the X710 adapter on the nodes.
b. Please provide the iperf3 command that you used for testing.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hello,
Thank you for the reply.
The iperf3 command that I used was simply `iperf3 -s <ipv4_addr>` on the server and `iperf3 -c <ipv4_addr>` on the client. I also tried `iperf3 -c <ipv4_addr> -b 10G`, but the results were the same (if not worse).
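For what it's worth, a more controlled run can be sketched like this (standard iperf3 flags: `-t` sets the duration, `-P` the number of parallel streams, and `-A` pins client/server CPU affinity; the address is a placeholder):
$ iperf3 -s                            # server side
$ iperf3 -c <ipv4_addr> -t 30 -P 4     # client: 30-second run, 4 parallel streams
$ iperf3 -c <ipv4_addr> -t 30 -A 2,2   # single stream pinned to CPU 2 on both ends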
The output of `ethtool -i` in the compute node interface is:
driver: i40e
version: 2.10.19.30
firmware-version: 6.01 0x800035ce 1.1747.0
expansion-rom-version:
bus-info: 0000:5e:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
And below is the same output for the SR-IOV interface of the virtual machine, in case it's relevant:
driver: iavf
version: 3.7.61.20
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:00:05.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
Thank you,
Jorge
Hi JSasi1,
Thank you for sharing that information.
Please allow us some time to look into this. We will get back to you no later than 3 to 5 business days.
Best Regards,
Alfred S
Intel® Customer Support Technician
A Contingent Worker at Intel
Hi JSasi1,
Due to the issue's complexity, please allow us more time to investigate your concern.
We will get back to you no later than 3 to 5 business days.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hello,
Thank you for continuing to look into this. Please take the time you need.
Regards,
Jorge
Hi JSasi1,
You are most welcome.
We also need some additional information from you.
Please provide the PBA# found on the 4-in-1 label of your X710 adapter.
A sample 4-in-1 label can be found at the site below, for your reference.
https://www.intel.com/content/www/us/en/support/articles/000007060/network-and-i-o/ethernet-products.html
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hello,
Unfortunately, I don't have physical access to the servers at the moment. I was able to check the Vendor ID/Device ID of the adapters, which is 8086:1572; according to the table here (https://www.intel.co.uk/content/www/uk/en/support/articles/000005612/network-and-i-o/ethernet-products.html), that matches PBA J11367 or J11365.
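For reference, the ID pair can be read without physical access; a minimal sketch:
$ lspci -nn | grep -i ethernet    # the [8086:1572] bracket is the vendor:device ID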
Please let me know if this information is enough.
Thanks,
Jorge
Hi JSasi1,
Thank you for providing this information.
We will continue to investigate your concern. We will update you after 2 to 3 business days.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hi JSasi1,
Thank you for your patience.
The latest NVM update for your adapter has just been released. We suggest trying it out; feel free to update us with any findings.
Looking forward to your reply.
Best Regards,
Crisselle C
Intel® Customer Support
A Contingent Worker at Intel
Hello,
Thanks for the reply. I ran the NVM update tool on all the compute nodes and it completed successfully, but unfortunately I still get the same performance results.
Here's the output of ethtool -i after the NVM update:
$ ethtool -i enp94s0f0
driver: i40e
version: 2.10.19.30
firmware-version: 7.10 0x80006471 1.2527.0
expansion-rom-version:
bus-info: 0000:5e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
Hi JSasi1,
Thank you for carrying out the firmware update.
Since it did not resolve your issue, we will continue to investigate your concern.
We will get back to you no later than 3 to 5 business days from now.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hi JSasi1,
Our engineering team is still looking at this case, and we will get back to you as soon as possible.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel
Hello,
Thanks for the reply. I would like to add a couple of things.
First, when I went to restart one of the physical servers, I noticed the following error messages. I'm not sure whether they could be related to the problem:
kvm: vcpu1 unhandled rdmsr: 0x140
kvm: vcpu1 unhandled rdmsr: 0x64e
kvm: vcpu1 unhandled rdmsr: 0x34
Also, to get more accurate throughput results, I used a traffic generator to test TCP throughput against a virtual machine through an SR-IOV interface (rather than VM to VM as I had done before). The results were similar, however: sometimes I got nearly 9.5 Gbps as expected, but other times it dropped to somewhere between 2 and 6 Gbps.
Thanks,
Jorge
Hi JSasi1,
Thank you for waiting for an update and for the additional information that you have shared.
To investigate further, our team needs the following information (commands for gathering it are sketched below):
- Please provide a dump of the dmesg log, if possible.
- Kindly state the guest VM configuration (# vCPUs, memory, OS, etc.).
- Please also provide the number of VF queues.
- Have you done any additional host driver performance tuning?
- Have you run any benchmarks directly against the host interface?
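The first three items can be gathered with commands along these lines (a sketch; the interface name follows earlier posts in the thread):
$ dmesg -T > dmesg.log    # timestamped kernel log dump
$ ethtool -l ens5         # queue/channel counts of the VF inside the VM
$ lscpu && free -h        # vCPU and memory configuration of the guest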
We will wait for your reply. If we do not hear from you, we will follow up after 3 business days to check whether you need more time.
Hi JSasi1,
Please let us know if you need more time to provide the information we requested.
Should we not get your response, we will reach out to you after three business days.
Best Regards,
Alfred S
Intel® Customer Support
A Contingent Worker at Intel