idata
Community Manager

SR-IOV on NIC 82599: dom0 cannot ping local VM (and vice versa)

Hello all,

I have a server with an 82599 NIC with 2 ports. I have created 31 VFs per port, 62 in total (for some reason I can't get more than this number, but that's not the problem). From each of these ports I pick 1 VF that I assign to my guest VM.

At the guest VM level, I create a bond and a VLAN interface, and that works well for traffic to and from the outside or dom0. No problem.

Okay, now I need to change my MTU to 9K, so I figured I'd better install the latest ixgbe and ixgbevf drivers (on dom0 and the VM respectively). Before changing the MTU, everything still works pretty well. Now I change the MTU on the guest VM; unfortunately I also had to change the MTU of the dom0 PF to at least the same size, so 9K. My first question is: why would I have to change the PF MTU, since the VF should be independent?

Anyway, I change the PF as well as the VF, and I can ping any combination: local VM to local VM, local VM to remote VM, local VM to remote server, etc. But there is one combination that doesn't work: local VM to/from dom0!

I checked the ARP table and sniffed the traffic, and this is what I observed when the VM tried to ping the local dom0:

A preliminary ARP request is sent; vm1 says:

17:14:54.639029 arp who-has dom0-ip-address tell vm1-ip-address

Dom0 sees that request and replies:

17:16:19.884462 arp who-has dom0-ip-address tell vm1-ip-address

17:16:19.884489 arp reply dom0-ip-address is-at 00:1b:21:d9:64:4c (oui Unknown)

Meanwhile, dom0 received the request, so it updates its ARP table:

[root@dom0 ~]# arp -a

? (vm1-ip-address) at 72:b5:06:8f:b4:ae [ether] on sp

However, the reply never seems to reach vm1; its ARP table remains incomplete:

[root@vm1 ~]# arp -a

? (dom0-ip-address) at <incomplete> on bond0

Trying the other way around gives the same result. Forcing the MAC address doesn't help.
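
For anyone repeating this check on several hosts, the unresolved entries can be picked out of the `arp -a` output automatically. This is only a minimal sketch: it assumes the usual net-tools output format, where an unresolved entry shows `<incomplete>` in place of the hardware address, and the helper name `incomplete_entries` and the sample addresses are made up for illustration.

```python
import re

# One line of net-tools `arp -a` output looks like:
#   ? (192.0.2.20) at 72:b5:06:8f:b4:ae [ether] on eth0
#   ? (192.0.2.10) at <incomplete> on bond0
ARP_LINE = re.compile(
    r"^\S+ \((?P<ip>[^)]+)\) at (?P<hw>\S+) (?:.*\s)?on (?P<iface>\S+)$"
)


def incomplete_entries(arp_output):
    """Return (ip, iface) pairs whose hardware address is still unresolved."""
    pending = []
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line.strip())
        if match and match.group("hw") == "<incomplete>":
            pending.append((match.group("ip"), match.group("iface")))
    return pending


# Placeholder addresses from the documentation range, not the real ones.
sample = """\
? (192.0.2.20) at 72:b5:06:8f:b4:ae [ether] on eth0
? (192.0.2.10) at <incomplete> on bond0
"""
print(incomplete_entries(sample))
```

In practice the text passed in would be the captured output of `arp -a` on the VM or on dom0.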

Any idea how to make this work?

And more importantly: is MTU > 1500 even supported for this 82599 NIC?

Thanks

Ray.

5 Replies
Patrick_K_Intel1
Employee

Thanx for posting to our site.

First, let me address the MTU size. On that particular device, the MTU for the PF and the VFs must be the same. In fact, you should be able to change the MTU for the PF; nothing will actually happen when you try to change it on the VF.
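
One way to check what MTU each interface actually ended up with is to read it back from sysfs, on dom0 for the PFs and inside the guest for the VF. The following is a minimal sketch, assuming a Linux system that exposes `/sys/class/net/<iface>/mtu`; the helper names and the interface names in the usage note are hypothetical.

```python
from pathlib import Path


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Read the current MTU of a network interface from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mtu_mismatches(expected, ifaces, sysfs_root="/sys/class/net"):
    """Return {iface: mtu} for every interface whose MTU differs from `expected`."""
    mismatches = {}
    for name in ifaces:
        mtu = read_mtu(name, sysfs_root)
        if mtu != expected:
            mismatches[name] = mtu
    return mismatches
```

For example, `mtu_mismatches(9000, ["eth2", "eth3"])` on dom0 would list any PF that is not at 9000 (the names `eth2` and `eth3` are placeholders). Running the same call in the guest against the VF interface shows whether the two sides agree.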

Not being able to create > 32 VF's sounds a bit odd. I would update your BIOS.

As for Dom0 communicating with a VF: can you try it without any VLAN or bonding in the VM and let me know what you find?

thanx,

Patrick

idata
Community Manager

Hello Patrick,

Thanks for your answer, much appreciated.

About the MTU size, it's a problem for me to set the dom0 PFs to 9000 because that requires changing all our virtual bridges to 9000 as well, and for some reason the VLAN netfront interfaces don't work anymore.

Anyway, refocusing on dom0 not being able to communicate with the local VM: I broke the bond and used a non-VLAN network, and I still get the exact same issue.

Any other idea?

Thanks,

Ray

Patrick_K_Intel1
Employee

I need a bit more information in order to tackle your problem. Can you give me details on the OS you are using (version, kernel, etc.), which exact drivers you are using, how you got them, and how you are compiling them?

thanx,

Patrick

idata
Community Manager

Hi Patrick,

I kind of solved my issue, but not fully. Let me explain.

First, here are the versions I use:

OVS303 with default kernel 2.6.32.21-45xen and the latest (well, almost) Intel ixgbe drivers, 3.12.6 (I built them with rpmbuild).

Guest OS is OEL6.3, kernel 2.6.32-400, with the latest ixgbevf drivers, 2.7.12.

In our architecture we have a bond made of 2 PFs, and each PF has 31 VFs. I assign one VF of each PF to my system and recreate the bond. Each PF is connected to a different switch.

Anyway, when I look at the supposedly inactive switch, its table contains the MAC address of my VF!

I just failed the port over to the active switch and it worked.

I'm wondering if SR-IOV is sometimes asking my switch to fail over...

When I look at dom0, I find these messages: "VF Reset msg received from vf" any time I bring up the VM.

Maybe that's related...

So I know what the problem is, but I don't know its source: what is causing the bonding to switch VF?
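
To see which slave each bond is currently using, the Linux bonding driver's status file can be read on both sides (the PF bond on dom0 and the VF bond in the guest). This is a rough sketch, assuming an active-backup bond whose status is exposed at `/proc/net/bonding/<bond>` with a `Currently Active Slave:` line; the function names are hypothetical.

```python
def active_slave(bonding_status):
    """Extract the active slave name from the text of /proc/net/bonding/<bond>."""
    for line in bonding_status.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Currently Active Slave":
            return value.strip()
    return None  # the line is absent, e.g. in modes without a single active slave


def read_active_slave(bond="bond0", proc_root="/proc/net/bonding"):
    """Read the bonding status file and return the currently active slave."""
    with open(f"{proc_root}/{bond}") as fh:
        return active_slave(fh.read())
```

Comparing the result of `read_active_slave("bond0")` on dom0 with the same call in the guest shows whether the active VF belongs to the active PF or to the other port.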

Any thoughts?

Patrick_K_Intel1
Employee

Aaah, that piece of information is important

Bonding of PF's has some special considerations that must be taken into account. I suggest you take a look at a paper I wrote last year:

/community/wired/blog/2012/06/25/latest-flexible-port-partitioning-paper-is-now-available-learn-about-qos-and-sr-iov http://communities.intel.com/community/wired/blog/2012/06/25/latest-flexible-port-partitioning-paper...

That should shed some light on your challenges.

- Patrick
