Why does the Intel 82576 require MSI-X for SR-IOV

idata
Employee
4,273 Views

To all SR-IOV experts,

I want to use the SR-IOV functionality of Intel's PCI Express 82576 Ethernet NIC on a platform that does not support MSI-X interrupts, only MSI and legacy PCI interrupts.

Why do both the igb and igbvf drivers require MSI-X for I/O virtualization?

The PCI-SIG "Single Root I/O Virtualization and Sharing Specification" indicates that MSI should be sufficient for SR-IOV in general.

Is there a reason Intel decided to use only MSI-X?

Thanks,

Mr. Isfort

 

0 Kudos
1 Solution
Patrick_K_Intel1
Employee
2,567 Views

Thanks for visiting the forums.

The Intel SR-IOV solution requires MSI-X because of the way we chose to architect it. Each VF has its own set of dedicated resources in hardware, including Tx/Rx queues, descriptors, and interrupts. Each VF has 3 interrupts assigned to it: one for Rx, one for Tx, and one for PF/VF communication. On a 4-port Intel 82576 that could be 4 ports * 7 VFs * 3 = 84 interrupts.

If you use one of the Intel 10Gb Ethernet devices, 2 ports would result in up to 378 interrupt vectors, too many for standard MSI interrupts. If they all shared the same interrupt, performance would be severely hampered.
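The vector counts quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and the figure of 63 VFs per port for the 10Gb device is an assumption chosen because it reproduces the 378 total; it is not stated in this post.

```python
def vf_interrupt_vectors(ports: int, vfs_per_port: int, vectors_per_vf: int = 3) -> int:
    """Total VF interrupt vectors on one adapter (Rx + Tx + PF/VF mailbox per VF)."""
    return ports * vfs_per_port * vectors_per_vf


# 82576 figures from the post: 4 ports, 7 VFs per port, 3 vectors per VF.
gigabit_total = vf_interrupt_vectors(ports=4, vfs_per_port=7)

# 10Gb case: 63 VFs per port is an ASSUMPTION that reproduces the quoted 378.
ten_gig_total = vf_interrupt_vectors(ports=2, vfs_per_port=63)

print(gigabit_total, ten_gig_total)
```

These are adapter-wide totals across all functions, which is a different quantity from the number of vectors a single PF or VF has to request.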

I hope that answers your question.

I am extremely surprised that you have found a server that supports SR-IOV but not MSI-X interrupts; I have never encountered such a system. Would you mind sharing which server and model you have, for future reference?

Thanks,

Patrick

0 Kudos
7 Replies
idata
Employee
2,567 Views

Thank you Patrick for the insightful answer. I think this perfectly closes the topic.

I am trying to use the Intel 82576EB with a non-Intel embedded platform; if you are interested, I can send you further details over a private channel.

For my research, two virtual functions on one physical port will be sufficient, which shouldn't be a problem with MSI interrupts.

Since this is then purely a software problem, I'll try to retrofit MSI-based SR-IOV support into the igb and igbvf drivers. Do you have any idea whether that is feasible?

Greetings,

Mr. Isfort

0 Kudos
Patrick_K_Intel1
Employee
2,567 Views

Sounds interesting. I always like to hear what interesting things SR-IOV is used for!

I have had a customer or two rework our drivers, which are of course open source, to use MSI. I think they also made them polling rather than interrupt driven, though it has been a while and I don't recall for sure.

The only guidance I can give is my SR-IOV Toolkit, which includes a driver companion.

http://communities.intel.com/community/wired/blog/2010/06/09/announcing-the-intel-ethernet-sr-iov-toolkit-v11

Download that and the latest source, then go have some fun. I would love to hear from you via the blog's private message capability when you are done, and to learn about what you are doing.

Best of luck!

- Patrick

0 Kudos
idata
Employee
2,567 Views

Dear Patrick,

I still have one last question regarding the limitations of MSI in SR-IOV in general.

You said that the accumulated interrupt count is too much for MSI, e.g. 4(ports) * (7VFs)*3 =84 interrupts.

However, according to the SR-IOV Specification 1.1, each PF and each VF has its own MSI Capability or MSI-X Capability, so each may request up to 32 MSI interrupts. 32 interrupt vectors per VF should be more than sufficient for the 82576. What I don't understand is why the 2048 interrupt vectors that MSI-X offers would be required for each VF.
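To make the per-function limits concrete, here is a minimal Python sketch that decodes the vector counts from the two Message Control registers. The bit positions follow the PCI MSI and MSI-X capability layouts (Multiple Message Capable in bits 3:1, MSI-X Table Size in bits 10:0, encoded as N-1); the function names and the example register values are hypothetical and were not read from an 82576.

```python
MSI_MAX_VECTORS = 32     # per-function ceiling for MSI
MSIX_MAX_VECTORS = 2048  # per-function ceiling for MSI-X


def msi_vectors_supported(message_control: int) -> int:
    """Decode Multiple Message Capable (bits 3:1) of the MSI Message Control register."""
    mmc = (message_control >> 1) & 0x7
    if mmc > 5:
        raise ValueError("reserved Multiple Message Capable encoding")
    return 1 << mmc  # 1, 2, 4, 8, 16 or 32


def msix_table_size(message_control: int) -> int:
    """Decode Table Size (bits 10:0) of the MSI-X Message Control register."""
    return (message_control & 0x7FF) + 1


def msi_vectors_to_request(needed: int) -> int:
    """MSI vectors are granted in powers of two, so round the need upward."""
    if not 1 <= needed <= MSI_MAX_VECTORS:
        raise ValueError("need does not fit in a single MSI capability")
    return 1 << (needed - 1).bit_length()


# A VF that needs Rx, Tx and mailbox vectors would have to ask for 4 MSI vectors.
print(msi_vectors_to_request(3))
# Hypothetical register values, for illustration only.
print(msi_vectors_supported(0x000A), msix_table_size(0x0002))
```

Under these rules a three-vector VF fits comfortably within MSI on paper, which is the point the question is making.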

Is there perhaps another hardware or software limitation that might prevent the use of MSI, such as the interrupt controller or Intel VT-d?

Or did Intel simply decide to use MSI-X for reasons other than incompatibility?

Thanks again and with kind regards,

Mr. Isfort

0 Kudos
Patrick_K_Intel1
Employee
2,567 Views

There is no limitation; it is just the way we chose to implement the functionality. According to the engineer who architected and wrote the drivers, it was simply an easier and cleaner solution to implement.

0 Kudos
idata
Employee
2,567 Views

Hi Patrick,

I seem to have such a board: it works with SR-IOV but _seems_ not to support MSI-X (although I'm not 100% sure about this).

It's the Intel DQ77MK. I have to use the pci=assign-busses kernel parameter to enable the virtual functions, so it seems that the BIOS isn't written for SR-IOV.

I use the 82576 NIC, and lspci tells me that the bridge the NIC is connected to doesn't have MSI-X capabilities:

00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
    Latency: 0, Cache Line Size: 64 bytes
    Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
    I/O behind bridge: 0000f000-00000fff
    Memory behind bridge: fff00000-000fffff
    Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
    BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
        PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
        LnkCap: Port # 1, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <1us, L1 <16us
            ClockPM- Surprise- LLActRep+ BwNot-
        LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
            Slot # 0, PowerLimit 25.000W; Interlock- NoCompl+
        SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
            Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
        SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
            Changed: MRL- PresDet- LinkState-
        RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
        RootCap: CRSVisible-
        RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
        Address: 00000000 Data: 0000
    Capabilities: [90] Subsystem: Intel Corporation Device 2035
    Capabilities: [a0] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: pcieport
    Kernel modules: shpchp
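One caveat when reading a listing like this: it describes the root port (00:1c.0), not the NIC. MSI and MSI-X capabilities are advertised per PCI function, so the capability list that matters for the igb driver is the NIC's own. The Python sketch below groups the "Capabilities:" lines of `lspci -vv` text by device so each function can be checked separately. The helper names are made up for illustration, and the sample reuses only lines quoted in this post.

```python
import re
from collections import defaultdict

# Device header such as "00:1c.0 PCI bridge: ..." (optional 4-digit domain prefix).
DEVICE_RE = re.compile(r"^(?:[0-9a-f]{4}:)?([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s", re.IGNORECASE)
# Capability line such as "Capabilities: [80] MSI: Enable- ...".
CAP_RE = re.compile(r"Capabilities: \[[0-9a-f]+(?: v\d+)?\] ([^:,(]+)", re.IGNORECASE)


def capabilities_by_device(lspci_text: str) -> dict:
    """Map each bus:device.function address to the capability names listed under it."""
    caps = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        cap = CAP_RE.search(line)
        if cap and current is not None:
            caps[current].append(cap.group(1).strip())
    return dict(caps)


def has_msix(caps: dict, address: str) -> bool:
    """True if the given function advertises an MSI-X capability."""
    return "MSI-X" in caps.get(address, [])


# Excerpt of the root-port listing from this post.
SAMPLE = """\
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4) (prog-if 00 [Normal decode])
    Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Intel Corporation Device 2035
    Capabilities: [a0] Power Management version 2
"""

caps = capabilities_by_device(SAMPLE)
print(caps["00:1c.0"], has_msix(caps, "00:1c.0"))
```

Feeding the same function the `lspci -vv` output for the NIC itself would show whether the endpoint advertises MSI-X, independently of what the bridge lists.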

However, the Linux kernel (Xen and non-Xen alike) tells me this for the physical functions:

[ 1.652766] igb: Intel(R) Gigabit Ethernet Network Driver - version 4.0.1-k
[ 1.652770] igb: Copyright (c) 2007-2012 Intel Corporation.
[ 1.653111] igb 0000:01:00.0: >irq 45 for MSI/MSI-X
[ 1.653118] igb 0000:01:00.0: >irq 46 for MSI/MSI-X
[ 1.653124] igb 0000:01:00.0: >irq 47 for MSI/MSI-X
[ 1.757217] igb 0000:01:00.0: >7 VFs allocated
[ 1.947724] igb 0000:01:00.0: >Intel(R) Gigabit Ethernet Network Connection
[ 1.947809] igb 0000:01:00.0: >Using MSI-X interrupts. 2 rx queue(s), 1 tx queue(s)
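The interrupt mode the driver settled on can be pulled out of such a log mechanically. The following Python sketch assumes the message wording shown in the lines above; the function name and the shape of the returned dictionary are hypothetical.

```python
import re

# "Using MSI-X interrupts. 2 rx queue(s), 1 tx queue(s)"; MSI-X is tried before MSI.
MODE_RE = re.compile(
    r"Using (MSI-X|MSI|legacy) interrupts\. (\d+) rx queue\(s\), (\d+) tx queue\(s\)"
)
IRQ_RE = re.compile(r"irq (\d+) for MSI/MSI-X")


def summarize_igb_log(log_text: str) -> dict:
    """Collect the allocated IRQ numbers and the reported interrupt mode."""
    irqs = [int(n) for n in IRQ_RE.findall(log_text)]
    mode = MODE_RE.search(log_text)
    return {
        "irqs": irqs,
        "mode": mode.group(1) if mode else None,
        "rx_queues": int(mode.group(2)) if mode else None,
        "tx_queues": int(mode.group(3)) if mode else None,
    }


# Lines copied from the kernel log in this post.
LOG = """\
[ 1.653111] igb 0000:01:00.0: >irq 45 for MSI/MSI-X
[ 1.653118] igb 0000:01:00.0: >irq 46 for MSI/MSI-X
[ 1.653124] igb 0000:01:00.0: >irq 47 for MSI/MSI-X
[ 1.947809] igb 0000:01:00.0: >Using MSI-X interrupts. 2 rx queue(s), 1 tx queue(s)
"""

summary = summarize_igb_log(LOG)
print(summary)
```

For this log the summary reports three allocated IRQs and MSI-X mode on the physical function at 0000:01:00.0.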

That doesn't make sense given the lspci output above.

Furthermore, if I use Xen and attach the physical function to an HVM Linux guest, qemu tells me MSI-X is used for the pass-through.

However, if I pass through a VF, it tells me only legacy interrupts are available.

It's exactly the same as reported in this thread: "[Xen-devel] SR-IOV problems - HVM cannot access network", http://old-list-archives.xen.org/archives/html/xen-devel/2011-03/msg00063.html

The only difference is that the suggested solution (acpi=0) does not work for me.

I'm confused by all the mixed signals I'm receiving from my setup.

Is the Q77 MSI-X capable or is it not? I have not been able to find an answer yet.

Best regards,

Andre

0 Kudos
Patrick_K_Intel1
Employee
2,567 Views

Hi,

The Intel DQ77MK is a desktop board. SR-IOV is classified as a server technology. My guess is that the BIOS on the Intel DQ77MK does not support SR-IOV.

The BIOS must support both VT-d (which it does) and SR-IOV. On most servers you only see a VT-d option in the BIOS; you enable VT-d and you get SR-IOV by default. However, client systems seem to have a VT-d option that does not support SR-IOV.

Not being an expert in either the desktop or server boards themselves, I am not 100% sure about this; however, I am 92.3% confident I am likely correct. :-)

Hopefully somebody from the desktop group will see this post and reply.

-Patrick

0 Kudos