
VPP (fd.io) stops working when using Intel E810 NIC and it works when using Intel X710 NIC

e810user
Beginner

Hello, everyone:

 

We built a 3GPP PGW-U/UPF on top of VPP’s plugin system on Ubuntu 22.04.

We have one instance that uses HPE Ethernet 10Gb 2-port 562SFP+ Adapter (Intel X710 chip) on bare metal, and it works.

We have another instance with HPE Ethernet 10Gb 2-port 562SFP+ on VMware (passthrough mode), and it also works.

We have a third instance with Intel E810-XXV-4 on bare metal that stops working some time after starting to receive traffic (it works for a while).

The behaviour after breakdown is:

  • VPP doesn’t process any RX traffic. It doesn’t even respond to ARP.
  • It is able to transmit. At least we’re able to send ARP through the arping plugin, and we see the traffic on the destination. But VPP says it doesn’t receive any ARP responses, even if we see them at the other end. That’s why, if we had to guess, we’d say reception is what breaks down.
  • We can recover the system by bringing down the Gn-U/Gp-U/S5-U/S8-U/N3 interface and bringing it up again. Sometimes it is not enough and we have to do the same with the Gi/SGi/N6 interface.
  • Sometimes it recovers after a while, only to keep working for some time and stop working again.
  • Since we updated the kernel to Ubuntu’s HWE branch (6.5.0-41), vppctl no longer reflects the correct state of the interface. If you execute “set interface state <interface> down”, no error is shown, yet “show hardware-interfaces” still lists the interface as up. It apparently is down, though, because after executing “set interface state <interface> up” the system begins to respond to ARP again some time later. Before the kernel upgrade, vppctl reported the interface as down after “set interface state <interface> down”.
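
The down/check/up/check sequence from the last bullet can be scripted so that every recovery attempt is recorded the same way. What follows is only a minimal sketch in Python, assuming `vppctl` is on the PATH; the function names and the `pause_s` parameter are made up for illustration, and the interface name has to be supplied by the caller.

```python
import subprocess
import time


def bounce_commands(interface):
    """Return the vppctl invocations for a down/check/up/check cycle."""
    return [
        ["vppctl", "set", "interface", "state", interface, "down"],
        ["vppctl", "show", "hardware-interfaces"],
        ["vppctl", "set", "interface", "state", interface, "up"],
        ["vppctl", "show", "hardware-interfaces"],
    ]


def bounce(interface, pause_s=1.0, run=subprocess.run):
    """Run the cycle and return (command, output) pairs for later comparison.

    `run` is injectable so the sequence can be dry-run without VPP installed.
    """
    results = []
    for cmd in bounce_commands(interface):
        done = run(cmd, capture_output=True, text=True)
        results.append((" ".join(cmd), done.stdout))
        time.sleep(pause_s)  # hypothetical settle time between steps
    return results
```

Comparing the two “show hardware-interfaces” outputs from one run makes it easy to see whether the reported state changed at all after the “down” command.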

We thought Intel E810 ought to be the culprit (it is the obvious difference among the different instances), so we updated the NIC’s firmware (non-volatile memory) to the latest version (4.50 hex).
We also upgraded Ubuntu 22.04’s kernel to 6.5.0-41 (HWE) to have an almost state-of-the-art version. We tried going further with a mainline kernel, 6.9.3, but it failed to install, so we thought it wasn’t a good idea (not stable enough).

Neither of those actions brought any improvement.

 

Has anyone experienced a similar problem?
HP Gen10 Server, Ubuntu 22.04, Linux kernel 6.5.0-41-generic, VPP 24.02.0-8 and Intel E810 should be a common combo, shouldn’t it?

Before touching the C code of VPP’s basic infrastructure to add logs, or stepping through it in gdb, we were wondering whether there is a better way to pursue this issue.

The same version of our plugins works on Intel X710 and we can’t upgrade the E810 any further, so we have hit a dead end.

 

Thank you very much,

 

José Manuel

5 Replies
IntelSupport
Community Manager

Hello José,

Greetings from Intel!

Thank you for choosing Intel products. We have received your query and kindly ask for some time to consult with our internal team. We will get back to you shortly.

Thank you for your understanding.

Regards,

Amina

Intel Server Support



Arun_Intel1
Employee

Hi E810user,


Thank you for your patience!


Please share the dmesg log from around the time of the failure.

Also share the output of the following commands:

"uname -a"

"modinfo ice"


Best Regards

Arun_Intel


IntelSupport
Community Manager

Hello José,

Greetings from Intel!


This is a reminder regarding the issue you reported to us.

 

We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide. 

Could you please provide the dmesg log from around the time of the failure, along with the output of "uname -a" and "modinfo ice"?


Please feel free to respond to this email at your earliest convenience. 

 

Best regards, 

Amina 

Intel Server Support 

Intel.com/vroc 



e810user
Beginner

Hi,

 

We're currently struggling with a new problem: the VPP process breaks down after 14 hours of routing traffic.
We've gathered a core dump and analysed it with gdb.

I'm having trouble sending the reply as text (the form rejects certain characters), so I will send the data as attachments:
  - uname.txt: Result of uname -a.

  - modinfo.txt: Result of modinfo ice.

  - dmesg.txt: Result of dmesg for the latest failed execution.

  - gdb.txt: Result of bt in gdb for the core dump.
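
For anyone who needs to gather the same four files, this is roughly how it could be automated. It is only a sketch in Python: the function names are hypothetical, the paths to the VPP binary and the core file are assumptions that must be supplied by the caller, and `dmesg` may need root privileges on some systems.

```python
import subprocess
from pathlib import Path


def diagnostic_commands(vpp_binary=None, core_file=None):
    """Map each attachment name to the command that produces it."""
    commands = {
        "uname.txt": ["uname", "-a"],
        "modinfo.txt": ["modinfo", "ice"],
        "dmesg.txt": ["dmesg"],
    }
    # The backtrace is only possible when both paths are known (assumed inputs).
    if vpp_binary and core_file:
        commands["gdb.txt"] = ["gdb", "-batch", "-ex", "bt", vpp_binary, core_file]
    return commands


def collect(out_dir, vpp_binary=None, core_file=None, run=subprocess.run):
    """Run every command and write its stdout and stderr to out_dir/<name>."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, cmd in diagnostic_commands(vpp_binary, core_file).items():
        done = run(cmd, capture_output=True, text=True)
        (out / name).write_text(done.stdout + done.stderr)
        written.append(name)
    return written
```

Running `collect` right after a failure keeps the dmesg output close to the time of the event, which is what was asked for above.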

As soon as we reproduce the original problem, I will supplement this post.


Thank you very much.

IntelSupport
Community Manager

Hello José,

Greetings from Intel!

Thank you for sharing the output. Kindly allow us some time to discuss it with our internal team. We will get back to you shortly.

Thank you for your understanding.

Regards,

Amina

Intel Server Support

