VPP (fd.io) stops working when using Intel E810 NIC and it works when using Intel X710 NIC

e810user
Beginner
1,419 Views

Hello, everyone:

 

We built a 3GPP PGW-U/UPF on top of VPP’s plugin system on Ubuntu 22.04.

We have one instance that uses HPE Ethernet 10Gb 2-port 562SFP+ Adapter (Intel X710 chip) on bare metal, and it works.

We have another instance with HPE Ethernet 10Gb 2-port 562SFP+ on VMware (passthrough mode), and it also works.

We have a third instance with Intel E810-XXV-4 on bare metal that stops working some time after starting to receive traffic (it works for a while).

The behaviour after breakdown is:

  • VPP doesn’t process any RX traffic. It doesn’t even respond to ARP.
  • It is able to transmit. At least we’re able to send ARP through the arping plugin, and we see the traffic on the destination. But VPP says it doesn’t receive any ARP responses, even if we see them at the other end. That’s why, if we had to guess, we’d say reception is what breaks down.
  • We can recover the system by bringing down the Gn-U/Gp-U/S5-U/S8-U/N3 interface and bringing it up again. Sometimes it is not enough and we have to do the same with the Gi/SGi/N6 interface.
  • Sometimes it recovers after a while, only to keep working for some time and stop working again.
  • Since we updated the kernel to Ubuntu’s HWE branch (6.5.0-41), vppctl doesn’t reflect the correct state of the interface. If you execute “set interface state <interface> down”, no error is shown, yet “show hardware-interfaces” still reports the interface as up. It apparently is down, though, because after you execute “set interface state <interface> up” the system begins to respond to ARP again some time later. Before the kernel upgrade, vppctl reported the interface as down after “set interface state <interface> down”.
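
For readers who want to script the down/up recovery described above, here is a minimal sketch. It only wraps the two vppctl commands quoted in this post (“set interface state” and “show hardware-interfaces”); the function names and the idea of capturing the output between the two state changes are assumptions for illustration, not something the original poster ran.

```python
import subprocess


def flap_commands(interface):
    """Return the vppctl invocations for one down/up cycle of an interface."""
    return [
        ["vppctl", "set", "interface", "state", interface, "down"],
        # Capture what VPP reports while the interface should be down.
        ["vppctl", "show", "hardware-interfaces"],
        ["vppctl", "set", "interface", "state", interface, "up"],
    ]


def flap(interface, run=subprocess.run):
    """Run the cycle and return the stdout of each command, in order.

    `run` is injectable so the helper can be exercised without a VPP instance.
    """
    outputs = []
    for cmd in flap_commands(interface):
        result = run(cmd, capture_output=True, text=True, check=False)
        outputs.append(result.stdout)
    return outputs
```

Calling `flap("<interface>")` with the real interface name returns three strings; the middle one is the “show hardware-interfaces” output, which makes it easy to record whether VPP reported the interface as down.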

We thought Intel E810 ought to be the culprit (it is the obvious difference among the different instances), so we updated the NIC’s firmware (non-volatile memory) to the latest version (4.50 hex).
And we also upgraded Ubuntu 22.04’s kernel to 6.5.0-41 (HWE) to have an almost state-of-the-art version (we tried going further with a mainline kernel version, 6.9.3, but it failed to install, so we thought it wasn’t a good idea; not stable enough).

No improvements with any of those actions.

 

Has anyone experienced a similar problem?
HP Gen10 Server, Ubuntu 22.04, Linux kernel 6.5.0-41-generic, VPP 24.02.0-8 and Intel E810 should be a common combo, shouldn’t it?

Before touching the C code of VPP’s basic infrastructure for adding logs or executing step by step on gdb, we were wondering whether there is a better way to pursue this issue.

The fact that the same version of our plugins works on Intel X710 and we can’t upgrade E810 any further makes us hit a dead end.

 

Thank you very much,

 

José Manuel

9 Replies
IntelSupport
Community Manager
1,350 Views

Hello José,

Greetings from Intel!

Thank you for choosing Intel products. We have received your query and kindly ask for some time to consult with our internal team. We will get back to you shortly.

Thank you for your understanding.

Regards,

Amina

Intel Server Support



0 Kudos
Arun_Intel1
Employee
1,266 Views

Hi E810user,


Thank you for your patience!


Please share the dmesg log from around the time of the failure.

Also share the output of the following commands:

"uname -a"

and "modinfo ice"


Best Regards

Arun_Intel


0 Kudos
IntelSupport
Community Manager
1,208 Views

Hello José,

Greetings from Intel!


This is a reminder mail regarding the issue you reported to us. 

 

We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide. 

Could you please provide us with the dmesg log from around the time of the failure?

Also, the output of "uname -a" and "modinfo ice".


Please feel free to respond to this email at your earliest convenience. 

 

Best regards, 

Amina 

Intel Server Support 

Intel.com/vroc 



0 Kudos
e810user
Beginner
1,199 Views

Hi,

 

We're currently struggling with a new problem: the VPP process breaks down after 14 hours of routing traffic.
We've gathered a core dump and analysed it with gdb.

I'm having trouble sending the reply as text (the form rejects certain characters), so I will send the data as attachments:
  - uname.txt: Result of uname -a.

  - modinfo.txt: Result of modinfo ice.

  - dmesg.txt: Result of dmesg for the latest failed execution.

  - gdb.txt: Result of bt in gdb for the core dump.
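
The first three attachments could be gathered in one go with a small script like the sketch below. The file names and commands are the ones listed above; the `collect` helper and its parameters are hypothetical, and the gdb backtrace is left out because it depends on the specific core file. Note that dmesg may need root privileges on some systems.

```python
import subprocess
from pathlib import Path

# File name -> command, matching the attachments listed in the post.
COMMANDS = {
    "uname.txt": ["uname", "-a"],
    "modinfo.txt": ["modinfo", "ice"],
    "dmesg.txt": ["dmesg"],
}


def collect(out_dir, commands=COMMANDS, run=subprocess.run):
    """Run each command and save its output under out_dir.

    Returns the list of files written. `run` is injectable for testing.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, cmd in commands.items():
        result = run(cmd, capture_output=True, text=True, check=False)
        path = out / filename
        # Keep stderr too, so a permission error is visible in the file.
        path.write_text(result.stdout + result.stderr)
        written.append(path)
    return written
```

Running `collect("diagnostics")` right after a failure keeps the three outputs from the same moment together.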

Whenever we reproduce the former problem, I will supplement the post.


Thank you very much.

0 Kudos
IntelSupport
Community Manager
1,161 Views

Hello José,

Greetings from Intel!

Thank you for sharing the output. Kindly allow us some time to discuss it with our internal team. We will get back to you shortly.

Thank you for your understanding.

Regards,

Amina

Intel Server Support


0 Kudos
pujeeth
Employee
498 Views

Hello Jose,


Greetings from Intel!

Hope you are doing well.


Could you please provide additional information that would be helpful for debugging, specifically regarding the DPDK version in use?


We have noted that the E810 firmware was updated to version 4.5. Can you also confirm whether you are using the ice driver and the default DDP version listed in the DPDK recommended matching list?

https://doc.dpdk.org/guides/nics/ice.html#:~:text=28.3.-,Kernel%20driver%2C%20DDP%20and%20Firmware%20Matching%20List,-It%20is%20highly
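
To compare the installed driver against that matching list, the "modinfo ice" output requested earlier in the thread can be split into fields. The sketch below is a hedged illustration: `parse_modinfo` is a hypothetical helper that relies only on modinfo's "key: value" line layout, and which keys appear (for example a version or firmware entry) depends on the installed driver build. The DPDK version itself should be obtainable from VPP's DPDK plugin, which as far as we know offers "show dpdk version" through vppctl.

```python
def parse_modinfo(text):
    """Parse `modinfo <module>` output into {key: [values, ...]}.

    Keys such as alias, parm, or firmware can repeat, so every key
    maps to a list. Only the first colon separates key from value,
    which keeps values that contain colons intact.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields
```

Feeding it the contents of modinfo.txt and looking up the keys of interest gives the values to check against the table in the DPDK guide.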


Regards

Pujeeth_Intel


0 Kudos
Amina_Sadiya
Employee
453 Views

Hello Team, 

  

Greetings from Intel! 

 

This is a reminder mail regarding the issue you reported to us. 

 

We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide. 

 

Please feel free to respond to this email at your earliest convenience. 

 

Best regards,  

Amina  

Intel Server Support  

Intel.com/vroc  

 


0 Kudos
Amina_Sadiya
Employee
395 Views

Hello Team, 

 

Thank you for contacting Intel. 

 

This is the third follow-up regarding the reported issue. We're committed to ensuring a swift resolution and would greatly appreciate any updates or additional information you can provide. 

 

As we have not heard back from you, we'll assume the issue has been resolved and will proceed to close the case. 

 

Please feel free to respond to this email at your earliest convenience. 

 

Regards, 

Amina 

Intel Server Support  

Intel.com/vroc  

 


0 Kudos
e810user
Beginner
383 Views

Hi, Amina:

 

The problem turned from unresponsiveness into process breakdown.

It then disappeared when we changed two non-obvious parameters:
- We increased the amount of available huge pages in Ubuntu, way beyond the actual amount used by VPP.

- We reduced the number of worker threads to 4, even if the CPU has plenty of free cores.
Even though the solution is unsettling, we have moved on to more urgent issues.
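
For anyone trying the same workaround, the huge page headroom can be checked from /proc/meminfo. The following is a minimal sketch assuming the standard Linux field names (HugePages_Total, HugePages_Free, Hugepagesize); the helper names are hypothetical. The worker count is normally set in the cpu section of VPP's startup.conf, but how it was changed here is not stated in the thread.

```python
def hugepage_stats(meminfo_text):
    """Extract the huge page counters from /proc/meminfo-style text.

    Returns a dict of integers; Hugepagesize is in kB as reported
    by the kernel, the HugePages_* entries are page counts.
    """
    stats = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and (key.startswith("HugePages_") or key == "Hugepagesize"):
            stats[key] = int(rest.split()[0])
    return stats


def read_hugepage_stats(path="/proc/meminfo"):
    """Read the counters from the running system."""
    with open(path) as f:
        return hugepage_stats(f.read())
```

Comparing HugePages_Free before and after starting VPP shows how many pages it actually takes, and therefore how far beyond that the configured total goes.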

 

Thank you very much,

José Manuel

0 Kudos