Hello, everyone:
We built a 3GPP PGW-U/UPF on top of VPP’s plugin system on Ubuntu 22.04.
We have one instance that uses HPE Ethernet 10Gb 2-port 562SFP+ Adapter (Intel X710 chip) on bare metal, and it works.
We have another instance with HPE Ethernet 10Gb 2-port 562SFP+ on VMware (passthrough mode), and it also works.
We have a third instance with Intel E810-XXV-4 on bare metal that stops working some time after starting to receive traffic (it works for a while).
The behaviour after breakdown is:
- VPP doesn’t process any RX traffic. It doesn’t even respond to ARP.
- It is able to transmit. At least we’re able to send ARP through the arping plugin, and we see the traffic on the destination. But VPP says it doesn’t receive any ARP responses, even if we see them at the other end. That’s why, if we had to guess, we’d say reception is what breaks down.
- We can recover the system by bringing down the Gn-U/Gp-U/S5-U/S8-U/N3 interface and bringing it up again. Sometimes it is not enough and we have to do the same with the Gi/SGi/N6 interface.
- Sometimes it recovers after a while, only to keep working for some time and stop working again.
- Since we updated the kernel to Ubuntu’s HWE branch (6.5.0-41), vppctl no longer reflects the actual state of the interface. Running “set interface state <interface> down” shows no error, yet “show hardware-interfaces” still reports the interface as up. It does appear to go down, though, because after “set interface state <interface> up” the system begins to respond to ARP again some time later. Before the kernel upgrade, vppctl reported the interface as down after “set interface state <interface> down”.
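The recovery step described in the list above (interface down, check, interface up, check) can be sketched as a small script. This is only an illustration: the helper names `bounce_commands` and `bounce` and the settle delay are assumptions, and the only VPP commands used are the two quoted in the post.

```python
import subprocess
import time


def bounce_commands(interface):
    """Build the vppctl calls that take an interface down and up again.

    Only the two CLI commands quoted in the post are used; the interface
    name is whatever VPP calls the N3 or N6 port on the affected host.
    """
    return [
        ["vppctl", "set", "interface", "state", interface, "down"],
        ["vppctl", "show", "hardware-interfaces"],
        ["vppctl", "set", "interface", "state", interface, "up"],
        ["vppctl", "show", "hardware-interfaces"],
    ]


def bounce(interface, settle_seconds=2.0, run=subprocess.run):
    """Run the bounce sequence and return (command, stdout) pairs.

    settle_seconds is an arbitrary pause (an assumption, not a VPP
    requirement) so each state change has time to show up in the output.
    """
    outputs = []
    for cmd in bounce_commands(interface):
        result = run(cmd, capture_output=True, text=True, check=False)
        outputs.append((" ".join(cmd), result.stdout))
        time.sleep(settle_seconds)
    return outputs
```

Calling `bounce("<interface>")` and saving the returned outputs keeps a record of what `show hardware-interfaces` reported after each state change, which makes the mismatch described above easier to document.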
We thought Intel E810 ought to be the culprit (it is the obvious difference among the different instances), so we updated the NIC’s firmware (non-volatile memory) to the latest version (4.50 hex).
We also upgraded Ubuntu 22.04’s kernel to 6.5.0-41 (HWE) to get an almost state-of-the-art version (we tried going further with a mainline kernel, 6.9.3, but it failed to install, so we decided it was not stable enough to be a good idea).
Neither action brought any improvement.
Has anyone experienced a similar problem?
HP Gen10 Server, Ubuntu 22.04, Linux kernel 6.5.0-41-generic, VPP 24.02.0-8 and Intel E810 should be a common combo, shouldn’t it?
Before modifying the C code of VPP’s core infrastructure to add logging, or stepping through it in gdb, we were wondering whether there is a better way to pursue this issue.
The fact that the same version of our plugins works on Intel X710 and we can’t upgrade E810 any further makes us hit a dead end.
Thank you very much,
José Manuel
Hello José,
Greetings from Intel!
Thank you for choosing Intel products. We have received your query and kindly ask for some time to consult with our internal team. We will get back to you shortly.
Thank you for your understanding.
Regards,
Amina
Intel Server Support
Hi E810user,
Thank you for your patience!
Please provide the dmesg log surrounding the time of the failure.
Please also share the output of the following commands:
"uname -a"
and "modinfo ice"
Best Regards
Arun_Intel
Hello José,
Greetings from Intel!
This is a reminder mail regarding the issue you reported to us.
We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide.
Could you please provide us with the dmesg log surrounding the time of the failure,
as well as the output of "uname -a" and "modinfo ice"?
Please feel free to respond to this email at your earliest convenience.
Best regards,
Amina
Intel Server Support
Intel.com/vroc
Hi,
We're currently struggling with a new problem: the VPP process crashes after 14 hours of routing traffic.
We've gathered a core dump and analysed it with gdb.
I'm having trouble sending the reply as text (the form rejects certain characters), so I will send the data as attachments:
- uname.txt: Result of uname -a.
- modinfo.txt: Result of modinfo ice.
- dmesg.txt: Result of dmesg for the latest failed execution.
- gdb.txt: Result of bt in gdb for the core dump.
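For anyone who needs to gather the same three system files, a minimal sketch follows. The file names match the attachments listed above; the `collect` helper and the `-T` flag on `dmesg` (human-readable timestamps, which help when lining up entries with the time of the failure) are additions for illustration, not part of the original request.

```python
import subprocess
from pathlib import Path

# File names follow the attachments listed above; "-T" is an added
# convenience (readable timestamps), not something support asked for.
COMMANDS = {
    "uname.txt": ["uname", "-a"],
    "modinfo.txt": ["modinfo", "ice"],
    "dmesg.txt": ["dmesg", "-T"],
}


def collect(out_dir, commands=COMMANDS, run=subprocess.run):
    """Run each command and write its output to out_dir/<file name>.

    A command that cannot be started is recorded in its file instead of
    aborting the whole collection.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, cmd in commands.items():
        try:
            result = run(cmd, capture_output=True, text=True, check=False)
            text = result.stdout + result.stderr
        except OSError as exc:
            text = f"could not run {' '.join(cmd)}: {exc}\n"
        path = out_dir / filename
        path.write_text(text)
        written.append(path)
    return written
```

Running `collect("diag")` right after a failure (with enough privileges to read the kernel log) leaves the three text files in a `diag` directory, ready to attach.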
As soon as we reproduce the former problem, I will supplement this post.
Thank you very much.
Hello José,
Greetings from Intel!
Thank you for sharing the output. Kindly allow us some time to discuss it with our internal team. We will get back to you shortly.
Thank you for your understanding.
Regards,
Amina
Intel Server Support
Hello Jose,
Greetings from Intel!
Hope you are doing well.
Could you please provide additional information that would be helpful for debugging, specifically regarding the DPDK version in use?
We have noted that the E810 firmware was updated to version 4.50. Can you also confirm whether you are using the ice driver and the default DDP version as listed in the DPDK recommended matching list?
Regards
Pujeeth_Intel
Hello Team,
Greetings from Intel!
This is a reminder mail regarding the issue you reported to us.
We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide.
Please feel free to respond to this email at your earliest convenience.
Best regards,
Amina
Intel Server Support
Hello Team,
Thank you for contacting Intel.
This is the third follow-up regarding the reported issue. We're committed to ensuring a swift resolution and would greatly appreciate any updates or additional information you can provide.
As we have not heard back from you, we'll assume the issue has been resolved and will proceed to close the case.
Please feel free to respond to this email at your earliest convenience.
Regards,
Amina
Intel Server Support
Hi, Amina:
The problem turned from unresponsiveness into a process crash.
It then disappeared when we changed two non-obvious parameters:
- We increased the amount of available huge pages in Ubuntu, way beyond the actual amount used by VPP.
- We reduced the number of worker threads to 4, even though the CPU has plenty of free cores.
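To see how much huge page headroom a host actually has before and after such a change, the counters in `/proc/meminfo` can be read directly. The sketch below is an illustration only: the helper names are hypothetical and no particular threshold is implied. The worker thread count itself is set in VPP's startup configuration (the `cpu` section), which this sketch does not touch.

```python
def hugepage_stats(meminfo_text):
    """Extract the huge page counters from /proc/meminfo text.

    Returns a dict such as {"HugePages_Total": 1024, "HugePages_Free": 256,
    "Hugepagesize": 2048}; Hugepagesize is reported by the kernel in kB.
    """
    stats = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            stats[key] = int(rest.split()[0])
    return stats


def free_fraction(stats):
    """Fraction of configured huge pages still free, or None if none are configured."""
    total = stats.get("HugePages_Total", 0)
    if total == 0:
        return None
    return stats.get("HugePages_Free", 0) / total


def read_hugepage_stats(path="/proc/meminfo"):
    """Convenience wrapper that reads the live counters on a Linux host."""
    with open(path) as handle:
        return hugepage_stats(handle.read())
```

Logging `free_fraction(read_hugepage_stats())` periodically while VPP routes traffic would show whether the pool was ever close to exhaustion before the increase.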
Even though the solution is unsettling, we have moved on to more urgent issues.
Thank you very much,
José Manuel