We are trying to configure networking on an Elkhart Lake development board running Linux but are seeing quite high packet loss when sending UDP packets to the system.
Our test setup is two development boards directly connected with a ~1 m Ethernet cable. Testing is done with iPerf3, attempting 900 Mbit/s in a single direction.
After tuning the `net.core.wmem_max` and `net.core.rmem_max` tunables we are able to reduce the loss somewhat, but it's still higher than I'd expect for a point-to-point network.
# sysctl net.core | sort
net.core.dev_weight = 64
net.core.dev_weight_rx_bias = 1
net.core.dev_weight_tx_bias = 1
net.core.devconf_inherit_init_net = 0
net.core.fb_tunnels_only_for_init_net = 0
net.core.flow_limit_cpu_bitmap = 0
net.core.flow_limit_table_len = 4096
net.core.gro_normal_batch = 8
net.core.high_order_alloc_disable = 0
net.core.max_skb_frags = 17
net.core.message_burst = 10
net.core.message_cost = 5
net.core.netdev_budget = 300
net.core.netdev_budget_usecs = 8000
net.core.netdev_max_backlog = 10000
net.core.netdev_rss_key = 51:7b:4d:6c:af:ad:f9:5c:07:1e:e6:36:1b:81:31:90:a2:aa:45:3e:3e:ad:c0:df:9e:96:d0:0a:43:f9:ec:c6:2b:43:1f:84:27:3b:20:f8:e9:25:66:12:5e:2f:14:85:29:63:18:3b
net.core.netdev_tstamp_prequeue = 1
net.core.netdev_unregister_timeout_secs = 10
net.core.optmem_max = 20480
net.core.rmem_default = 26214400
net.core.rmem_max = 26214400
net.core.rps_sock_flow_entries = 0
net.core.skb_defer_max = 64
net.core.somaxconn = 4096
net.core.tstamp_allow_data = 1
net.core.txrehash = 1
net.core.warnings = 0
net.core.wmem_default = 26214400
net.core.wmem_max = 26214400
The remaining packet loss seems to be related to rx fifo overflows in the NIC.
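One way to watch for this is to sample the kernel's generic per-interface counters before and after an iPerf3 run. The following is a minimal sketch, not the method used in this thread: the function name `rx_drop_counters` and the `STATS_ROOT` override are hypothetical, and whether the driver reports its FIFO overflows through these generic sysfs counters or only through driver-specific `ethtool -S` statistics is an assumption to verify on the target.

```shell
#!/bin/sh
# Print "<counter> <value>" for the standard RX drop-related counters of an
# interface, read from sysfs. STATS_ROOT is a hypothetical override that
# exists only so the function can be exercised without real hardware.
rx_drop_counters() {
    iface="$1"
    root="${STATS_ROOT:-/sys/class/net}"
    for name in rx_fifo_errors rx_missed_errors rx_dropped; do
        file="$root/$iface/statistics/$name"
        # Skip counters that are absent rather than failing.
        if [ -r "$file" ]; then
            printf '%s %s\n' "$name" "$(cat "$file")"
        fi
    done
}

# Example (run once before and once after the test, then compare):
# rx_drop_counters enp0s29f1
```

If the values do not move while iPerf3 still reports loss, the overflow is probably only visible in the driver's own statistics.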
Changing the driver to only use a single rx queue seems to eliminate the overflows, but it doesn't make sense to me why this would help.
ethtool -L enp0s29f1 rx 1
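To confirm that the change took effect, the current channel count can be read back with `ethtool -l`. The helper below is a rough sketch that assumes the usual `ethtool -l` layout (a "Pre-set maximums:" section followed by a "Current hardware settings:" section); the function name `current_rx_channels` is hypothetical.

```shell
#!/bin/sh
# Read `ethtool -l` output on stdin and print the RX channel count from the
# "Current hardware settings:" section, ignoring the pre-set maximums.
current_rx_channels() {
    awk '/^Current hardware settings:/ { cur = 1; next }
         cur && $1 == "RX:" { print $2; exit }'
}

# Example:
# ethtool -l enp0s29f1 | current_rx_channels
```

Because the maximums section lists the same field names, the parser only starts matching once it has seen the "Current hardware settings:" header.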
We have reproduced this behaviour in both our custom Buildroot-based image and with an Ubuntu image.
Is there something obvious I'm missing?
--------------
I did find mention of issue EHL22 in https://cdrdv2-public.intel.com/636674/636674_Intel_Atom_Pentium_Celeron_Public_SpecUpdate_rev2p1.pdf relating to a bug with the TX/RX FIFO size. The incorrect FIFO size is showing up on our hardware, but it's not clear if this is the cause of our issue or unrelated.
I also can't work out where to get the mitigation mentioned in that document.
Hello, @sam-bristow-rl:
Thank you for contacting Intel Embedded Community.
We want to address the following questions to understand this situation:
Could you please clarify if this request is related to the Elkhart Lake (EHL) design developed by you, or is an EHL or a Network Interface Card (NIC) or add-in card developed by a third-party company?
Could you please let us know the name of the manufacturer, the part number, and where we can find the information if this request is related to a third-party design?
We are waiting for your answer.
Best regards,
Hi Carlos,
We have reproduced the issue on two different Elkhart Lake boards from different manufacturers. The main board we have been testing on is the I-Pi SMARC Elkhart Lake, but we're seeing identical behaviour on the other board too.
Sam
Hello, @sam-bristow-rl:
Thanks for your update.
Based on the provided information, we need to address the following questions:
Could you please list the Operating Systems (OSs) related to the reported situation?
Is it possible that you can provide the name of the manufacturer and the part number of the other board used to determine the reported condition?
We are waiting for your answer.
Best regards,
Just to clarify, our test setup is two of the I-Pi SMARC boards connected together. There are no other boards in the system showing the packet loss.
The other board we have tested on is a pre-release sample from a manufacturer who we don't want to publicize at this point.
We have reproduced the issue with Fedora 38, Ubuntu 20.04 LTS, Ubuntu Core 22 (Intel Atom® X6000E Series Processors), and our custom Buildroot-based OS image running the Linux 6.1.26-rt8 kernel. Ubuntu and Fedora are showing about 10x worse packet loss than the Buildroot image.
Hello, @sam-bristow-rl:
Thanks for your clarification.
You should address your questions stated in this thread as a reference through the channels listed on the following website:
Best regards,
I'm not sure why the Ubuntu forums would be more likely to have an answer, since it looks like a possible cause is the problem with the Intel PSE's embedded processor (EHL22). We've also reproduced the issue on more than just Ubuntu.
Can you point me to where I can find the workaround described in Intel erratum EHL22, which I referenced in the original post?
Hello, @sam-bristow-rl:
Thanks for your reply.
We need to clarify our previous answer.
Reviewing the note of the cited workaround, Intel drivers are suggested to avoid implementing the mitigation. Intel's drivers are not guaranteed to work properly on third-party devices (such as the ones used from your side to reproduce the reported situation) because they are generic.
Due to this fact, we think that you can contact the developer of the OS to clarify this situation, as a first option.
Another option that is not guaranteed to be covered by the workaround note is to request the proper drivers from the developer of the third-party devices that you are using. They can be contacted as a reference through the channels listed on the following website:
https://www.ipi.wiki/community/forum
Or, you can find the Intel drivers considering the advice provided in the first part of this communication using the tool stated on the following website:
https://www.intel.com/content/www/us/en/support/detect.html
Best regards,