Setup
I am testing worst-case latency in a setup with two directly connected E810-XXV adapters. I am running Debian 12 with the standard 6.1.0-26-amd64 Linux kernel.
I have upgraded to the latest ice and irdma drivers:
[ 154.144293] ice: Intel(R) Ethernet Connection E800 Series Linux Driver - version 1.15.5
[ 154.144377] ice: Copyright (C) 2018-2024 Intel Corporation
[ 154.227153] ice 0000:04:00.0: fw 7.7.3 api 1.7.11 nvm 4.70 0x8001f7bb 1.3755.0 [8086:159b] [8086:0003]
...
[ 155.655299] irdma driver version: 1.15.15
Findings
My code sets up a UD queue pair using a socket to share information, and times how long a round trip takes using IBV_WR_SEND and then waiting for a response from the other side. I noticed huge jumps in some RTTs (25+ ms) and isolated where the delay was coming from. I simplified my code some more so the stream of data is just going one way, and implemented the test as follows:
LOOP:
1) Start timer
2) Post send request
3) Poll send completion
4) End timer
With this simplified test, I still got unexpectedly long latencies, specifically at the send completion polling. The receiver side also detects the slow packet, so the completion poll isn't just taking a long time to return; the packet is getting stuck on the way out.
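For readers who want to reproduce the shape of this measurement, the loop above can be sketched roughly as follows. This is only an illustration of the structure, not the actual test code: `op` is a hypothetical stand-in for "post one send, then poll its completion", and the function name and threshold parameter are invented for this sketch.

```python
import time


def time_operations(op, iterations, warn_threshold_ns):
    """Time each call to op() and collect the slow ones.

    op is a hypothetical placeholder for 'post send + poll send completion'.
    Returns (durations_ns, slow), where slow holds (count, duration_ns)
    pairs and count is 1-indexed, like the count field in the logs.
    """
    durations = []
    slow = []
    for count in range(1, iterations + 1):
        start = time.perf_counter_ns()             # 1) start timer
        op()                                       # 2) post send, 3) poll completion
        elapsed = time.perf_counter_ns() - start   # 4) end timer
        durations.append(elapsed)
        if elapsed > warn_threshold_ns:
            slow.append((count, elapsed))
    return durations, slow


if __name__ == "__main__":
    # A no-op stands in for the real send; the threshold is an arbitrary 4 ms.
    durations, slow = time_operations(lambda: None, 100_000, 4_000_000)
    print("min:", min(durations), "ns, max:", max(durations), "ns")
    print("slow completions:", slow)
```

Recording the 1-indexed count alongside each slow sample is what makes it possible to tell a count-based stall from a time-based one.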
Data and Explanation
[ 8751.362085] ib_client: WARN - It took longer than 27125379 ns to wait for send, count: 8388610
[ 8803.219365] ib_client: WARN - It took longer than 25279562 ns to wait for send, count: 25165826
[ 8855.077630] ib_client: WARN - It took longer than 16410736 ns to wait for send, count: 41943042
[ 8906.934875] ib_client: WARN - It took longer than 13447023 ns to wait for send, count: 58720258
[ 8958.793107] ib_client: WARN - It took longer than 29723372 ns to wait for send, count: 75497474
[ 9010.650323] ib_client: WARN - It took longer than 29989750 ns to wait for send, count: 92274690
[ 9062.508529] ib_client: WARN - It took longer than 28589231 ns to wait for send, count: 109051906
[ 9114.364223] ib_client: WARN - It took longer than 29764279 ns to wait for send, count: 125829122
[ 9166.221406] ib_client: WARN - It took longer than 29413178 ns to wait for send, count: 142606338
[ 9168.820274] min: 2117 ns, max: 29989750 ns, mean: 3063 ns
[ 9168.820276] <3000 : 71619390
[ 9168.820276] [3000, 4000) : 71646397
[ 9168.820277] [4000, 5000) : 176602
[ 9168.820278] [5000, 5500) : 9
[ 9168.820278] [5500, 6000) : 154
[ 9168.820279] [6000, 6500) : 134
[ 9168.820279] [6500, 7000) : 516
[ 9168.820280] [7000, 7500) : 293
[ 9168.820280] [7500, 8000) : 917
[ 9168.820281] [8000, 9000) : 1075
[ 9168.820281] [9000, 10000) : 1139
[ 9168.820282] [10000, 15000) : 1031
[ 9168.820282] [15000, 20000) : 0
[ 9168.820283] [20000, 25000) : 0
[ 9168.820283] [25000, 30000) : 0
[ 9168.820284] [30000, 50000) : 0
[ 9168.820284] [50000, 100000) : 0
[ 9168.820285] [100000, 200000) : 0
[ 9168.820285] [200000, 300000) : 0
[ 9168.820286] [300000, 500000) : 0
[ 9168.820286] [500000, 1000000) : 0
[ 9168.820287] [1000000, 2000000) : 0
[ 9168.820287] [2000000, 3000000) : 0
[ 9168.820288] [3000000, 4000000) : 0
[ 9168.820288] >4000000 : 9
The above data is from a test that uses the kernel-space ib_* interfaces. It shows a log line for each slow completion poll, followed by a histogram of all completion times in nanoseconds. The count field in the warning lines shows which message we are at (1-indexed), so the first very long completion occurs on message 8388610. After that, another long completion occurs every 16777216 (2^24) messages. This is not a time-based slowdown: if I slow down my send loop, the same message numbers still see the delay.
8388610 is also suspiciously close to 2^24/2 (2^23 = 8388608).
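The arithmetic can be checked directly against the counts in the log. The short sketch below assumes the pattern is "first stall at 2^23 + 2, then one every 2^24 messages"; the constant and function names are made up for illustration.

```python
FIRST_STALL = 2**23 + 2   # 8388610, the first slow message (1-indexed)
PERIOD = 2**24            # 16777216 messages between slow completions


def predicted_stall(k):
    """Message count of the k-th slow completion (k = 0 is the first one)."""
    return FIRST_STALL + k * PERIOD


# Count values copied from the kernel log WARN lines.
observed = [
    8388610, 25165826, 41943042, 58720258, 75497474,
    92274690, 109051906, 125829122, 142606338,
]

if __name__ == "__main__":
    for k, count in enumerate(observed):
        status = "matches" if count == predicted_stall(k) else "DIFFERS"
        print(k, count, status)
```

Equivalently, every logged count leaves the same remainder, 2^23 + 2, when divided by 2^24.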
My only idea is that this may be related to the sq_psn (a 24-bit field) that you need to set in your UD queue pair. However, changing this value to 0 or to any random value does not change when the issue shows up: it is still the 8388610th message and every 2^24 messages after that.
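To make that reasoning concrete, here is a small sketch of which PSN the stalling message would carry for different starting values. It assumes the PSN increases by one per UD send and wraps at 2^24; the function name is hypothetical, and 0x215fa1 is just an example starting value.

```python
PSN_MODULUS = 2**24  # sq_psn is a 24-bit field


def psn_at_count(initial_psn, count):
    """PSN carried by the count-th message (1-indexed).

    Assumes one PSN per send, starting at initial_psn and wrapping at 2^24.
    """
    return (initial_psn + count - 1) % PSN_MODULUS


if __name__ == "__main__":
    first_stall = 8388610
    for initial in (0x000000, 0x215FA1):
        psn = psn_at_count(initial, first_stall)
        print(f"initial PSN {initial:#08x} -> PSN at first stall {psn:#08x}")
```

Under these assumptions the stalling message carries a different PSN for each starting value, so a stall tied to one particular PSN (for example the wrap to 0) would be expected to move when sq_psn changes. The stall staying at the same message count points at some other 24-bit counter or structure.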
I have also replicated the issue using the userspace ibverbs interface by adding some timing instrumentation to the ud_pingpong.c example in rdma-core. The same issue is present, with long latencies happening at the same message counts {8388610, 25165826, 41943042, ...}
These are not scheduler interruptions: my kernel-space implementation does not get interrupted, and the user-space testing runs on isolated cores where scheduler interruptions have been reduced to 60 us. Also, if it were some sort of preemption, it would not always happen at the same message.
Other Tests/Question
I have another version of this RTT test that uses an RC queue pair, and it does not suffer from the same issue.
Any other ideas as to what could be causing these long completions?
Thanks.
Hello latency_hunter
Greetings!
Apologies for the delayed response.
Kindly let us know if you still require assistance with this case.
Please feel free to reply to this email. We're here to assist you every step of the way.
Regards,
Pujeeth
Intel Customer Support Technician
That would be nice. It seems pretty clear that the RoCE implementation has a bug.
I have verified this issue does not show up with the same code and Mellanox ConnectX-6 SmartNIC HW.
Thanks.
Hello latency_hunter
Greetings!
Thank you for the update. To proceed with troubleshooting, please share the following information:
1) The driver and firmware versions of the NIC.
2) The system details.
3) Front and back pictures of the NIC, clearly showing the serial number and MM ID markings.
4) The SSU logs.
Intel® System Support Utility for the Linux* Operating System
Regards
Pujeeth
Intel customer support technician
Hello latency_hunter,
Thank you for contacting Intel.
This is the first follow-up regarding the issue you reported to us.
We wanted to inquire whether you had the opportunity to review the plan of action (POA) we provided.
Feel free to reply to this email, and we'll be more than happy to assist you further.
Regards,
Pujeeth
Intel Customer Support Technician
I have, still working on getting the pictures and SSU logs. Should have them in a day or two.
Hello latency_hunter,
Greetings!
Thank you for the update, kindly keep us posted.
Regards
Pujeeth_Intel
Hello latency_hunter,
Thank you for contacting Intel.
This is the first follow-up regarding the issue you reported to us.
We wanted to inquire whether you had the opportunity to review the plan of action (POA) we provided.
Feel free to reply to this email, and we'll be more than happy to assist you further.
Regards,
Pujeeth
Intel Customer Support Technician
Hello latency_hunter,
Thank you for contacting Intel.
We will proceed to close this case. If you find that you still require assistance, we kindly request that you respond to the case. This will allow us to either reopen the current case or initiate a new one.
Regards,
Pujeeth_Intel
I'm still working on getting the requested information. Because it took a while for you to get back to us, we repurposed the server we were using. We are working on getting another one set up.
Hello latency_hunter,
Thank you for the update, kindly keep us posted.
Regards
Pujeeth_Intel
I am on a Debian 12 standard 6.1.0-26-amd64 Linux kernel.
I have upgraded to the latest ice and irdma drivers:
[ 154.144293] ice: Intel(R) Ethernet Connection E800 Series Linux Driver - version 1.15.5
[ 154.144377] ice: Copyright (C) 2018-2024 Intel Corporation
[ 154.227153] ice 0000:04:00.0: fw 7.7.3 api 1.7.11 nvm 4.70 0x8001f7bb 1.3755.0 [8086:159b] [8086:0003]
...
[ 155.655299] irdma driver version: 1.15.15
Pictures attached.
SSU logs attached.
Hello latency_hunter
Greetings!
Thank you for sharing the details. We would like to request that you review the supported operating systems for the E810-XXV.
Supported Operating Systems for Retail Intel® Ethernet Adapters
https://www.intel.com/content/www/us/en/support/articles/000025890/ethernet-products.html
Regards
Pujeeth_Intel
Hello latency_hunter
Greetings!
We wanted to follow up on this case. Please feel free to respond to this email at your earliest convenience.
Regards,
Pujeeth
Intel Customer Support Technician
I did read the support matrix. I can replicate the issue on Debian 11 if you'd like and report back.
Hello latency_hunter
Greetings!
Thank you for the update, kindly keep us posted.
Regards
Pujeeth_Intel
Hello latency_hunter,
Thank you for contacting Intel.
This is the first follow-up regarding the issue you reported to us.
Feel free to reply to this email, and we'll be more than happy to assist you further.
Regards,
Pujeeth_Intel
I have replicated the issue as documented above on a Debian 11 machine (SSU logs attached).
Linux x1cymac 5.10.0-34-amd64 #1 SMP Debian 5.10.234-1 (2025-02-24) x86_64 GNU/Linux
I installed the latest drivers, so the versions are slightly different from my earlier report.
[ 1.637963] ice: Intel(R) Ethernet Connection E800 Series Linux Driver - version 1.16.3
[ 1.718722] ice 0000:02:00.0: fw 7.7.3 api 1.7.11 nvm 4.70 0x8001f7bb 1.3755.0 [8086:159b] [8086:0003]
[ 475.879715] irdma driver version: 1.16.10
Example ibv_ud_pingpong output on the new machine:
bla@x1cymac:~/e810_test/rdma-core$ taskset -c 2 ./build/bin/ibv_ud_pingpong -g 1 -n 100000000
local address: LID 0x0001, QPN 0x000004, PSN 0x215fa1: GID ::ffff:192.168.1.106
Using psn of 0x215fa1
Dest GID = 00:00:00:00:00:00:00:00:00:00:ff:ff:c0:a8:01:67
remote address: LID 0x0000, QPN 0x001199, PSN 0x09c669, GID ::ffff:192.168.1.103
- Long send time: 20009908 ns, index: 8388610 -
- Long send time: 56279375 ns, index: 25165826 -
- Long send time: 52510283 ns, index: 41943042 -
- Long send time: 37830943 ns, index: 58720258 -
- Long send time: 54142370 ns, index: 75497474 -
- Long send time: 43183471 ns, index: 92274690 -
204800000000 bytes in 1300.34 seconds = 1259.98 Mbit/sec
100000000 iters in 1300.34 seconds = 13.00 usec/iter
Max send time was: 56279375 ns at index: 25165826 and last was 4557 ns
The delays are not always the same duration, but they are almost always over 10 ms and they happen at the exact same indices.
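A quick way to confirm the spacing from output like the above is to parse the "Long send time" lines and look at the gaps between indices. The sketch below assumes the line format printed by my instrumented ibv_ud_pingpong exactly as shown here; the regular expression and helper names are made up for illustration.

```python
import re

# Matches lines such as "- Long send time: 20009908 ns, index: 8388610 -"
LINE_RE = re.compile(r"Long send time: (\d+) ns, index: (\d+)")


def parse_long_sends(text):
    """Return a list of (duration_ns, index) tuples found in the output text."""
    return [(int(ns), int(idx)) for ns, idx in LINE_RE.findall(text)]


def index_gaps(entries):
    """Differences between consecutive stall indices."""
    indices = [idx for _, idx in entries]
    return [b - a for a, b in zip(indices, indices[1:])]


sample = """\
- Long send time: 20009908 ns, index: 8388610 -
- Long send time: 56279375 ns, index: 25165826 -
- Long send time: 52510283 ns, index: 41943042 -
- Long send time: 37830943 ns, index: 58720258 -
- Long send time: 54142370 ns, index: 75497474 -
- Long send time: 43183471 ns, index: 92274690 -
"""

if __name__ == "__main__":
    entries = parse_long_sends(sample)
    print("gaps between stalls:", index_gaps(entries))
    print("all gaps equal 2^24:", all(g == 2**24 for g in index_gaps(entries)))
```

At the reported 13.00 usec/iter, a gap of 2^24 iterations works out to roughly 218 seconds between stalls in this run, whereas the kernel-space test above hit them about every 52 seconds. The wall-clock spacing differs between the two runs while the message indices do not.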
Hello latency_hunter,
Thank you for sharing the SSU logs. Upon reviewing the logs, we see that the system is manufactured by Supermicro. Kindly let us know if this adapter came with the system or was purchased separately.
Regards
Pujeeth_Intel
It was purchased separately.
Hi latency_hunter,
Thank you for your response.
Would you be able to share a picture of the NIC showing the label so that we can check the markings?
Along with that, please let us know if you are using the latest driver and firmware from the link below:
Looking forward to your response.
Regards,
Fikri O.