The environment: two servers (Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz) connected with Mellanox CX-5. All tests run within the local socket.
According to my understanding, with DDIO enabled, an RDMA read fetches data directly from memory, bypassing the LLC, if the data is not valid in the LLC. An RDMA write performs a 'write update' or 'write allocate' into the LLC. Once the data is valid in the LLC after an RDMA write, an RDMA read can access it from the LLC.
With DDIO disabled, neither RDMA read nor RDMA write can access data in the LLC; both read from or write to memory directly.
What I expect: disabling DDIO will increase the latency of RDMA write but have little influence on RDMA read.
The experiment shows:
Commands:
ib_read_lat -d mlx5_0 -R address
ib_write_lat -d mlx5_0 -R address
Result:
Test | Latency with DDIO enabled (us) | Latency with DDIO disabled (us) |
ib_read_lat | 1.773 | 1.897 |
ib_write_lat | 0.933 | 0.938 |
It seems that the latency of RDMA write is barely affected by DDIO, but the latency of RDMA read increases noticeably. The result is just the opposite of what I expected.
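To put the difference in relative terms, here is a small sketch that computes the percentage change for each test from the numbers in the table. The dictionary layout and the function name `relative_change_percent` are only illustrative; the latency values are the ones reported above.

```python
# Latencies in microseconds, copied from the result table in this post.
latency_us = {
    "ib_read_lat": {"ddio_on": 1.773, "ddio_off": 1.897},
    "ib_write_lat": {"ddio_on": 0.933, "ddio_off": 0.938},
}


def relative_change_percent(before: float, after: float) -> float:
    """Percentage change when going from `before` to `after`."""
    return (after - before) / before * 100.0


for test, values in latency_us.items():
    change = relative_change_percent(values["ddio_on"], values["ddio_off"])
    print(f"{test}: {change:+.2f}% with DDIO disabled")
```

By this measure the read latency grows by roughly 7% when DDIO is disabled, while the write latency changes by about 0.5%.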
Here is one more interesting result:
Read latency: 1.80 us
Raw write latency: 0.93 us
Write latency after disabling IBV_SEND_INLINE: 1.32 us
Write latency after disabling IBV_SEND_INLINE and DDIO: 1.32 us
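The gaps between these numbers can be worked out with a few lines of arithmetic. This is only a sketch over the values listed above; the variable names are illustrative, and it says nothing about where the remaining gap comes from.

```python
# Latencies in microseconds, taken from the list above.
read_us = 1.80
write_inline_us = 0.93      # raw write, IBV_SEND_INLINE in use
write_no_inline_us = 1.32   # IBV_SEND_INLINE disabled (same with DDIO on or off)

# Extra latency that appears on writes once inlining is disabled.
inline_cost_us = write_no_inline_us - write_inline_us

# Gap that still separates read from write without inlining.
remaining_gap_us = read_us - write_no_inline_us

print(f"cost of disabling inline: {inline_cost_us:.2f} us")
print(f"remaining read/write gap: {remaining_gap_us:.2f} us")
```

So disabling IBV_SEND_INLINE accounts for about 0.39 us, and about 0.48 us of difference between read and write is still unexplained.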
My questions are:
Q1: Is my understanding of RDMA and DDIO correct?
Q2: Why does DDIO affect RDMA read rather than RDMA write? This really confuses me.
Q3: Why are the latencies of RDMA read and RDMA write not equal after disabling IBV_SEND_INLINE and DDIO? Are there other processor, RDMA, or driver features that improve write performance?
Hello oleotiger,
Thank you for contacting Intel Customer Support.
We recommend checking the following document about Intel® Data Direct I/O Technology (Intel® DDIO):
A Primer
If this document does not have the information that you need, please let us know so we can keep searching on our end.
Best regards,
Sergio S.
Intel Customer Support Technician
For firmware updates and troubleshooting tips, visit: https://intel.com/support/serverbios
Hello oleotiger,
We are following your case and would like to know if you need more assistance.
Best regards,
Sergio S.
Intel Customer Support Technician
I have read the document, and it gave me a better understanding of DDIO. But the experimental results still conflict with the description of DDIO.
As the document shows, with DDIO enabled, a network read triggers a memory load when the data is not valid in the LLC and a cache read when the data is valid in the LLC.
I perform RDMA write, read, write, read on the same address within one cache line. The cache or memory access of each read or write operation can be summarized as follows:
DDIO | Operation | Client: op to data | Client: access target | Server: op to data | Server: access target |
ON | first write | read | memory | write allocate | cache |
ON | first read | write allocate | cache | read | cache |
ON | second write | read update | cache | write update | cache |
ON | second read | write | cache | read | cache |
OFF | first write | read | memory | write | memory |
OFF | first read | write | memory | read | memory |
OFF | second write | read | memory | write | memory |
OFF | second read | write | memory | read | memory |
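The access-target columns of the table can be reproduced with a toy model. This is only a sketch of the understanding described in this post, not of how the hardware actually behaves: it assumes the test cache line starts out invalid in both LLCs, that only NIC accesses touch it, and that with DDIO on a NIC write always lands in the LLC. The names `Host` and `run_sequence` are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """One side of the connection; tracks whether the test line is valid in its LLC."""
    ddio: bool
    line_in_llc: bool = False

    def nic_read(self) -> str:
        # A NIC read is served from the LLC only if DDIO is on and the line is valid.
        return "cache" if self.ddio and self.line_in_llc else "memory"

    def nic_write(self) -> str:
        # With DDIO on, a NIC write allocates or updates the line in the LLC.
        if self.ddio:
            self.line_in_llc = True
            return "cache"
        return "memory"


def run_sequence(ddio: bool):
    """Return (operation, client target, server target) for write, read, write, read."""
    client, server = Host(ddio), Host(ddio)
    steps = [("first write", "write"), ("first read", "read"),
             ("second write", "write"), ("second read", "read")]
    rows = []
    for label, op in steps:
        if op == "write":
            # RDMA write: client NIC reads the payload, server NIC writes it.
            client_target = client.nic_read()
            server_target = server.nic_write()
        else:
            # RDMA read: server NIC reads the data, client NIC writes it locally.
            server_target = server.nic_read()
            client_target = client.nic_write()
        rows.append((label, client_target, server_target))
    return rows


for ddio in (True, False):
    for row in run_sequence(ddio):
        print("DDIO ON " if ddio else "DDIO OFF", *row, sep=" | ")
```

Under these assumptions the model gives the same access targets as the table: with DDIO on, only the client side of the first write goes to memory, and with DDIO off every access goes to memory.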
So with DDIO on, the second write should have lower latency than the first write, and the first and second reads should have similar latency.
Q1: Why does the experiment show that the second read still has lower latency than the first read with DDIO on?
With DDIO disabled, the first and second writes should have similar latency, since both access memory on both sides.
Q2: Why does the experiment show that the first write has lower latency than the second write?
Hi oleotiger,
I am also studying the relationship between DDIO and RDMA. I test RDMA performance with the Perftest tool, but I find it hard to target the same address in a cache line or in memory using Perftest alone. So, how do you "do rdma write--read--write-read of the same address within a cacheline"? Did you write a Verbs application? And how do you restrict the operations to a specific cache line?
I would really appreciate your reply!
Hi oleotiger,
I am interested in your test cases. Could you share your verbs tools for deeper analysis?
Here are some suggestions and questions:
1. For Q1, we should consider the impact of the SQ/CQ during the test. Reading WQEs and writing CQEs may hit the cache in the second operation. Do you use the same QP for both the read and write tests?
2. The ib_***_lat tests loop for n iterations, whereas a one-shot test result can be accidental. Are there any tricks to get accurate test data?
3. For Q2, if the data is accurate, I would also like to understand why the test data in Q2 is more scattered than in Q1.
I would really appreciate your reply!
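On the point about one-shot results, one common way to make latency numbers more trustworthy is to repeat the operation many times and look at the distribution rather than a single value. Below is a minimal sketch of that idea; the function name `summarize` is hypothetical, and the sample values in the usage lines are placeholders for illustration, not measurements from this thread.

```python
import statistics


def summarize(samples_us):
    """Summarize repeated latency samples (microseconds) instead of trusting one shot."""
    ordered = sorted(samples_us)
    # Nearest-rank style index for the 99th percentile, clamped to the last element.
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


# Placeholder samples (NOT real measurements): one outlier among stable values.
example = [1.0, 1.1, 0.9, 1.0, 5.0]
print(summarize(example))
```

The median is far less sensitive to a single outlier than a one-shot reading, and the standard deviation gives a rough handle on how scattered a data set is.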
Hello oleotiger,
We appreciate the additional information. Please allow us to check it, and we will get back to you.
Best regards,
Sergio S.
Intel Customer Support Technician
Hi, SergioS. Has there been any progress?
Hello oleotiger,
Thank you for waiting for our update. Unfortunately, the information that you need cannot be disclosed publicly. We can only share information that is available on our public Intel website.
We do apologize for this inconvenience.
Best regards,
Sergio S.
Intel Customer Support Technician