How does DDIO affect rdma read and write?

oleotiger
Novice
2,942 Views

The environment: two servers (Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz) connected with Mellanox CX-5. All tests run within the local socket.

 

As I understand it, with DDIO enabled, an RDMA read fetches data directly from memory, bypassing the LLC, if the data is not valid in the LLC. An RDMA write performs a 'write update' or 'write allocate' into the LLC. Once an RDMA write has made the data valid in the LLC, an RDMA read can access it from the LLC.

 

With DDIO disabled, neither RDMA read nor RDMA write can access data in the LLC; both go directly to memory.

 

What I expect: disabling DDIO will increase the latency of RDMA write but have little influence on RDMA read.


The experiment shows:

Command: 

ib_read/write_lat -d mlx5_0 -R address

Result:

Latency (us)    DDIO enabled    DDIO disabled
ib_read_lat     1.773           1.897
ib_write_lat    0.933           0.938
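
To put the measurements above on a common scale, here is a small sketch that computes the absolute and relative change in latency when DDIO is disabled. The numbers are copied from the results; the helper name latency_change is only illustrative.

```python
# Latencies in microseconds, copied from the results above.
results = {
    "ib_read_lat": {"ddio_on": 1.773, "ddio_off": 1.897},
    "ib_write_lat": {"ddio_on": 0.933, "ddio_off": 0.938},
}


def latency_change(on_us, off_us):
    """Return (absolute change in us, relative change in percent)."""
    delta = off_us - on_us
    return delta, 100.0 * delta / on_us


for name, r in results.items():
    delta, pct = latency_change(r["ddio_on"], r["ddio_off"])
    print(f"{name}: +{delta:.3f} us ({pct:.1f}%)")
```

With these inputs, the read latency rises by about 0.124 us (roughly 7%), while the write latency changes by about 0.005 us (roughly 0.5%).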

 

It seems that the latency of RDMA write is barely affected by DDIO, while the latency of RDMA read increases noticeably. The result is just the opposite of my understanding.

 

Here is one more interesting result:

Read latency: 1.80 us

Raw write latency: 0.93 us

Write latency after disabling IBV_SEND_INLINE: 1.32 us

Write latency after disabling IBV_SEND_INLINE and DDIO: 1.32 us

 

My questions are:

Q1: Is my understanding of RDMA and DDIO correct?

Q2: Why does DDIO affect RDMA read rather than RDMA write? This really confuses me.

Q3: Why are the latencies of RDMA read and RDMA write still not equal after disabling IBV_SEND_INLINE and DDIO? Is there some other feature of the processor, RDMA, or the driver that improves write performance?

 

7 Replies
SergioS_Intel
Moderator
2,931 Views

Hello oleotiger,


Thank you for contacting Intel Customer Support.

 

We recommend checking the following document, "Intel® Data Direct I/O Technology (Intel® DDIO): A Primer":

https://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/data-direct-i-o-technology-brief.pdf


If this document does not have the information that you need, please let us know so we can keep searching on our end.


Best regards,

Sergio S.

Intel Customer Support Technician


For firmware updates and troubleshooting tips, visit: https://intel.com/support/serverbios



SergioS_Intel
Moderator
2,912 Views

Hello oleotiger,


We are following your case and would like to know if you need more assistance.


Best regards,

Sergio S.

Intel Customer Support Technician


oleotiger
Novice
2,893 Views

I have read the document, and it gave me a better understanding of DDIO. But the experimental results still conflict with the description of DDIO.

 

As the document shows, with DDIO enabled, a network read triggers a memory load if the data is not valid in the LLC, and a cache read if the data is valid in the LLC.

 

I perform RDMA write, read, write, read on the same address within one cache line. The cache or memory access for each read or write operation can be summarized as follows:

Access property
                           Client                          Server
DDIO    Operation       op to data    access target     op to data    access target
ON      first write     read          memory            write         allocate cache
ON      first read      write         allocate cache    read          cache
ON      second write    read          update cache      write         update cache
ON      second read     write         cache             read          cache
OFF     first write     read          memory            write         memory
OFF     first read      write         memory            read          memory
OFF     second write    read          memory            write         memory
OFF     second read     write         memory            read          memory
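
The reasoning behind the table can be sketched as a tiny state model. This is only a simplified illustration of the rules stated in this thread (a NIC read hits the LLC only if the line is already valid there, a NIC write allocates or updates the line in the LLC, and with DDIO off everything goes to memory). It assumes both buffers start out uncached, that the CPU does not touch them between operations, and that the client uses one buffer for both operations. The names nic_access and run_sequence are hypothetical, and the wording of cache hits may not match the table exactly.

```python
def nic_access(llc, op, ddio_on):
    """Return where a NIC-side access lands under the simplified model.

    llc is a set of cache lines currently valid in that host's LLC;
    the single line under test is identified by the key "line".
    """
    if not ddio_on:
        return "memory"  # assumption from the post: no LLC access at all
    if op == "read":
        # Reads never allocate in this model.
        return "cache" if "line" in llc else "memory"
    if "line" in llc:
        return "update cache"  # write hit
    llc.add("line")
    return "allocate cache"  # write miss


def run_sequence(ddio_on):
    """Replay write, read, write, read and record (label, client, server)."""
    client_llc, server_llc = set(), set()
    rows = []
    steps = [("first write", "write"), ("first read", "read"),
             ("second write", "write"), ("second read", "read")]
    for label, verb in steps:
        if verb == "write":
            # RDMA write: client NIC reads the local buffer, server NIC writes it.
            c = ("read", nic_access(client_llc, "read", ddio_on))
            s = ("write", nic_access(server_llc, "write", ddio_on))
        else:
            # RDMA read: server NIC reads the buffer, client NIC writes the result.
            s = ("read", nic_access(server_llc, "read", ddio_on))
            c = ("write", nic_access(client_llc, "write", ddio_on))
        rows.append((label, c, s))
    return rows


for ddio_on in (True, False):
    print("DDIO ON" if ddio_on else "DDIO OFF")
    for label, c, s in run_sequence(ddio_on):
        print(f"  {label:13s} client: {c[0]} {c[1]:15s} server: {s[0]} {s[1]}")
```

The model only tracks whether the line is valid in each LLC; it says nothing about actual latencies, inlined sends, or any behaviour of the NIC or processor beyond the rules listed above.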

 

So with DDIO on, the second write should have lower latency than the first write. The latencies of the first read and the second read should be similar.

oleotiger_0-1642044188759.png

Q1: Why does the experiment show that the second read still has lower latency than the first read with DDIO on?


With DDIO disabled, the first and second writes should have similar latency, since both access memory on both sides.

oleotiger_1-1642044371756.png

Q2: Why does the experiment show that the first write has lower latency than the second write?

OMEGA00
Beginner
1,703 Views

Hi oleotiger,

I am also studying the relationship between DDIO and RDMA. I test RDMA performance with the Perftest tool, but I find it hard to target the same address in a cache line or in memory using Perftest alone. So how do you "do rdma write--read--write-read of the same address within a cacheline"? Did you write a Verbs application? And how do you restrict the operation to a specific cache line?

I would really appreciate your reply!
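
As a rough illustration of what "within a cacheline" means in terms of addresses, the sketch below checks whether a byte range stays inside a single cache line and rounds an address up to the next line boundary. It is not the method used in the experiment above; the 64-byte line size is an assumption for this processor generation, and the helper names same_cache_line and align_up are hypothetical.

```python
CACHE_LINE = 64  # bytes; assumed cache-line size


def same_cache_line(addr, length, line=CACHE_LINE):
    """Return True if [addr, addr + length) lies inside one cache line."""
    if length <= 0:
        raise ValueError("length must be positive")
    return addr // line == (addr + length - 1) // line


def align_up(addr, line=CACHE_LINE):
    """Round addr up to the next cache-line boundary."""
    return (addr + line - 1) // line * line


# Hypothetical buffer addresses, for illustration only.
print(same_cache_line(0x1000, 64))  # aligned, exactly one line
print(same_cache_line(0x1001, 64))  # straddles two lines
print(hex(align_up(0x1001)))
```

In a Verbs application the same check would apply to the registered buffer address and the message size of each operation.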

SergioS_Intel
Moderator
2,866 Views

Hello oleotiger,


We appreciate the additional information. Please allow us to check it and we will get back to you.



Best regards,

Sergio S.

Intel Customer Support Technician

oleotiger
Novice
2,793 Views

Hi, SergioS. Has there been some progress?

SergioS_Intel
Moderator
2,771 Views

Hello oleotiger,


Thank you for waiting for our update. Unfortunately, the information that you need cannot be disclosed publicly. We can only share information that is available on our public Intel website.


We do apologize for this inconvenience.


Best regards,

Sergio S.

Intel Customer Support Technician
