According to the documentation, DDIO only works on the local socket (the socket the NIC is attached to); RDMA reads/writes targeting memory on a remote socket have to access memory directly, through UPI/QPI.
I ran an experiment to check this. The basic idea is:
On the local socket, an RDMA read accesses memory directly. After an RDMA write to the same address, the data is allocated in the LLC, so the next RDMA read hits the LLC and has lower latency. We should therefore see a difference between the latencies of the two read operations.
On the remote socket, both a plain RDMA read and a read after a write have to access memory directly, so there should not be much difference in read latency.
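To make the measurement concrete, here is a minimal sketch of the timing loop in C. It is not the code I ran: `rdma_op_fn`, `timed_read_ns`, and the `prep`/`read` callbacks are hypothetical names, and each callback is assumed to post one RDMA operation and block until its completion is polled. Passing `prep = NULL` would give the plain read case, and passing a callback that writes the same address would give the read-after-write case.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical callback: posts one RDMA op and waits for its completion. */
typedef void (*rdma_op_fn)(void *ctx);

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts the array in place. */
static uint64_t median_ns(uint64_t *s, size_t n) {
    assert(n > 0);
    qsort(s, n, sizeof *s, cmp_u64);
    return s[n / 2];
}

/* Runs the untimed `prep` op (may be NULL) before each timed `read` op
 * and returns the median read latency over `iters` iterations. */
static uint64_t timed_read_ns(rdma_op_fn prep, rdma_op_fn read, void *ctx,
                              uint64_t *samples, size_t iters) {
    for (size_t i = 0; i < iters; i++) {
        if (prep)
            prep(ctx);              /* e.g. RDMA write to the same address */
        uint64_t t0 = now_ns();
        read(ctx);                  /* the RDMA read being measured */
        samples[i] = now_ns() - t0;
    }
    return median_ns(samples, iters);
}
```

The sketch reports the median rather than the mean so that a few outliers do not dominate. It does not evict the target cache line between iterations, so the plain read case would need that handled separately.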
In the experiment, RDMA reads on the local socket behave as expected. But the read latency on the remote socket still shows the same difference, even though we expect DDIO not to work there.
My questions are:
1. Does DDIO work for a remote socket?
2. If not, what causes the latency difference between read and read_after_write?