When Intel says that DDIO accelerates only local sockets, what exactly does it mean?
When an I/O device accesses physical memory on the remote socket, where does the DMA terminate (for reads and for writes)? Is it in the LLC local to the I/O device, or in the physical memory of the remote socket?
Dr. Bandwidth already asked this question in great detail, but I could not find an answer:
"The Intel documentation (both the reference above and other docs) says "Currently, Intel DDIO affects only local sockets". This might mean that (1) the data is always put in the LLC of the socket to which the IO device is attached, or it might mean that (2) the data is put in the LLC of the socket to which the device is attached *if* that socket is the "home" for the addresses being used. In the latter case, PCIe DMA writes to addresses "homed" on the remote socket would be written to memory (on the remote socket) and not put in any LLC."
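To make the two readings concrete, here is a toy model of where a PCIe DMA write would land under each one. It is only a sketch of the two hypotheses quoted above, not a description of what the hardware actually does; the function name `dma_write_destination` and its parameters are made up for illustration.

```python
def dma_write_destination(device_socket: int, home_socket: int, interpretation: int) -> str:
    """Toy model (hypothetical): destination of a PCIe DMA write under each reading.

    device_socket  -- socket the I/O device is attached to
    home_socket    -- socket that is "home" for the target physical address
    interpretation -- 1 or 2, matching the numbering in the quote above
    """
    if interpretation == 1:
        # Reading (1): data always goes to the LLC of the device's socket.
        return f"LLC of socket {device_socket}"
    if interpretation == 2:
        # Reading (2): LLC allocation only if the address is homed locally;
        # otherwise the write goes to memory on the home (remote) socket.
        if home_socket == device_socket:
            return f"LLC of socket {device_socket}"
        return f"DRAM of socket {home_socket}"
    raise ValueError("interpretation must be 1 or 2")


if __name__ == "__main__":
    # Device on socket 0, buffer homed on socket 0 (local) or socket 1 (remote).
    for home in (0, 1):
        for reading in (1, 2):
            print(f"home={home} reading={reading}:",
                  dma_write_destination(0, home, reading))
```

The two readings only differ in the remote case (device on socket 0, buffer homed on socket 1), which is exactly the case I am asking about.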
So which one is true, (1) or (2)? I am using a Haswell platform with two Xeon E5-2660 v4 processors.