non-cached memory impact on platform power consumption

Rony_R_Intel · ‎12-24-2013

Hi,

I am working on platform power optimization for WiFi on Haswell and Windows 8.1. I am trying to find out whether, in low-traffic scenarios, using non-cached memory for the receive buffers reduces platform power consumption. Currently, several data frames are DMA’ed to cached memory, followed by an interrupt. As a result, the core is awoken starting from the beginning of the DMA. By using non-cached memory, the core could stay in a low-power state until the interrupt. There are some questions I could not find answers to:

When Windows allocates non-cached shared memory for the driver, is it from an MTRR or a PAT region?
What is the impact of allocating non-cached memory in the driver but still using snoop-enabled DMA to this memory region (as per MTRR or PAT)? I would expect that for an MTRR memory region, the core would not be awoken.
I could not find a way to get statistics such as the number of snooped/non-snooped transactions. Is there a tool that can provide such information for Haswell on Windows 8.1?

Regards,

Rony

SergeyKostrov · ‎01-11-2014

>>...I am working on a platform power optimization for WiFi over Haswell & Windows 8.1. I am trying to find out if for low
>>traffic scenarios, using non-cached memory for the receive buffers reduces platform power consumption...

Overall, when fewer resources (memory, processes, threads, etc.) are used on a system, power consumption goes down. Right? So, I would run a series of tests to verify whether a smaller number of non-cached buffers (that is, a smaller total amount of allocated non-cached memory) reduces power consumption.

>>...Currently, several data frames are DMA’ed to a cached memory followed by an interrupt. As a result, the core is awoken
>>starting from the beginning of the DMA. By using non-cached memory, the core can stay in low power state until the interrupt...

I think it is possible to allocate a different number of buffers for different power states and Wi-Fi traffic intensities. When Wi-Fi traffic is low, there is no need to allocate too many buffers. This means some Wi-Fi traffic-activity thresholding needs to be added to your software subsystem so that it allocates a different number of buffers for each case.
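
A minimal sketch of the thresholding idea, in plain C rather than actual driver code. The tier table, the frame-rate thresholds, the buffer counts, and the names `rx_tier_t`, `RX_TIERS`, and `rx_buffers_for_rate` are all hypothetical and chosen only for illustration; real values would have to come from power measurements on the target platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical traffic tier: not taken from any real driver. */
typedef struct {
    uint32_t min_frames_per_sec; /* tier applies at or above this rate */
    uint32_t rx_buffer_count;    /* receive buffers to keep allocated */
} rx_tier_t;

/* Illustrative numbers only; tiers must be sorted by ascending threshold. */
static const rx_tier_t RX_TIERS[] = {
    {    0,   8 },  /* idle or very low traffic */
    {  100,  32 },  /* light traffic */
    { 1000, 128 },  /* sustained traffic */
};

#define RX_TIER_COUNT (sizeof(RX_TIERS) / sizeof(RX_TIERS[0]))

/* Return the buffer count of the highest tier whose threshold is met. */
static uint32_t rx_buffers_for_tiers(const rx_tier_t *tiers, size_t n,
                                     uint32_t frames_per_sec)
{
    assert(tiers != NULL && n > 0);

    uint32_t count = tiers[0].rx_buffer_count;
    for (size_t i = 1; i < n; ++i) {
        if (frames_per_sec >= tiers[i].min_frames_per_sec)
            count = tiers[i].rx_buffer_count;
    }
    return count;
}

/* Convenience wrapper over the illustrative table above. */
static uint32_t rx_buffers_for_rate(uint32_t frames_per_sec)
{
    return rx_buffers_for_tiers(RX_TIERS, RX_TIER_COUNT, frames_per_sec);
}
```

The driver would sample the received frame rate periodically and resize its buffer pool when `rx_buffers_for_rate` returns a different value. A real implementation would probably also want some hysteresis so that the pool does not get reallocated every time the rate hovers around a threshold.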

Bernard · ‎01-12-2014

I suppose that cached memory could be used more for management frames, because they probably have higher temporal locality during periods of low traffic intensity. For example, caching the struct members of a beacon frame.