I am working on platform power optimization for WiFi on Haswell with Windows 8.1. I am trying to find out whether, in low-traffic scenarios, using non-cached memory for the receive buffers reduces platform power consumption. Currently, several data frames are DMA’ed to cached memory, followed by an interrupt. As a result, the core is woken up right from the start of the DMA. With non-cached memory, the core could stay in a low-power state until the interrupt. There are some questions I could not find answers to:
- When Windows allocates non-cached shared memory for the driver, is it from an MTRR or PAT region?
- What is the impact of the driver allocating memory that is non-cached (as per MTRR or PAT) but still using snoop-enabled DMA to this memory region? I would expect that for an MTRR memory region, the core would not be woken up.
- I could not find a way to get statistics such as the number of snooped/non-snooped transactions. Is there a tool that can provide such information for Haswell on Windows 8.1?
I suppose that cached memory could still be used for management frames, which probably have higher temporal locality during periods of low traffic intensity. For example, the struct members of a beacon frame could be kept cached.
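To make the split I have in mind concrete, here is a minimal sketch in C. It only classifies a received frame by the type and subtype bits in the first octet of the 802.11 Frame Control field and picks a buffer pool accordingly. The names `rx_pool`, `choose_rx_pool`, and the two pool identifiers are hypothetical and are not part of any Windows or NDIS API. It also assumes the hardware or firmware can steer frames to different buffers by frame type, which I have not verified.

```c
#include <stddef.h>
#include <stdint.h>

/* 802.11 Frame Control, first octet:
 * bits 0-1 = protocol version, bits 2-3 = type, bits 4-7 = subtype. */
enum frame_type {
    FRAME_MGMT = 0, /* management (beacon, probe, ...) */
    FRAME_CTRL = 1, /* control (ACK, RTS, ...) */
    FRAME_DATA = 2, /* data */
    FRAME_EXT  = 3  /* reserved / extension */
};

#define SUBTYPE_BEACON 8u

/* Hypothetical pool identifiers, for illustration only. */
enum rx_pool {
    RX_POOL_CACHED,    /* assumed: write-back memory */
    RX_POOL_NONCACHED  /* assumed: non-cached memory */
};

/* Extract the frame type from the first Frame Control octet. */
enum frame_type frame_type_of(const uint8_t *frame)
{
    return (enum frame_type)((frame[0] >> 2) & 0x3u);
}

/* Extract the frame subtype from the first Frame Control octet. */
unsigned frame_subtype_of(const uint8_t *frame)
{
    return (frame[0] >> 4) & 0xFu;
}

/* A beacon is a management frame with subtype 8. */
int is_beacon(const uint8_t *frame)
{
    return frame_type_of(frame) == FRAME_MGMT &&
           frame_subtype_of(frame) == SUBTYPE_BEACON;
}

/* Assumed policy: management frames go to the cached pool,
 * everything else (and anything too short to parse) goes to
 * the non-cached pool. */
enum rx_pool choose_rx_pool(const uint8_t *frame, size_t len)
{
    if (frame == NULL || len < 2)
        return RX_POOL_NONCACHED;
    return frame_type_of(frame) == FRAME_MGMT ? RX_POOL_CACHED
                                              : RX_POOL_NONCACHED;
}
```

With this policy, a beacon (first octet `0x80`) would land in the cached pool, while data frames (for example `0x08` or QoS data `0x88`) would go to the non-cached pool.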