I read in the SDM that one can assign a memory type on a page-by-page basis. For example, suppose we specify a page as Uncacheable. When we read/write from/to this page, will the L1 cache be bypassed or do all the accesses still go through the cache? For me this is not clear from the documentation.
The architectures I am working on are Sandy Bridge and Skylake.
Memory type is based on a combination of MTRRs and PATs. MTRRs are a limited resource typically applied to large address ranges, while PATs allow the PCD and PWT bits in every page table entry to be used to specify the memory type for that specific page. Table 11-7 in Section 11.5.2.2 of Volume 3 of the Intel Architectures SW Developer's Manual (document 325384, revision 060, September 2016) shows how the MTRR and PAT values are combined to determine the effective memory type.
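To make the PAT half of this concrete, here is a minimal Python sketch of how the page-table bits select a PAT entry. It assumes a 4 KiB page table entry (PWT in bit 3, PCD in bit 4, PAT in bit 7; large-page entries keep the PAT bit elsewhere) and the power-on default value of the IA32_PAT MSR, which an operating system is free to reprogram. The function name `pat_memory_type` is made up for illustration. The result is only the PAT-side type; it still has to be combined with the MTRR type per Table 11-7 to get the effective type.

```python
# Memory-type encodings used in each 8-bit field of the IA32_PAT MSR.
PAT_ENCODING = {0x00: "UC", 0x01: "WC", 0x04: "WT",
                0x05: "WP", 0x06: "WB", 0x07: "UC-"}

# Power-on default of IA32_PAT (assumption: the OS has not reprogrammed it).
DEFAULT_IA32_PAT = 0x0007040600070406


def pat_memory_type(pte: int, ia32_pat: int = DEFAULT_IA32_PAT) -> str:
    """Return the PAT-selected memory type for a 4 KiB page table entry."""
    pwt = (pte >> 3) & 1   # page-level write-through
    pcd = (pte >> 4) & 1   # page-level cache disable
    pat = (pte >> 7) & 1   # PAT bit (bit 7 only in 4 KiB PTEs)
    index = (pat << 2) | (pcd << 1) | pwt          # selects PA0..PA7
    encoding = (ia32_pat >> (8 * index)) & 0x07    # low 3 bits of that entry
    return PAT_ENCODING.get(encoding, "reserved")


if __name__ == "__main__":
    for bits in (0x00, 0x08, 0x10, 0x18):
        print(f"PTE flags {bits:#04x} -> {pat_memory_type(bits)}")
```

With the default PAT, setting both PCD and PWT selects UC, while PCD alone selects UC-, which the MTRRs are allowed to override.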
Loads to pages that are not cacheable for reads (effective types UC and WC) will generate uncached load transactions for the specific operand sizes requested. The requested bits will be brought directly into the target register and no data will be put into the cache. If the cache line containing the address requested is already in the cache, the behavior may depend on whether the effective type came from the MTRRs or from the PATs, as described in the footnotes to Table 11-7.
Stores to pages that are not cacheable for writes (effective types UC, WC, WP, WT) will generate uncached store transactions, but the details differ between the types. If the cache line containing the address requested is already in the cache, the behavior again depends on whether the effective type came from the MTRRs or the PATs, as well as which of the four effective types is used for the new store access.
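The load/store split described above can be written down as a small lookup. This is only a sketch that restates the two lists from this answer; the names `UNCACHED_FOR_LOADS`, `UNCACHED_FOR_STORES`, and `generates_uncached_transaction` are invented for illustration, and it deliberately ignores the case where the line is already in the cache, since that depends on the footnotes to Table 11-7.

```python
# Effective memory types that can come out of the MTRR/PAT combination.
EFFECTIVE_TYPES = frozenset({"UC", "WC", "WT", "WP", "WB"})

# Types for which loads are not cacheable.
UNCACHED_FOR_LOADS = frozenset({"UC", "WC"})

# Types for which stores are not cacheable (details differ per type).
UNCACHED_FOR_STORES = frozenset({"UC", "WC", "WP", "WT"})


def generates_uncached_transaction(effective_type: str, is_store: bool) -> bool:
    """True if an access of this kind produces an uncached transaction.

    Assumes the target line is not already present in the cache.
    """
    if effective_type not in EFFECTIVE_TYPES:
        raise ValueError(f"unknown effective memory type: {effective_type!r}")
    table = UNCACHED_FOR_STORES if is_store else UNCACHED_FOR_LOADS
    return effective_type in table


if __name__ == "__main__":
    for mem_type in sorted(EFFECTIVE_TYPES):
        load = generates_uncached_transaction(mem_type, is_store=False)
        store = generates_uncached_transaction(mem_type, is_store=True)
        print(f"{mem_type}: uncached load={load}, uncached store={store}")
```

So for the UC page in the question, both loads and stores come out as uncached; WB is the only type where both are cached.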
Some combinations of previous state, new state, and transaction type are sufficiently inconsistent that they could generate a protocol error, so the SW Developer's Manual documents certain procedures that must be followed when changing cache attributes. For example, Section 11.11.8 provides the sequence of operations required to safely change MTRR values on a multiprocessor system.