Solved: What's the relationship between page placement, cache partitions, and the OS?

kewenpan · ‎04-29-2010

I don't understand the exact relationship between virtual page placement, physical page placement, cache partitions, and the OS.

I have learned that the OS can control the placement of virtual pages onto physical pages.

The OS can also improve cache performance through page placement algorithms. I am a bit confused.

Can the OS also control cache partitions?

jimdempseyatthecove · ‎05-02-2010

(depending on cache architecture, and memory management architecture)

Virtual memory is mapped by way of page tables. The memory management architecture describes the page table layout. For most Intel processors the page table is hierarchical (e.g. a 3-level tree). A virtual address is translated to a physical address by traversing this tree. This traversal is expensive (in terms of time). To reduce this expense and improve performance, the designers of the memory management hardware add a feature that remembers recently made page table traversals. These remembered translations are stored in what is often called a Translation Look-aside Buffer (TLB). When a memory request is mapped by an entry in the TLB, the page tables need not be consulted. In essence, the TLB is a special page table cache (not labeled as L1, L2, L3, ...). The TLB has a limited number of entries. (See http://en.wikipedia.org/wiki/Translation_lookaside_buffer for a better description.)
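To make the page table traversal and the TLB shortcut concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the 3-level tree with 9 index bits per level, the 4 KiB page size, the LRU replacement, and the names `ToyMMU`, `map_page` and `translate` are all hypothetical choices for illustration and do not describe any particular Intel processor.

```python
from collections import OrderedDict

PAGE_SHIFT = 12   # assumed 4 KiB pages
INDEX_BITS = 9    # assumed index width per table level
LEVELS = 3        # depth of the toy page table tree


def split(vaddr):
    """Split a virtual address into per-level indices (top level first) and a page offset."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    vpn = vaddr >> PAGE_SHIFT
    mask = (1 << INDEX_BITS) - 1
    indices = [(vpn >> (INDEX_BITS * level)) & mask
               for level in reversed(range(LEVELS))]
    return indices, offset


class ToyMMU:
    """Hypothetical MMU: nested dicts as the page table, a small LRU dict as the TLB."""

    def __init__(self, tlb_entries=4):
        self.root = {}
        self.tlb = OrderedDict()   # virtual page number -> physical frame number
        self.tlb_entries = tlb_entries
        self.walks = 0             # number of full page table traversals

    def map_page(self, vaddr, frame):
        """Install a mapping, creating intermediate tables as needed."""
        indices, _ = split(vaddr)
        node = self.root
        for idx in indices[:-1]:
            node = node.setdefault(idx, {})
        node[indices[-1]] = frame

    def _walk(self, vaddr):
        """Traverse the tree; a KeyError here plays the role of a page fault."""
        self.walks += 1
        indices, _ = split(vaddr)
        node = self.root
        for idx in indices:
            node = node[idx]
        return node

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)          # TLB hit: no traversal needed
            frame = self.tlb[vpn]
        else:
            frame = self._walk(vaddr)          # TLB miss: expensive traversal
            self.tlb[vpn] = frame
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)   # evict least recently used entry
        return (frame << PAGE_SHIFT) | offset


mmu = ToyMMU(tlb_entries=2)
mmu.map_page(0x1000, frame=7)
print(hex(mmu.translate(0x1234)))  # first access walks the tree
print(hex(mmu.translate(0x1238)))  # same page: served from the TLB
print(mmu.walks)
```

Two accesses to the same page cost a single traversal in this model; touching more distinct pages than the TLB has entries forces traversals again.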

The TLB may be placed between the instruction encoder and the Ln caches or between the Ln caches and physical memory (this is a design choice). Caching efficiency depends not only on the cache size/speed but also on the number of pages mappable by the TLB. If the memory locations are too dispersed, you run out of TLB entries and the TLB must be updated, which costs another page table traversal. When the programmer organizes the data (and code) such that the working set occupies the fewest pages, the demands on the TLB are reduced.
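The effect of data layout on TLB demand can be estimated by counting how many distinct pages an access pattern touches. The following is a rough sketch, assuming 4 KiB pages and a hypothetical 64-entry TLB; the helper names `pages_touched` and `tlb_reach` are made up for this illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_touched(addresses, page_size=PAGE_SIZE):
    """Number of distinct pages covered by a sequence of byte addresses."""
    return len({addr // page_size for addr in addresses})


def tlb_reach(entries, page_size=PAGE_SIZE):
    """Bytes of memory that can be mapped at once by a TLB with `entries` entries."""
    return entries * page_size


# 1024 eight-byte elements stored contiguously ...
packed = [i * 8 for i in range(1024)]
# ... versus the same 1024 elements spread one per page.
scattered = [i * PAGE_SIZE for i in range(1024)]

print(pages_touched(packed))     # 2
print(pages_touched(scattered))  # 1024
print(tlb_reach(64))             # 262144 bytes for an assumed 64-entry TLB
```

Both layouts hold the same amount of data, but the scattered one needs far more translations than the assumed TLB can hold at once, so it keeps falling back to page table traversals.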

Jim Dempsey

Grant_H_Intel · ‎05-05-2010

I don't understand the exact relationship between virtual page placement, physical page placement, cache partitions, and the OS. I have learned that the OS can control the placement of virtual pages onto physical pages.

Since you are posting this in the context of threading on Intel Parallel architectures, are you talking about local vs. non-local memory page placement on Nehalem? If so, consult your OS documentation about "NUMA" (non-uniform memory access) or page placement policies. A common paradigm is "first-touch", where the physical page is allocated on the node/socket/chip where the memory is first written.
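The first-touch idea can be illustrated with a small simulation. This is a toy model only, not an OS interface: the class `FirstTouchModel`, the function `remote_fraction`, the two-node layout, and the 4 KiB page size are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class FirstTouchModel:
    """Toy first-touch policy: a page's home node is the node that writes it first."""

    def __init__(self):
        self.home = {}  # virtual page number -> node id

    def write(self, vaddr, node):
        # setdefault records the node only on the first write to the page
        return self.home.setdefault(vaddr // PAGE_SIZE, node)

    def is_local(self, vaddr, node):
        return self.home.get(vaddr // PAGE_SIZE) == node


def remote_fraction(init_node_of, work_node_of, n_pages):
    """Fraction of pages whose worker runs on a different node than the page's home."""
    model = FirstTouchModel()
    for page in range(n_pages):
        model.write(page * PAGE_SIZE, init_node_of(page))
    remote = sum(not model.is_local(page * PAGE_SIZE, work_node_of(page))
                 for page in range(n_pages))
    return remote / n_pages


def worker(page):
    """Hypothetical split: node 0 works on pages 0-3, node 1 on pages 4-7."""
    return 0 if page < 4 else 1


# Serial initialization on node 0, then parallel work on both nodes.
print(remote_fraction(lambda page: 0, worker, 8))  # 0.5
# Each node initializes the pages it will later work on.
print(remote_fraction(worker, worker, 8))          # 0.0
```

In this model, initializing the data with the same thread-to-data assignment that the later computation uses keeps every page local to the node that works on it.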

The OS can also improve cache performance through page placement algorithms. I am a bit confused.

In the case of NUMA architectures, the OS (and the user application) can generally improve memory performance through page placement algorithms. The page placement policy determines the home memory location of each virtual page, which affects all subsequent accesses to that page. I am not aware of OS controls available to users that affect the cache mapping/partitioning algorithms. These policies are mostly determined by the hardware (see Jim's post above).