I am working on a project that manipulates the physical memory addresses assigned to a process in order to reduce cache-line eviction in the shared third-level (L3) cache. This essentially comes down to partitioning the L3 by software means. For this project I am using a single Xeon X3430 CPU with each process pinned to a separate core.
The design has been implemented according to the conventional memory-to-cache-line mapping (directly decomposing the physical address into tag, index, and block offset), but it is still showing regular rates of eviction in the cache.
Are there any variations from a conventional cache mapping that the Nehalem architecture would be using? Or perhaps other properties of the X3430 that could be affecting the L3 cache? I am aware that Sandy Bridge divides its L3 into core-specific slices, and I specifically avoided it because the mapping would be more complex. I have not encountered any such information about Nehalem.
Any insight or suggestions would be appreciated. Thanks for the help.
Pardon, but I seem to have accidentally posted this in a rather odd place. I would appreciate it if a moderator would be kind enough to move it to a more appropriate location.