KNL - CHA addresses

Khan__Lincon · ‎07-01-2019

Hello everyone,

My question is quite simple: is it possible to know which addresses are hashed to a given CHA on KNL? I have looked for the hash function but have not been able to find it, so I assume it is undocumented.

Best regards.

McCalpinJohn · ‎07-02-2019

Definitely unpublished, and possibly dependent on model number and/or BIOS settings....

Some information is in https://www.ixpug.org/documents/1524216121knl_skx_topology_coherence_2018-03-23.pptx and some in https://www.ixpug.org/components/com_solutionlibrary/assets/documents/1538092216-IXPUG_Fall_Conf_2018_paper_20%20-%20John%20McCalpin.pdf

All of my testing was done on Xeon Phi 7250 processors (68 cores). For this model, all 38 CHAs are active, but I don't know if that is true for other Xeon Phi x200 models.

My code that maps addresses to CHAs does not run very well on KNLs, and I have not been able to figure out why. (The SKX version is based on https://github.com/jdmccalpin/SKX-SF-Conflicts/blob/master/SnoopFilterMapper.c.) My current guess is that OS interference is high on these systems, and that I could get better results in single-user mode.

It only takes a few seconds to identify which CHA is used for each of the 32768 cache line addresses in a 2 MiB page, so if you want to do experiments by CHA location, it is easiest to just check the mapping of the memory that you have allocated in each test.
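As a rough illustration of the bookkeeping described above (not the measurement itself), here is a minimal Python sketch that enumerates the 32768 cache line addresses in one 2 MiB page and groups them by CHA. The `lookup_cha` callback and the `fake_lookup` placeholder are hypothetical names introduced for this example; a real lookup would come from a counter-based measurement such as the SnoopFilterMapper.c approach, and the modulo placeholder here is NOT the hardware mapping.

```python
from collections import defaultdict

LINE_BYTES = 64                      # cache line size
PAGE_BYTES = 2 * 1024 * 1024         # one 2 MiB page
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES   # 32768 lines per page


def line_addresses(page_base):
    """Yield the address of every cache line in one 2 MiB page."""
    if page_base % PAGE_BYTES != 0:
        raise ValueError("page_base must be 2 MiB aligned")
    for i in range(LINES_PER_PAGE):
        yield page_base + i * LINE_BYTES


def group_lines_by_cha(page_base, lookup_cha):
    """Group line addresses by the CHA number that lookup_cha reports."""
    by_cha = defaultdict(list)
    for addr in line_addresses(page_base):
        by_cha[lookup_cha(addr)].append(addr)
    return dict(by_cha)


def fake_lookup(addr, num_chas=38):
    """Placeholder for a real measurement; NOT the hardware hash."""
    return (addr // LINE_BYTES) % num_chas


# Hypothetical page base address, chosen only because it is 2 MiB aligned.
groups = group_lines_by_cha(0x40000000, fake_lookup)
```

With a table like `groups` in hand, a test can pick only the lines that map to the CHA of interest without needing the hash formula.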

It is theoretically possible to reverse engineer the hash function (and the references above include the hash function for the Xeon Platinum 8160 processor), but it is a lot of work. It seems likely that all Intel processors use variants of a single base hash function, with different XOR mask coefficients (and maybe a few other parameters) that depend on the CHA count. Deriving the "master" hash function would require deriving the specific hash functions for all CHA counts and then looking for a function of which each of the model-specific hash functions is a special case. There is no guarantee that such a function could be found, and it would require a huge amount of work before one could reasonably declare defeat.... Given how easy it is to measure the address-to-CHA mapping, it is not obvious that there is a lot of benefit in knowing the formula.
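To make the "XOR mask coefficients" idea concrete, here is a minimal Python sketch of the general shape of such a hash: each output bit is the parity (XOR reduction) of the address bits selected by one mask. The masks in `EXAMPLE_MASKS` are arbitrary values invented for illustration and do not correspond to any Intel processor. A hash of this form can only produce a power-of-two number of outputs, so a part with 38 CHAs would need some additional step that is not shown here.

```python
def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def xor_hash(addr, masks):
    """Output bit i is the XOR of the address bits selected by masks[i]."""
    out = 0
    for bit, mask in enumerate(masks):
        out |= parity(addr & mask) << bit
    return out


# Arbitrary illustrative masks (assumption): no bits below bit 6, because
# the low 6 bits only select a byte within a 64-byte cache line.
EXAMPLE_MASKS = [0x12480, 0x24900, 0x49240]

a, b = 0x1234_5640, 0x0FED_CB80
# XOR-mask hashes are linear over XOR: hash(a ^ b) == hash(a) ^ hash(b).
linear = xor_hash(a ^ b, EXAMPLE_MASKS) == (
    xor_hash(a, EXAMPLE_MASKS) ^ xor_hash(b, EXAMPLE_MASKS)
)
```

The linearity property checked at the end is what makes recovering masks from measured mappings feasible in principle: flipping a single address bit changes the output in the same way regardless of the other bits.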

BTW, the presentations above only provide the "slice select hash". There is also a "set select hash" that maps addresses to "sets" within each CHA and/or L3 cache slice. Reversing the "set select hash" is much more challenging than reversing the "slice select hash", and I don't know of any successful efforts in this direction. There has been some work in automating finding collisions, but I don't know of anyone who has attempted to gather enough data to derive formulas....