The Xeon Gold 6148 is a 20-core/40-thread processor. My measurements of execution time for parallel multi-threaded applications show unusual spikes for certain numbers of threads (each pinned to a particular core). I am looking for a block diagram of the processor die, which should be based on the XCC SoC die. I would also like to know how the cores are numbered (left to right or top to bottom). Additionally, I would like an estimate of typical core size (in mm) and typical propagation delays.
Intel does not publish the methodology used to number the cores or the methodology used to number the L3/CHA slices (which are numbered independently of the cores).
There is a register in PCI configuration space that provides a bit map of the enabled L3/CHA slices, but there is no key to tell you which location on the die each bit corresponds to. On my Xeon Gold 6148, both sockets have the same pattern of enabled slices:
# setpci -s 17:1e.3 0x9c.l
# setpci -s 85:1e.3 0x9c.l
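As a rough illustration of how such a bit map can be read, here is a minimal Python sketch that turns the hexadecimal value printed by `setpci` into a list of set bit positions. The helper names `parse_setpci_output` and `enabled_slices` are made up for this example, and the sample value is hypothetical, not a reading from a Xeon Gold 6148. The sketch only lists which bits are set; it says nothing about where on the die each bit lives.

```python
def parse_setpci_output(text: str) -> int:
    """Convert the hex string printed by setpci (e.g. '0000f0f5') to an integer."""
    return int(text.strip(), 16)


def enabled_slices(mask: int, width: int = 32) -> list[int]:
    """Return the bit positions that are set in the low `width` bits of `mask`.

    width=32 matches the 32-bit (.l) read; the number of bits that are
    meaningful on a given die is an assumption left to the caller.
    """
    return [bit for bit in range(width) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Hypothetical value for illustration only, NOT taken from real hardware.
    sample = "0000f0f5"
    slices = enabled_slices(parse_setpci_output(sample))
    print(f"{len(slices)} slices enabled at bit positions {slices}")
```

Running the same decoding on the values from both sockets and comparing the resulting lists is a quick way to confirm that the two sockets have the same pattern of enabled slices.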
Using performance counter measurements, I was able to determine how the core and L3/CHA numbers mapped to the physical layout of the chip for my Xeon Platinum 8160 (24-core) nodes. My guess is that the interpretation of the L3/CHA bit map is the same on all XCC dies, and that only the mapping of core numbers is different. My presentation on this topic shows the mapping of L3/CHA and core numbers to die locations in slides 11-17.