
Topology of Cascade Lake

aozcan

Hi,

 

I am trying to work out the topology of my Cascade Lake server. As a first step, I want to find out which tiles are disabled; once I know that, working out the CHA numbering is trivial. To do this, I read the CAPID6 register. Relevant link on how to do this: https://community.intel.com/t5/Software-Tuning-Performance/Understanding-PCICFG-space-information/td-p/1138820

 

lspci | grep :1e.3
16:1e.3 System peripheral: Intel Corporation Sky Lake-E PCU Registers (rev 07)


sudo setpci -s 16:1e.3 0x9c.l
06ffff77

 

 

 

06ffff77 in hex equals 0110111111111111111101110111 in binary (28 bits, 24 of them set).
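The conversion above can be checked with a short Python sketch. It only decodes the value already printed by setpci; it does not read the register itself. The helper name `decode_capid6` and the 28-bit width are assumptions for illustration (the width is taken from the 28-digit binary string in this post), and how each bit position maps to a physical tile location is not established here.

```python
def decode_capid6(value, width=28):
    """Split the low `width` bits of a CAPID6 value into set and clear positions.

    `width=28` is an assumption taken from the 28-digit binary string in the post.
    """
    enabled = [bit for bit in range(width) if (value >> bit) & 1]
    disabled = [bit for bit in range(width) if not (value >> bit) & 1]
    return enabled, disabled


# Value printed by `setpci -s 16:1e.3 0x9c.l` in the post above.
capid6 = int("06ffff77", 16)
enabled, disabled = decode_capid6(capid6)

print(f"binary:   {capid6:028b}")
print(f"set bits: {len(enabled)}")
print(f"clear bit positions (bit 0 = least significant): {disabled}")
```

Counting the set bits this way gives the number of enabled entries in the map directly, without reading them off the binary string by hand.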

 

[Figure: topology diagram inferred from the CAPID6 bit pattern (image not reproduced)]

So, my conclusion was that the topology would look like this. However, when I inspected the output of the lscpu command, I got:

 

lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6256 CPU @ 3.60GHz
Stepping: 7
CPU MHz: 4299.829
CPU max MHz: 4500.0000
CPU min MHz: 1200.0000
BogoMIPS: 7200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 33792K
NUMA node0 CPU(s): 0-11

 

Based on this topology, I would expect to have 24 cores, but in reality I only have 12. What is wrong with my approach? I would like to hear your thoughts on this, @McCalpinJohn.

 

Thanks and best regards

McCalpinJohn

The CAPID6 register is a map of the enabled CHA/SF/LLC slices, not a map of the enabled cores.

The Xeon Gold 6256 processor has 33 MiB of L3 cache, which corresponds to 24 slices of 1.375 MiB each.

In my testing across a lot of platforms, I have never seen cores enabled at a mesh location that did not have an enabled CHA/SF/LLC slice, so on your processor there should be 24 enabled CHA/SF/LLC slices, 12 of which also have enabled cores.
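As a rough cross-check of the numbers in this thread, the following Python sketch compares the slice count implied by the L3 size with the number of set bits in the CAPID6 value. The inputs are copied from the lscpu and setpci output in the question; the variable names are illustrative only.

```python
# Inputs copied from the question; nothing here is read from hardware.
L3_KIB = 33792             # "L3 cache: 33792K" from lscpu
SLICE_KIB = 1.375 * 1024   # 1.375 MiB per LLC slice = 1408 KiB
CORES = 12                 # "Core(s) per socket: 12" from lscpu
CAPID6 = 0x06FFFF77        # value returned by setpci

# Number of slices implied by the total L3 size.
slices_from_l3 = L3_KIB / SLICE_KIB

# Number of slices implied by the set bits in CAPID6.
slices_from_capid6 = bin(CAPID6).count("1")

# Mesh locations with an enabled CHA/SF/LLC slice but no enabled core.
cha_only = slices_from_capid6 - CORES

print(f"slices from L3 size: {slices_from_l3:g}")
print(f"slices from CAPID6:  {slices_from_capid6}")
print(f"slices without core: {cha_only}")
```

Both estimates agree on the slice count, which is consistent with CAPID6 describing CHA/SF/LLC slices rather than cores.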

The methodology described in http://dx.doi.org/10.26153/tsw/13119 can be used to determine the CHA/SF/LLC slice number co-located with each core.  
