Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

How to discover which caches (L1, L2, L3) are shared by which HW threads (cores)?

gilvannco
Beginner

Hi,

How would you discover the information described in the subject line on Linux? I think that parsing /proc/cpuinfo gives much of the information, but not all of it. For example, on a quad-core processor it does not seem possible to discover which cores form the pairs that share an L2 cache, right?

I assume that in a quad-core processor the HT threads of a core share its L1 cache, two pairs of cores each share an L2 cache (so there are two distinct L2 cache units), and all 4 cores share the same L3 cache. Is this correct?
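
To make concrete what /proc/cpuinfo does provide, here is a minimal sketch that groups logical CPUs by package and core, which is enough to find HT siblings but says nothing about cache sharing. It assumes the x86 layout of /proc/cpuinfo (blank-line-separated blocks with "processor", "physical id" and "core id" fields); the function name ht_siblings is hypothetical.

```python
from collections import defaultdict


def ht_siblings(cpuinfo_text):
    """Group logical CPUs by (physical id, core id) from /proc/cpuinfo text."""
    cores = defaultdict(list)
    # Each logical CPU is described by one block; blocks are separated by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        # Assumption: if "physical id" or "core id" is missing, treat every
        # logical CPU as its own core in package 0.
        package = int(fields.get("physical id", 0))
        core = int(fields.get("core id", fields["processor"]))
        cores[(package, core)].append(int(fields["processor"]))
    return dict(cores)
```

Calling ht_siblings(open("/proc/cpuinfo").read()) returns a dictionary keyed by (package, core); any entry whose list holds more than one logical CPU is a core with HT siblings.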

Regards,

Gilian

Dmitry_Vyukov
Valued Contributor I

HT threads definitely share L1.

L3 is shared between all cores.

But AFAIK L2 is private to each core, i.e. 4 cores and 4 L2 caches. (I assume that you mean Core i7 processors, since you mention L3.) What you describe, 2 L2 caches each shared by a pair of cores, was the layout of the Core 2 Quad (Q6600) processors.

A quick way to verify the assumption is to download the CPU-Z utility and check how many L2 caches it reports. If it reports 4, they are not shared; if it reports 2, each one is shared by a pair of cores.

Also take a look at:

http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/

The article describes how to determine cache parameters from the output of the CPUID instruction.
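
On Linux, the kernel also exposes per-CPU cache attributes under /sys/devices/system/cpu/cpuN/cache/indexM/ (files "level", "type" and "shared_cpu_list"), so the same question can be answered without issuing CPUID yourself. A minimal sketch, assuming the running kernel provides those files; the function name cache_sharing and its sysfs_root parameter are hypothetical and only there to keep the example testable:

```python
import glob
import os


def cache_sharing(sysfs_root="/sys/devices/system/cpu"):
    """Return {(level, type): set of shared_cpu_list strings} read from sysfs."""
    result = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index[0-9]*")
    for index_dir in glob.glob(pattern):

        def read(name):
            with open(os.path.join(index_dir, name)) as f:
                return f.read().strip()

        try:
            key = (int(read("level")), read("type"))
            shared = read("shared_cpu_list")
        except OSError:
            continue  # attribute not provided by this kernel; skip the entry
        # CPUs that share one cache report the same list, so each distinct
        # string stands for one physical cache instance.
        result.setdefault(key, set()).add(shared)
    return result
```

The number of distinct strings per (level, type) key is the number of cache instances at that level: four entries for L2 on a four-core machine would mean private L2 caches, two entries would mean each L2 is shared by a pair of cores.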

TimP
Black Belt

The Linux application irqbalance (run as 'irqbalance --debug') and the cpuinfo command that ships with Intel MPI both attempt to report cache sharing information:

[tim@tim-blfd bin64]$ ./cpuinfo
Genuine Intel Processor (Intel64)
===== Processor composition =====
Processors(CPUs) : 4
Packages(sockets) : 1
Cores per package : 4
Threads per core : 1
===== Processor identification =====
Processor    Thread Id.    Core Id.    Package Id.
0            0             0           0
1            0             1           0
2            0             2           0
3            0             3           0
===== Placement on packages =====
Package Id.    Core Id.    Processors
0              0,1,2,3     0,1,2,3
===== Cache sharing =====
Cache    Size      Processors
L1       32 KB     no sharing
L2       256 KB    no sharing
L3       8 MB      (0,1,2,3)

[tim@tim-blfd bin64]$ /usr/sbin/irqbalance --debug
Package 0: cpu mask is 0000000f (workload 0)
Cache domain 3: cpu mask is 00000008 (workload 0)
CPU number 3 (workload 0)
Cache domain 2: cpu mask is 00000004 (workload 0)
CPU number 2 (workload 0)
Cache domain 1: cpu mask is 00000002 (workload 0)
CPU number 1 (workload 0)
Cache domain 0: cpu mask is 00000001 (workload 0)
CPU number 0 (workload 0)
Interrupt 98 (class ethernet) has workload 8
Interrupt 0 (class timer) has workload 2002
Interrupt 82 (class storage) has workload 22
Interrupt 58 (class storage) has workload 5
Interrupt 177 (class storage) has workload 0
Interrupt 74 (class legacy) has workload 0



---------------------------------------------------------------
IRQ delta is 3062
Package 0: cpu mask is 0000000f (workload 10260)
Cache domain 3: cpu mask is 00000008 (workload 25)
CPU number 3 (workload 24)
Interrupt 98 (ethernet/23)
Interrupt 177 (storage/0)
Cache domain 2: cpu mask is 00000004 (workload 96)
CPU number 2 (workload 0)
Interrupt 82 (storage/95)
Cache domain 1: cpu mask is 00000002 (workload 5)
CPU number 1 (workload 0)
Interrupt 58 (storage/3)
Interrupt 74 (legacy/0)
Cache domain 0: cpu mask is 00000001 (workload 10134)
CPU number 0 (workload 10134)
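
The "cpu mask" values in the irqbalance output are hexadecimal bitmasks in which bit N stands for CPU number N. A small sketch for decoding them (the helper name cpus_in_mask is hypothetical):

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hexadecimal cpu mask."""
    # Some tools print long masks as comma-separated 32-bit groups; drop the commas.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
```

Applied to the output above, the package mask 0000000f decodes to CPUs 0 to 3, while each cache domain mask (00000001, 00000002, 00000004, 00000008) contains exactly one CPU.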
gilvannco
Beginner

Thank you very much for the answers. I got the following output from cpuinfo:

Intel Xeon Processor (Intel64 Harpertown)
===== Processor composition =====
Processors(CPUs) : 8
Packages(sockets) : 2
Cores per package : 4
Threads per core : 1
===== Processor identification =====
Processor    Thread Id.    Core Id.    Package Id.
0            0             0           0
1            0             0           1
2            0             2           0
3            0             2           1
4            0             1           0
5            0             1           1
6            0             3           0
7            0             3           1
===== Placement on packages =====
Package Id.    Core Id.    Processors
0              0,2,1,3     0,2,4,6
1              0,2,1,3     1,3,5,7
===== Cache sharing =====
Cache    Size     Processors
L1       32 KB    no sharing
L2       6 MB     (0,4)(1,5)(2,6)(3,7)

So, it looks like no L3 cache? And four L2 "banks", each shared by two cores?

Regards,

Gilian

TimP
Black Belt
Yes, that's the usual layout for Harpertown.
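
If you want to consume the "Cache sharing" section in a script, the Processors column can be split into groups. This is only a sketch that assumes the text format shown in this thread, which is not a documented interface; the function name sharing_groups is hypothetical:

```python
import re


def sharing_groups(processors_field):
    """Parse a field such as '(0,4)(1,5)' into [[0, 4], [1, 5]]."""
    # Each parenthesised group lists the logical processors that share one cache.
    # A field without parentheses, such as 'no sharing', yields an empty list.
    return [
        [int(n) for n in group.split(",")]
        for group in re.findall(r"\(([\d,\s]+)\)", processors_field)
    ]
```

For the Harpertown output above, the L2 field yields four groups of two processors each, i.e. four L2 caches across the two packages, each shared by a pair of cores.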