Normally I wouldn't need to deal with this question. I would simply specify a number of threads and let them loose on the system. However, I need to pin processes to specific sockets for tests I'm doing.
I have a dual-socket system with two Xeon E5-2650 v4 (Broadwell) processors, each with 12 cores. I had assumed that the first twelve cores (0-11) were on Socket 0 and the other twelve (12-23) were on Socket 1, but my results indicate that this assumption is incorrect. Hyperthreading is turned OFF. Looking at the Linux /proc/cpuinfo file, I now think, based on the "physical id" field, that cores 0-5 and 12-17 are on Socket 0 and cores 6-11 and 18-23 are on Socket 1.
Can someone confirm that this "interleaving" of the core numbers is indeed the case on such a system? Or tell me how I can confirm or refute this new assumption? I've done internet searches, but can only find pages detailing the physical numbers or the performance. Everyone else is just launching threads and letting them run as they will on the system for their tests, I guess.
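One way to check the "physical id" reading without eyeballing the whole file is to group the logical processor numbers by that field. The following is only a minimal sketch: the function names `read_cpuinfo` and `cpus_by_socket` are made up for illustration, and it assumes the x86 Linux layout of /proc/cpuinfo, where each "processor" line is followed by a "physical id" line for the same logical processor.

```python
from collections import defaultdict


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the raw text of /proc/cpuinfo (Linux only)."""
    with open(path) as f:
        return f.read()


def cpus_by_socket(cpuinfo_text):
    """Group logical processor numbers by their "physical id" value."""
    sockets = defaultdict(list)
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor stanzas
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpu = int(value)          # start of a new stanza
        elif key == "physical id" and cpu is not None:
            sockets[int(value)].append(cpu)
    return {s: sorted(c) for s, c in sorted(sockets.items())}
```

Calling `cpus_by_socket(read_cpuinfo())` on the machine in question should return one list of logical processor numbers per socket, which can then be compared against the 0-5/12-17 and 6-11/18-23 split suspected above.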
I have a Haswell-based system that does the same thing when HyperThreading is disabled. It is completely perverse, and I have never been able to get a completely clear answer on why it happens. My interpretation is that the numbering scheme was set up by someone who assumed that HyperThreading would always be enabled, and never considered what their code would do if HyperThreading were disabled (or unavailable, as on a small number of models). I was told by the server vendor that their BIOS people blamed the OS for interpreting the ACPI tables as an ordered list, rather than as a set, but I don't understand enough of the BIOS-to-OS interface to evaluate whether this is a reasonable claim.
On the Haswell system that shows this behavior, enabling HyperThreading puts logical processors 0-11 on the 12 cores of socket 0, 12-23 on the 12 cores of socket 1, 24-35 on the second thread context of the 12 cores of socket 0, and 36-47 on the second thread context of the 12 cores of socket 1. This "block-distributed" core numbering is a reasonable scheme, and I find it the easiest to work with.
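To make the "block-distributed" layout concrete, here is a small sketch that decodes a logical processor number under that scheme. The function name `block_distributed` and its default arguments are hypothetical, chosen to match the 2-socket, 12-core example; the core index within a socket is an assumption (that cores are numbered in order inside each block), not something the numbering itself guarantees.

```python
def block_distributed(cpu, sockets=2, cores_per_socket=12):
    """Decode a logical processor number into (socket, core, thread).

    Assumes the block-distributed scheme: all first thread contexts are
    numbered socket by socket, then all second thread contexts.
    """
    cores_total = sockets * cores_per_socket
    thread, rest = divmod(cpu, cores_total)         # which thread context
    socket, core = divmod(rest, cores_per_socket)   # which block, then offset
    return socket, core, thread
```

With the defaults, logical processors 0-11 decode to socket 0, 12-23 to socket 1, and 24-47 repeat the same pattern with thread context 1.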
What I find more perverse is that about half of the systems we purchase use the "block-distributed" core numbering scheme and half use a round-robin numbering scheme (even logical processor numbers on socket 0, odd logical processor numbers on socket 1). Different server models from the same vendor using the same processor have different numbering schemes, and I have never seen a system option that allowed the numbering scheme to be switched. This continues to cause increased support cost and frustration for both the users and the support staff.
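Since the scheme cannot be predicted from the vendor or processor model, a job script may want to detect it at run time. This is a rough sketch only: `classify_numbering` is a made-up helper that takes a mapping from logical processor number to socket (however you obtained it) and reports whether consecutive numbers alternate between sockets or come in equal-sized blocks.

```python
from itertools import groupby


def classify_numbering(cpu_to_socket):
    """Describe how consecutive logical processor numbers map to sockets."""
    sockets = [cpu_to_socket[c] for c in sorted(cpu_to_socket)]
    if len(set(sockets)) < 2:
        return "fewer than two sockets"
    # Length of each run of consecutive processors on the same socket.
    run_lengths = {len(list(group)) for _, group in groupby(sockets)}
    if len(run_lengths) != 1:
        return "irregular"
    run = run_lengths.pop()
    return "round-robin" if run == 1 else f"blocks of {run}"
```

On a 2 x 12-core system with HyperThreading off, plain block-distributed numbering would come back as "blocks of 12", even/odd numbering as "round-robin", and the interleaved layout from the original question as "blocks of 6".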
I did remember some talk about the core labeling when I was still at Intel. But much of that esoteric knowledge is lost or inaccessible now.
We did confirm my suspicion by cracking open the server box, running CPU-intensive computations on chosen cores, and then touching the cooling fins on each processor to see which one heated up. Not very efficient, but it got the job done.
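For anyone who wants to repeat the experiment, the software half can be sketched roughly like this. It is not the program we actually ran; `burn_on_cpu` is a hypothetical helper, and it relies on `os.sched_setaffinity`, which Python only provides on Linux and some other Unix platforms.

```python
import os
import time


def burn_on_cpu(cpu, seconds):
    """Pin this process to one logical processor and spin for `seconds`."""
    os.sched_setaffinity(0, {cpu})   # pid 0 means the calling process
    deadline = time.monotonic() + seconds
    spins = 0
    while time.monotonic() < deadline:
        spins += 1                   # busy loop to keep the core loaded
    return spins
```

Starting one such process per logical processor in a suspected socket group (for example 0-5 and 12-17) and letting them run for a minute or so should load only one package, if the grouping is right.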