Misleading wording in latest "Application Note 485"

bcos
Hi,

In previous versions of "Intel Processor Identification and the CPUID Instruction, Application Note 485", CPUID function 0x00000004 EAX[31:26] is described as "number of cores on this die - 1", which made perfect sense.

In the latest version of "Intel Processor Identification and the CPUID Instruction, Application Note 485" (dated August 2009), CPUID function 0x00000004 EAX[31:26] is described as "number of APIC IDs reserved for the package - 1". This doesn't make any sense: it would make it impossible to determine how many logical CPUs there are per core, and it conflicts with the actual behaviour of existing CPUs.

I've tested several Intel CPUs and confirmed that the latest version of "Intel Processor Identification and the CPUID Instruction, Application Note 485" is misleading. The value returned for CPUID function 0x00000004 EAX[31:26] is the number of APIC IDs reserved for the first logical CPU within each core (and *not* the total number of APIC IDs reserved for all logical CPUs within the package).

For anyone interested, here are the results from some Intel CPUs:

Atom 330 (dual-core with hyper-threading):
CPUID function 0x00000004 EAX[31:26] = 1 (2 APIC IDs reserved for the first logical CPU within each core)
CPUID function 0x00000001 EBX[23:16] = 4 (4 APIC IDs reserved for all logical CPUs within the package)

Core2 Q6600 (quad-core without hyper-threading):
CPUID function 0x00000004 EAX[31:26] = 3 (4 APIC IDs reserved for the first logical CPU within each core)
CPUID function 0x00000001 EBX[23:16] = 4 (4 APIC IDs reserved for all logical CPUs within the package)

Xeon E5520 (quad-core with hyper-threading):
CPUID function 0x00000004 EAX[31:26] = 7 (8 APIC IDs reserved for the first logical CPU within each core)
CPUID function 0x00000001 EBX[23:16] = 16 (16 APIC IDs reserved for all logical CPUs within the package)

Note: On the Xeon E5520, 8 local APIC IDs are reserved for the first logical CPU within each core but only 4 of them are actually used, and 16 local APIC IDs are reserved for all logical CPUs within the package but only 8 of them are actually used. I think this difference (the idea of "reserved but unused APIC IDs") is what the recent changes to the documentation were meant to reflect.
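
For readers who want to check the arithmetic, here is a minimal Python sketch of how the two fields quoted above could be pulled out of raw register values and combined. The helper names (bits, decode_topology) are made up for this illustration, and the register values are built with only the relevant field set; a real CPUID result carries other fields in the remaining bits. The counts are reserved APIC IDs, not IDs actually in use, so the ratio is an upper bound on logical CPUs per core.

```python
def bits(value, hi, lo):
    """Return the unsigned bit field value[hi:lo], both ends inclusive."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_topology(leaf4_eax, leaf1_ebx):
    """Decode the two fields discussed in this thread.

    leaf4_eax: EAX returned by CPUID function 0x00000004
    leaf1_ebx: EBX returned by CPUID function 0x00000001
    """
    # EAX[31:26] is stored as "count - 1", so add one back.
    core_ids = bits(leaf4_eax, 31, 26) + 1
    # EBX[23:16] is stored as a plain count.
    package_ids = bits(leaf1_ebx, 23, 16)
    return {
        "core_ids_reserved": core_ids,
        "package_ids_reserved": package_ids,
        # Reserved APIC IDs per core: an upper bound, not a count of
        # logical CPUs that are actually present.
        "ids_per_core": package_ids // core_ids,
    }


# Field values taken from the results listed above; all other register
# bits are zeroed here purely for illustration.
SAMPLES = {
    "Atom 330": (1 << 26, 4 << 16),
    "Core2 Q6600": (3 << 26, 4 << 16),
    "Xeon E5520": (7 << 26, 16 << 16),
}

for name, (eax, ebx) in SAMPLES.items():
    print(name, decode_topology(eax, ebx))
```

With these inputs the Atom 330 and Xeon E5520 come out at 2 reserved IDs per core and the Q6600 at 1, which matches which of them have hyper-threading.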

Mostly I'm wondering if the wording used in "Application Note 485" can be changed to reflect actual behaviour...


Cheers,

Brendan
jimdempseyatthecove

Brendan,

Thanks for your post.

I was confused by this issue too (and may still be confused). Try running your test on a 6-core machine with HT. I think (guess) this will report 7 and 15 (8 and 16 APIC IDs consumed).

The code I developed for QuickThread migrates a thread's affinity from HW thread to HW thread, re-reading CPUID each time to build a table of affinity bit position to APIC ID and vice versa. Some of the "potential" APIC IDs then map to no affinity bit, i.e. they were consumed but not used.
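
To make the table idea concrete, here is a rough Python sketch of the bookkeeping only. It is not the QuickThread code: the function name build_apic_tables is hypothetical, and the input list stands in for the APIC IDs that would be collected by pinning a thread to each affinity bit in turn and reading CPUID there. The sample IDs are invented for illustration, not measured on real hardware.

```python
def build_apic_tables(apic_id_by_affinity_bit, ids_reserved):
    """Build both lookup directions and list reserved-but-unused APIC IDs.

    apic_id_by_affinity_bit: APIC ID observed at affinity bit 0, 1, 2, ...
    ids_reserved: number of APIC IDs reserved for the package
    """
    # Affinity bit position -> APIC ID.
    bit_to_apic = dict(enumerate(apic_id_by_affinity_bit))
    # APIC ID -> affinity bit position (the reverse table).
    apic_to_bit = {apic: bit for bit, apic in bit_to_apic.items()}
    # "Potential" APIC IDs that no affinity bit maps to.
    unused = [i for i in range(ids_reserved) if i not in apic_to_bit]
    return bit_to_apic, apic_to_bit, unused


# Invented sample: 4 logical CPUs occupying every other APIC ID out of
# 8 reserved.
bit_to_apic, apic_to_bit, unused = build_apic_tables([0, 2, 4, 6], 8)
print(bit_to_apic)
print(apic_to_bit)
print(unused)
```

Keeping both dictionaries means a lookup in either direction is a single step, and the unused list makes the "consumed but not used" IDs explicit.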

Jim Dempsey
bcos
Hi,

Thanks for your reply - nice to know I wasn't the only one confused... :)

Unfortunately I don't have a "6-core with HT" CPU here for testing. To be honest, I didn't realize they'd been released yet. As far as I know, 6-core "Dunnington" without HT has been released, and 6-core "Gulftown" with HT is expected in the first half of next year?

I assume that all Nehalem based CPUs do/will report a total of 16 APIC IDs reserved (with <= 16 used) and 8 APIC IDs reserved for the first logical CPU in each core (with <= 8 used); and that a Nehalem with hyper-threading disabled in the BIOS or disabled in the silicon reports the same as a Nehalem that's using hyper-threading.

I'm mostly updating old code that gathers all information about each CPU from various sources (CPUID, and table look-ups on older CPUs where information is missing from CPUID or CPUID isn't supported) and puts the information into a clean, consistent structure, so that later software doesn't need to deal with the horrendous mess that CPUID has become.

Note: To understand what I mean by "horrendous", try writing code that detects cache sizes and types (including how many other CPUs share the cache), that works for a wide variety of CPUs from a wide variety of CPU manufacturers, and takes into account all known CPU errata.


Cheers,

Brendan
