The software developer's manual volume 2 (325383-063US) describes two ways to query the L2 cache size, CPUID(4) and CPUID(80000006h), but it is not obvious from those descriptions why the two methods would return different sizes.
CPUID(80000006h) returns (eax, ebx, ecx, edx): 0 0 0x1006040 0. The documentation says that bits 31:16 of ecx give the L2 cache size in 1 KB units; here that field is 0x100, which translates to 256K.
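As a quick check of that reading, here is a minimal sketch that splits the ecx value into the fields the SDM describes for leaf 80000006h. The struct and function names (`ext_l2_info`, `decode_ext_l2`) are made up for illustration and are not part of any Intel API.

```c
#include <stdint.h>

/* Hypothetical helper type: fields of CPUID.80000006H:ECX. */
struct ext_l2_info {
    uint32_t size_kb;    /* ECX[31:16], cache size in 1 KB units */
    uint32_t assoc_code; /* ECX[15:12], an encoded value, not a raw way count */
    uint32_t line_bytes; /* ECX[7:0], cache line size in bytes */
};

/* Split a raw ECX value from leaf 80000006h into its fields. */
static struct ext_l2_info decode_ext_l2(uint32_t ecx)
{
    struct ext_l2_info info;
    info.size_kb = ecx >> 16;
    info.assoc_code = (ecx >> 12) & 0xf;
    info.line_bytes = ecx & 0xff;
    return info;
}
```

Calling `decode_ext_l2(0x1006040)` gives a size of 256 (KB), an associativity code of 6 (which the SDM table maps to 8 ways), and a 64-byte line.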
The L2 cache information from CPUID(4) returns (eax, ebx, ecx, edx): 0x7c004143 0x3c0003f 0x3ff 0. According to the section headed "INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level" (page 3-216, Vol. 2A), the size is computed as:
This Cache Size in Bytes
= (Ways + 1) * (Partitions + 1) * (Line_Size + 1) * (Sets + 1)
= (EBX[31:22] + 1) * (EBX[21:12] + 1) * (EBX[11:0] + 1) * (ECX + 1)
Given the ebx/ecx values above, I believe this comes to 16 * 1 * 64 * 1024 = 1048576 bytes (1 MiB).
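The same arithmetic can be written as a small sketch, assuming the field layout quoted from the manual above. The function names (`leaf4_cache_bytes`, `leaf4_cache_level`) are hypothetical, and the cache level is taken from EAX[7:5] as the SDM describes for leaf 4.

```c
#include <stdint.h>

/* Cache size in bytes from CPUID(4) EBX and ECX, per the formula above. */
static uint64_t leaf4_cache_bytes(uint32_t ebx, uint32_t ecx)
{
    uint64_t ways       = (ebx >> 22) + 1;           /* EBX[31:22] + 1 */
    uint64_t partitions = ((ebx >> 12) & 0x3ff) + 1; /* EBX[21:12] + 1 */
    uint64_t line_size  = (ebx & 0xfff) + 1;         /* EBX[11:0] + 1  */
    uint64_t sets       = (uint64_t)ecx + 1;         /* ECX + 1        */
    return ways * partitions * line_size * sets;
}

/* Cache level (1, 2, 3, ...) from CPUID(4) EAX[7:5]. */
static unsigned leaf4_cache_level(uint32_t eax)
{
    return (eax >> 5) & 0x7;
}
```

With ebx = 0x3c0003f and ecx = 0x3ff this yields 16 ways, 1 partition, 64-byte lines and 1024 sets, so 1048576 bytes; eax = 0x7c004143 decodes to level 2.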
/proc/cpuinfo (Linux) shows:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
stepping : 4
microcode : 0x200004d
cpu MHz : 1000.000
My question is: which method is correct? The 1MB L2 cache size computed from CPUID(4) appears to match all the public information for this processor model.
BTW, the discrepancy in L2 sizes only seems to happen with the Skylake Xeon processor that I have access to.
Thank you.
Here is my driver:
#include <stdio.h>
#include <stdint.h>

/* Execute CPUID for leaf `eax`, subleaf `ecx`; store eax..edx in res[0..3]. */
static void cpuid_ecx(uint32_t eax, uint32_t *res, uint32_t ecx)
{
    __asm__ volatile("cpuid"
                     : "=a"(res[0]), "=b"(res[1]), "=c"(res[2]), "=d"(res[3])
                     : "a"(eax), "c"(ecx));
}

int main(void)
{
    uint32_t r[4];
    uint32_t ecx = 0;
    uint32_t csz;

    cpuid_ecx(0x80000006, r, ecx);
    printf("%u %#x %#x %#x %#x\n", ecx, r[0], r[1], r[2], r[3]);
    printf("0x80000006 L2 size=%u\n", r[2] >> 16);

    while (1) {
        cpuid_ecx(0x4, r, ecx);
        printf("%u %#x %#x %#x %#x\n", ecx, r[0], r[1], r[2], r[3]);
        if ((r[0] & 0x1f) == 0) /* cache type 0: no more caches */
            break;
        csz = (r[1] >> 22) + 1;            /* ways, EBX[31:22] */
        csz *= ((r[1] >> 12) & 0x3ff) + 1; /* partitions, EBX[21:12] */
        csz *= (r[1] & 0xfff) + 1;         /* line size, EBX[11:0] */
        csz *= r[2] + 1;                   /* sets, ECX */
        printf("L%u size = %u\n", (r[0] >> 5) & 0x7, csz);
        ++ecx;
    }
    return 0;
}
Hi David,
Thank you for your query. We suggest posting it on the article below for peer-to-peer support:
https://software.intel.com/en-us/articles/intel-sdm#nine-volume
Regards,
Nikky Priya
Intel Developer Zone Support
Hi Nikky,
It has been three weeks since I followed your suggestion and submitted my inquiry to the other forum, and there has been no acknowledgment of the question. Is there another forum I can submit the question to, or can you follow up on the issue?
Thank you.
The 1 MiB size is correct -- I had not noticed it before, but my Xeon Platinum 8160 processors have the same incorrect information (for both associativity and size) in 0x80000006/ecx.
I should add that the most recent version of Todd Allen's "cpuid" program (http://www.etallen.com/cpuid.html) provides the same interpretations -- cpuid leaf 4 "Deterministic Cache Parameters" is 16-way associative with 1024 sets for 1 MiB, while cpuid leaf 0x80000006/ecx reports 8-way associativity and 256 KiB size.
