Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

SDM clarification needed - which tables (19-3, 19-4, 19-9, 19-10, 19-11) apply to 06_3C processors?


Good day -

I am trying to understand which tables in the SDM
  "Intel 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 1 ... 2 ... 3 & 4"
apply to processors with CPUID DisplayFamily_DisplayModel 06_3CH, but I cannot find
any text in the SDM that settles this.
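
For reference, the DisplayFamily_DisplayModel value is derived from the signature that CPUID leaf 01H returns in EAX. The following is a minimal Python sketch of that decoding, following the rule in the SDM's CPUID description. The function name `display_family_model` is made up for illustration, and the two signature values are only examples (0x000306C3 is assumed to be a Haswell client part, 0x00050654 a Skylake server part).

```python
def display_family_model(eax: int) -> str:
    """Decode a CPUID.01H:EAX signature into DisplayFamily_DisplayModel form."""
    model = (eax >> 4) & 0xF          # bits 7:4
    family = (eax >> 8) & 0xF         # bits 11:8
    ext_model = (eax >> 16) & 0xF     # bits 19:16
    ext_family = (eax >> 20) & 0xFF   # bits 27:20

    # The extended family is only added when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family

    # The extended model is only used when the base family is 06H or 0FH.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) + model
    else:
        display_model = model

    return f"{display_family:02X}_{display_model:02X}H"


print(display_family_model(0x000306C3))  # assumed Haswell client signature
print(display_family_model(0x00050654))  # assumed Skylake server signature
```

Linux reports the same two fields in decimal in /proc/cpuinfo, so "cpu family 6, model 60" corresponds to 06_3CH.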

 There is considerable ambiguity in the Chapter 19 text about which Performance Event
 tables apply to which processors. For example, for Table 19-3:
 " Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake
there is no text in the SDM stating which DisplayFamily_DisplayModel value(s) this table applies to.
The preamble to the table on page 3451 states only:
" The events in Table 19-4 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding
   with the following value: 06_55H .
Is this a typo, i.e. is this text meant to refer to "Table 19-3", not "Table 19-4"?
I am led to believe that it is, by the text on page 3471:

" Table 19-4. The events in Table 19-4 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding
   with the following values: 06_4EH and 06_5EH. ... The events in Table 19-4 apply to processors with CPUID signature of
   DisplayFamily_DisplayModel encoding with the following values: 06_8EH and 06_9EH.

Table 19-4. Performance Events of the Processor Core Supported by Skylake Microarchitecture and Kaby Lake Microarchitecture
There is no mention of 06_55H CPUs here, so is it a typo, and is "Table 19-3" only for 06_55H CPUs? Or not?

I can find no text in the SDM that explicitly states which CPU DisplayFamily_DisplayModel pairs
Table 19-3 applies to - I think this is a major omission.

There is only one mention in the text of which performance event tables CPUs with DisplayFamily_DisplayModel == 06_3CH
support, on page 3504:
 " Processors with CPUID signature of DisplayFamily_DisplayModel 06_3CH and 06_45H support performance events listed in
   Table 19-11.

 I assume 06_3CH CPUs are Haswell, so only these tables should apply:

"  Table 19-9. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors
   Table 19-10. Intel TSX Performance Events in Processors Based on Haswell Microarchitecture
   Table 19-11. Uncore Performance Events in the 4th Generation Intel® Core™ Processors

Yet one processor I'm testing with, an i7-4770, which has CPUID DisplayFamily_DisplayModel == 06_3CH (Haswell), seems to
support these events, which are listed ONLY in Table 19-3 (Skylake):
     Event | Umask | Mnemonic
     00H      01H      INST_RETIRED.ANY
     00H      02H      CPU_CLK_UNHALTED.THREAD
     00H      03H      CPU_CLK_UNHALTED.REF_TSC

Indeed, Linux maps the '/sys/bus/event_source/devices/cpu/events/ref-cycles' event to
the last one shown above.

So am I to assume that all events in Table 19-3 are also supported on this CPU,
since it supports the first three?

There are many duplicates and differences between Table 19-3 and Table 19-9.
Which events in which table are supported by 06_3CH CPUs?
The manual does not offer much help in determining this.

It would be most helpful if Intel could include a definitive mapping between
CPUID DisplayFamily_DisplayModel values and supported Performance Event
table identifiers in the next edition of the SDM.

In the absence of this, could anyone please suggest a source of information about
precisely which events are meant to be supported by the 06_3CH CPUs?
Are they Skylake or Haswell?

Thanks in advance for any help or replies. Best Regards,
Jason







Or, in short:
   I can't understand why my Haswell CPU with CPUID DisplayFamily_DisplayModel 06_3CH
   (whose PMU events should be documented in Tables 19-9, 19-10 and 19-11?)
   supports the Skylake Table 19-3 event:
         Event #  |  Umask #  | Mnemonic
               00H      03H           CPU_CLK_UNHALTED.REF_TSC

     This event is documented nowhere but in the Skylake table. Is my "i7-4770" CPU some kind of
     Haswell + Skylake hybrid?

     It silently accepts registration for Skylake events:
              00H      01H           INST_RETIRED.ANY
              00H      02H           CPU_CLK_UNHALTED.THREAD
    but always returns '0' for them.

$  egrep 'cpu family|model|name' /proc/cpuinfo | head -n 3
cpu family    : 6
model        : 60
model name    : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
$ perf stat -e r0300 sleep 1
Performance counter stats for 'sleep 1':
         1,813,186      r0300
       1.000883663 seconds time elapsed
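
The raw event code passed to perf above follows, as far as I understand the perf convention on x86, the layout of the low bits of IA32_PERFEVTSELx: event select in bits 7:0 and unit mask in bits 15:8. The small Python sketch below shows how the Event and Umask columns of the SDM tables map to an `rNNNN` string. The helper name `perf_raw_event` is hypothetical and is not part of perf.

```python
def perf_raw_event(event: int, umask: int) -> str:
    """Build a perf raw event string from an SDM Event/Umask pair."""
    if not (0 <= event <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event and umask must each fit in 8 bits")
    # Unit mask goes in bits 15:8, event select in bits 7:0.
    return f"r{(umask << 8) | event:04x}"


# Event 00H, Umask 03H is the CPU_CLK_UNHALTED.REF_TSC row quoted earlier.
print(perf_raw_event(0x00, 0x03))
```

Something like `perf stat -e "$(python3 -c ...)" sleep 1` could then be used to try other Event/Umask pairs, though whether a given pair counts anything depends on the CPU.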

So this event, documented ONLY in the Skylake table, is supported on Haswell. Why?

Is there a better source of mappings between CPUID DisplayFamily_DisplayModel and precise PMU events supported?





Black Belt

Those events are the "fixed-function" counters, which are normally accessed either using the RDMSR instruction (as described in Table 19-2 of Volume 3 of the Intel Architectures Software Developer's Manual) OR by using the RDPMC instruction with special counter values of (1<<30)+0, (1<<30)+1, and (1<<30)+2.   This latter approach is described in the entry for the RDPMC instruction in Volume 2 of the Intel Architectures Software Developer's Manual.
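
As a rough illustration of the RDPMC counter numbers, here is a minimal Python sketch that only computes the value to load into ECX; it does not execute RDPMC. The names `FIXED_COUNTER_FLAG`, `FIXED_EVENTS` and `rdpmc_fixed_index` are made up for this example.

```python
FIXED_COUNTER_FLAG = 1 << 30  # bit 30 of ECX selects the fixed-function counters

# Fixed-function counter number -> event it counts.
FIXED_EVENTS = {
    0: "INST_RETIRED.ANY",
    1: "CPU_CLK_UNHALTED.THREAD",
    2: "CPU_CLK_UNHALTED.REF_TSC",
}


def rdpmc_fixed_index(n: int) -> int:
    """Return the ECX value that selects fixed-function counter n for RDPMC."""
    if n not in FIXED_EVENTS:
        raise ValueError("only fixed-function counters 0, 1 and 2 are covered here")
    return FIXED_COUNTER_FLAG + n


for n, name in FIXED_EVENTS.items():
    print(f"{name}: ECX = {rdpmc_fixed_index(n):#010x}")
```

Note that the parentheses matter: in both C and Python `+` binds more tightly than `<<`, so an unparenthesized `1 << 30 + 1` evaluates to `1 << 31` rather than to fixed-function counter 1.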

I admit that I am a bit puzzled by the wording in Table 19-3.   While it is clear that the fixed-function events described above are independent of the programmable counters, Table 19-3 suggests using a programmable counter for the same events -- which clearly cannot leave all of the other programmable counters available.   I suggest ignoring the Event 0x00 encoding of the first three entries in Table 19-3 and using the special performance counter number encoding described above (i.e., counter numbers (1<<30) to (1<<30)+2).  If you want to use the programmable counters for these events, use the encodings from Table 19-1, which provides the same event coverage.  For the fixed-function counters, the AnyThread bits are in the IA32_FIXED_CTR_CTRL MSR, instead of in the IA32_PERFEVTSEL* MSRs.
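
To make the IA32_FIXED_CTR_CTRL remark concrete, the sketch below builds a control value in Python. It assumes the layout I recall from the SDM for recent architectural PMU versions: one 4-bit field per fixed-function counter, with bit 0 enabling ring 0 counting, bit 1 enabling counting at rings above 0, bit 2 being AnyThread and bit 3 enabling the overflow interrupt. Please check this against the SDM for your CPU. The function and constant names are hypothetical, and the code only computes a number; it does not write any MSR.

```python
EN_OS = 0x1       # count while in ring 0
EN_USR = 0x2      # count while in rings above 0
ANY_THREAD = 0x4  # AnyThread bit for this fixed-function counter
EN_PMI = 0x8      # raise an interrupt on overflow


def fixed_ctr_ctrl(counter: int, os: bool = True, usr: bool = True,
                   any_thread: bool = False, pmi: bool = False) -> int:
    """Return the IA32_FIXED_CTR_CTRL bits for one fixed-function counter."""
    if counter not in (0, 1, 2):
        raise ValueError("only fixed-function counters 0, 1 and 2 are covered here")
    field = ((EN_OS if os else 0) | (EN_USR if usr else 0)
             | (ANY_THREAD if any_thread else 0) | (EN_PMI if pmi else 0))
    return field << (4 * counter)  # each counter owns a 4-bit field


# Enable all three counters for both OS and user mode, no AnyThread, no PMI.
value = fixed_ctr_ctrl(0) | fixed_ctr_ctrl(1) | fixed_ctr_ctrl(2)
print(hex(value))
```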

Note that the "reference cycles not halted" event means something slightly different in the fixed-function and programmable events.  The programmable event increments once for each "tick" of the reference clock (while the processor is not halted), but the fixed-function counter increments by P for each "tick" of the reference clock (while the processor is not halted), where "P" is the same multiplier ratio used by the TSC.
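
A small worked example of that difference, as a Python sketch. The numbers are assumptions for illustration only: a 100 MHz reference clock and P = 34, which would be consistent with a 3.40 GHz nominal frequency such as the i7-4770's. The function name `ref_cycle_counts` is made up.

```python
def ref_cycle_counts(unhalted_ref_ticks: int, tsc_ratio: int) -> tuple[int, int]:
    """Return (programmable_count, fixed_function_count) for the same interval."""
    programmable = unhalted_ref_ticks        # +1 per reference clock tick
    fixed = unhalted_ref_ticks * tsc_ratio   # +P per reference clock tick
    return programmable, fixed


ASSUMED_REF_HZ = 100_000_000  # assumption: 100 MHz reference clock
ASSUMED_P = 34                # assumption: 3.40 GHz nominal / 100 MHz

# One full second spent unhalted.
prog, fixed = ref_cycle_counts(ASSUMED_REF_HZ, ASSUMED_P)
print(prog, fixed)

# The r0300 count reported earlier in the thread, checked against the assumed P.
print(1_813_186 % ASSUMED_P, 1_813_186 // ASSUMED_P)
```

Under these assumptions the fixed-function count after one unhalted second matches the nominal cycle count (3.4e9), while the programmable count is 1e8. The 1,813,186 reported by perf above is an exact multiple of 34, which is at least consistent with a counter that advances in steps of P.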
