Software Tuning, Performance Optimization & Platform Monitoring
Discussion around monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform monitoring

SDM clarification needed - which tables (19-3 , 19-4, 19-9, 19-10, 19-11) apply to 06_3C processors?


Good day -

I am trying to understand which tables in the SDM:
  "Intel 64 and IA32 Architectures Software Developer's Manual, Combined Volumes 1 ... 2 ... 3 & 4" :
  @ :
apply to processors with CPUID DisplayFamily_DisplayModel 06_3CH, but I cannot find any
text in the SDM that settles this.

 There is considerable ambiguity in the Chapter 19 text about which Performance Event
 tables apply to which processors. For example, for Table 19-3:
 " Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake
there is no text in the SDM about which processor DisplayFamily_DisplayModel value(s) this table applies to.
In the preamble to the table on page 3451 it only states:
" The events in Table 19-4 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding
   with the following value: 06_55H .
Is this a typo, i.e. is this text meant to refer to "Table 19-3", not "Table 19-4"?
I am led to believe that it is by the text on page 3471:

" Table 19-4. The events in Table 19-4 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding
   with the following values: 06_4EH and 06_5EH. ... The events in Table 19-4 apply to processors with CPUID signature of
   DisplayFamily_DisplayModel encoding with the following values: 06_8EH and 06_9EH.

Table 19-4. Performance Events of the Processor Core Supported by Skylake Microarchitecture and Kaby Lake Microarchitecture
There is no mention of 06_55H CPUs here, so is it a typo, and is "Table 19-3" only for 06_55H CPUs? Or not?

I can find no text in the SDM that explicitly states which CPU DisplayFamily_DisplayModel pairs
Table 19-3 applies to - I think this is a major omission.

There is only one mention in the text of which performance event tables CPUs with DisplayFamily_DisplayModel == 06_3CH
support, on page 3504:
 " Processors with CPUID signature of DisplayFamily_DisplayModel 06_3CH and 06_45H support performance events listed in
   Table 19-11.

 I assume 06_3CH CPUs are Haswell, so only these tables should apply:

"  Table 19-9. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors
   Table 19-10. Intel TSX Performance Events in Processors Based on Haswell Microarchitecture
   Table 19-11. Uncore Performance Events in the 4th Generation Intel® Core™ Processors

Yet one processor I'm testing with, an i7-4770,  which has CPUID DisplayFamily_DisplayModel == 06_3CH (Haswell) seems to
support these events which are ONLY listed in Table 19-3, (Skylake) :
     Event | Umask | Mnemonic
     00H      01H      INST_RETIRED.ANY
     00H      02H      CPU_CLK_UNHALTED.THREAD
     00H      03H      CPU_CLK_UNHALTED.REF_TSC

Indeed, Linux maps the '/sys/bus/event_source/devices/cpu/events/ref-cycles' event to
the last one shown above.
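
To make that mapping concrete, here is a minimal sketch of how a sysfs event descriptor can be turned into a perf raw event code. The descriptor string "event=0x00,umask=0x03" is an assumption about what the ref-cycles file contains (it matches the Event/Umask pair listed above), and the helper names parse_sysfs_event and to_perf_raw are hypothetical, not part of any perf API.

```python
def parse_sysfs_event(descriptor: str) -> dict:
    """Parse a descriptor such as 'event=0x00,umask=0x03' into integer fields."""
    fields = {}
    for term in descriptor.strip().split(","):
        key, _, value = term.partition("=")
        # A bare term with no '=' (for example 'edge') is treated as a flag set to 1.
        fields[key] = int(value, 0) if value else 1
    return fields


def to_perf_raw(fields: dict) -> str:
    """Build a perf raw code: umask in bits 15:8, event select in bits 7:0."""
    code = (fields.get("umask", 0) << 8) | fields.get("event", 0)
    return f"r{code:04x}"


# Assumed content of the ref-cycles sysfs file (hypothetical input for illustration).
ref_cycles = parse_sysfs_event("event=0x00,umask=0x03")
print(to_perf_raw(ref_cycles))
```

With the assumed descriptor this yields the raw code for Event 00H, Umask 03H, i.e. the REF_TSC entry in the list above.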

So am I to assume that all events in Table 19-3 also are supported on this CPU ,
since it supports the first three ?

There are many duplicates and differences between Table 19-3 and Table 19-9 -
which events in which table are supported by  06_3CH CPUs ?
The manual does not offer much help in determining this.

It would be most helpful in future if Intel could include a definitive mapping between
CPUID DisplayFamily_DisplayModel values and supported Performance Event
table identifiers in the next edition of the SDM .

In the absence of this , could anyone please suggest a source of information about
precisely which events are meant to be supported by the 06_3CH  CPUs  ?
Are they Skylake or Haswell ?

Thanks in advance for any help / replies , Best Regards,
Jason






2 Replies

Or, in short :
   I can't understand why my Haswell CPU with CPUID DisplayFamily_DisplayModel 06_3CH 
   ( whose PMU events should be documented in Tables 19-9 , 19-10, 19-11 ? ) 
   supports the Skylake Table 19-3 Event:
         Event #  |  Umask #  | Mnemonic
               00H      03H           CPU_CLK_UNHALTED.REF_TSC

     This event is documented nowhere else but in the Skylake Table. Is my "i7-4770" CPU some kind of
     Haswell + Skylake hybrid ?

     It silently accepts registration for Skylake events:
              00H      01H           INST_RETIRED.ANY
              00H      02H           CPU_CLK_UNHALTED.THREAD
    but always returns '0' for them.

$  egrep 'cpu family|model|name' /proc/cpuinfo | head -n 3
cpu family    : 6
model        : 60
model name    : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
$ perf stat -e r0300 sleep 1
Performance counter stats for 'sleep 1':
         1,813,186      r0300
       1.000883663 seconds time elapsed

So this event, documented ONLY in the Skylake table, is supported on Haswell. Why?

Is there a better source of mappings between CPUID DisplayFamily_DisplayModel and precise PMU events supported?
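
For reference, the DisplayFamily_DisplayModel value can be derived from the CPUID leaf 1 EAX signature using the rule given in the SDM's CPUID description. The sketch below is illustrative only: the function name is hypothetical, and the signature 0x000306C3 is an assumed example value for a Haswell desktop part, not something read from the machine above.

```python
def display_family_model(eax: int) -> str:
    """Format the CPUID.01H:EAX signature as DisplayFamily_DisplayModel (e.g. '06_3CH')."""
    family = (eax >> 8) & 0xF        # bits 11:8
    model = (eax >> 4) & 0xF         # bits 7:4
    ext_model = (eax >> 16) & 0xF    # bits 19:16
    ext_family = (eax >> 20) & 0xFF  # bits 27:20

    # The extended family is added only when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family
    # The extended model is prepended only for families 06H and 0FH.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) + model
    else:
        display_model = model
    return f"{display_family:02X}_{display_model:02X}H"


# Assumed example signature (hypothetical input for illustration).
print(display_family_model(0x000306C3))

# /proc/cpuinfo already reports the combined values, so family 6, model 60
# can be formatted directly:
print(f"{6:02X}_{60:02X}H")
```

Both lines print the same 06_3CH signature, which is why "model : 60" in /proc/cpuinfo corresponds to the 06_3CH wording used in the SDM.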





Black Belt

Those events are the "fixed-function" counters, which are normally accessed either using the RDMSR instruction (as described in Table 19-2 of Volume 3 of the Intel Architectures Software Developer's Manual) OR by using the RDPMC instruction with special counter values of (1<<30)+0, (1<<30)+1, and (1<<30)+2.   This latter approach is described in the entry for the RDPMC instruction in Volume 2 of the Intel Architectures Software Developer's Manual.  
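
As a small illustration of the counter-number encoding, the sketch below computes the ECX value that would be passed to RDPMC for each of the three fixed-function counters discussed in this thread. It only does the arithmetic and does not execute RDPMC; the helper name rdpmc_fixed_index is hypothetical. It also shows why the parentheses matter: in Python, as in C, addition binds tighter than the shift operator.

```python
FIXED_COUNTER_FLAG = 1 << 30  # bit 30 of ECX selects the fixed-function counter set


def rdpmc_fixed_index(n: int) -> int:
    """Return the RDPMC counter number (ECX value) for fixed-function counter n."""
    if n not in (0, 1, 2):
        raise ValueError("this sketch covers only fixed-function counters 0, 1 and 2")
    return FIXED_COUNTER_FLAG + n


names = [
    "INST_RETIRED.ANY",
    "CPU_CLK_UNHALTED.THREAD",
    "CPU_CLK_UNHALTED.REF_TSC",
]
for n, name in enumerate(names):
    print(f"{name}: ECX = {rdpmc_fixed_index(n):#010x}")

# Precedence pitfall: without parentheses, 1 << 30 + 1 is parsed as 1 << (30 + 1).
print((1 << 30 + 1) == (1 << 31))   # True
print(((1 << 30) + 1) == 0x40000001)  # True
```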

I admit that I am a bit puzzled by the wording in Table 19-3.   While it is clear that the fixed-function events described above are independent of the programmable counters, Table 19-3 suggests using a programmable counter for the same events -- which clearly cannot leave all of the other programmable counters available.   I suggest ignoring the Event 0x00 encoding of the first three entries in Table 19-3 and using the special performance counter number encoding described above (i.e., counter numbers (1<<30) to (1<<30)+2).  If you want to use the programmable counters for these events, use the encodings from Table 19-1, which provides the same event coverage.  For the fixed-function counters, the AnyThread bits are in the IA32_FIXED_CTR_CTRL MSR, instead of in the IA32_PERFEVTSEL* MSRs.

Note that the "reference cycles not halted" event means something slightly different in the fixed-function and programmable events.  The programmable event increments once for each "tick" of the reference clock (while the processor is not halted), but the fixed-function counter increments by P for each "tick" of the reference clock (while the processor is not halted), where "P" is the same multiplier ratio used by the TSC.
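
A rough worked example of that scaling, under stated assumptions: P = 34 is assumed here from the i7-4770's nominal 3.40 GHz and a 100 MHz reference clock, and the r0300 count from the perf output earlier in the thread is assumed to have been collected on the fixed-function counter. Neither assumption is confirmed in the thread, and the function names are hypothetical.

```python
def fixed_to_reference_ticks(fixed_count: int, p: int) -> tuple[int, int]:
    """Split a fixed-function REF_TSC count into (whole reference ticks, leftover units)."""
    return divmod(fixed_count, p)


def reference_ticks_to_fixed(ticks: int, p: int) -> int:
    """Scale a programmable-counter reference tick count to fixed-function units."""
    return ticks * p


P = 34                  # assumed TSC multiplier: 3.40 GHz / 100 MHz (not confirmed)
perf_count = 1_813_186  # r0300 count reported by 'perf stat' earlier in the thread

ticks, leftover = fixed_to_reference_ticks(perf_count, P)
print(ticks, leftover)
print(reference_ticks_to_fixed(ticks, P) == perf_count)
```

Under these assumptions the reported count divides evenly by 34 (53329 ticks, remainder 0), which is at least consistent with a counter that advances by P per reference tick.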