Event for speculatively executed instructions

marina_m_

Hi All,

We need to measure, on Intel machines, the instructions that are executed speculatively but not committed. We need to count how many instructions are discarded over a period of time, to see how well speculative execution is working. We checked Intel's manual of Performance Monitoring Events but can't figure out which event to monitor. Could you please tell us where to look, or what name the event goes by?

The processors are:
Intel Xeon CPU E5-2630 0 @ 2.30GHz
Intel Core2 Quad CPU    Q6600  @ 2.40GHz
Intel Core i5 CPU 750  @ 2.67GHz

Thanks so much in advance.
Regards,
Marina

McCalpinJohn

Unfortunately there are multiple types of "speculation" used in recent Intel processors. You may not be interested in all of them, and you may not be able to separate them using the available mechanisms.

Traditionally, "speculative execution" has referred to executing instructions that are beyond (in program order) a conditional branch.  An out-of-order processor may execute these instructions (along the predicted direction of the branch) before the actual direction of the branch is resolved.  If the branch prediction was incorrect, then the results of the speculatively executed instructions are discarded.
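
As a rough sketch of the arithmetic behind "how many instructions are discarded": if you can obtain a count of instructions (or uops) that were executed, speculative ones included, and a count of those that retired over the same interval, the difference is an estimate of the discarded work. The function name and the sample numbers below are made up for illustration, and which hardware events can supply the two counts depends on the processor model.

```python
def discarded_work(executed, retired):
    """Estimate speculative work that was thrown away.

    executed: instructions (or uops) executed, speculative ones included
    retired:  instructions (or uops) that retired in the same interval
    Returns (discarded_count, discarded_fraction_of_executed).
    """
    if executed < 0 or retired < 0:
        raise ValueError("counts must be non-negative")
    # Counters are not read at exactly the same instant, so retired can
    # slightly exceed executed on short intervals; clamp at zero.
    discarded = max(executed - retired, 0)
    fraction = discarded / executed if executed else 0.0
    return discarded, fraction


# Hypothetical counter readings, not measurements.
count, frac = discarded_work(executed=1_250_000, retired=1_000_000)
print(count, frac)
```

Note that the difference lumps together every reason an executed operation failed to retire, so it is an upper bound on the work lost to mispredicted branches rather than a direct measurement of it.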

Another form of "speculation" is used in some high-frequency processors to avoid the need to "stall" and "resume" instructions that have data dependencies.  It is clear from performance counter measurements that (as one example) floating-point instructions on recent Intel processors are "executed" speculatively, then "rejected" and "retried" if their input arguments are not ready when the instruction attempts to access them.  Although I have not done extensive testing, it is clear that this cycle of execute/cancel/retry also applies to instruction dispatch to the execution ports, as measured by the various options to performance counter event 0xA1 UOPS_DISPATCHED_PORT.PORT_*.  (These events are described in Chapter 19 of Volume 3 of the Intel Architectures Software Developer's Manual.)  This is a frustrating issue to deal with because Intel does not describe such a "reject/retry" mechanism in the documentation for any recent processors.  (I think I saw a reference to it for Pentium 4 or earlier processors, but nothing more recent.)
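
For anyone who wants to program event 0xA1 with a tool that accepts raw event codes, here is a minimal sketch of the encoding. It packs the event number into bits 0-7 and the unit mask into bits 8-15 of the event-select value, then formats the result in the rUUEE style accepted by Linux perf. The helper names are hypothetical, and the unit mask shown is only a placeholder: the real value for each PORT_* sub-event has to be taken from the manual's table for your processor.

```python
def raw_event(event, umask):
    """Pack an event number and unit mask into the low 16 bits
    of an event-select value (event in bits 0-7, umask in bits 8-15)."""
    if not (0 <= event <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event and umask must each fit in 8 bits")
    return (umask << 8) | event


def perf_raw_name(event, umask):
    """Format the pair as a Linux perf raw event string."""
    return "r{:04x}".format(raw_event(event, umask))


EVENT_UOPS_DISPATCHED_PORT = 0xA1  # event number quoted in the text above
PLACEHOLDER_UMASK = 0x01           # assumption: look up the real umask per port

print(perf_raw_name(EVENT_UOPS_DISPATCHED_PORT, PLACEHOLDER_UMASK))
```

The printed string can be passed to `perf stat -e` on Linux; other tools take the event number and unit mask as separate fields.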

For the Intel processors that used banked L1 Data Caches (for example, the Xeon E5-2630 "Sandy Bridge EP") it appears that the hardware will also dispatch/cancel/retry loads that have bank conflicts with other loads in the L1 Data Cache.  (See the description of performance counter event 0xBF in Table 19-9 of Volume 3 of the Intel Architectures Software Developer's Manual.)
