Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

difference between STALLS_L1D_PENDING and CYCLES_L1D_PENDING

Aditya_D_
Beginner
1,512 Views

Hi,

What is the difference between the CYCLE_ACTIVITY:CYCLES_L1D_PENDING and CYCLE_ACTIVITY:STALLS_L1D_PENDING events on Ivy Bridge processors? Is it that STALLS_ indicates the number of times execution stalled, while CYCLES_ indicates the total duration of the stalls in cycles?

0 Kudos
1 Reply
McCalpinJohn
Honored Contributor III
1,512 Views

CYCLE_ACTIVITY.CYCLES_L1D_PENDING increments in each processor cycle if there is at least one L1 Data Cache load miss outstanding.

CYCLE_ACTIVITY.STALLS_L1D_PENDING increments in each processor cycle if there is *both* at least one L1 Data Cache load miss outstanding *AND* no uops are dispatched to the execution ports.

The latter event is intended to help identify cases in which cache misses are the cause of the stall.  

This should be understood as an *indication*, not as proof that the cache miss(es) actually caused the stall.  (In an out-of-order processor there are too many ambiguous cases -- for example, how do you assign "blame" when the processor is stalled for multiple reasons in the same cycle?)
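To make the two definitions concrete, here is a minimal sketch in Python of a toy per-cycle model. It is not a description of the hardware: the `Cycle` record, its field names, and the sample trace are all made up for illustration. It only shows that both events count cycles (neither counts "number of stalls"), and that the STALLS_ event adds the "no uops dispatched" condition.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    # Hypothetical per-cycle state, invented for this illustration.
    outstanding_l1d_misses: int  # L1D load misses in flight during this cycle
    uops_dispatched: int         # uops sent to the execution ports this cycle


def count_events(trace):
    """Return (CYCLES_L1D_PENDING, STALLS_L1D_PENDING) for a toy trace."""
    cycles_l1d_pending = 0
    stalls_l1d_pending = 0
    for c in trace:
        if c.outstanding_l1d_misses >= 1:
            # At least one L1D load miss is outstanding in this cycle.
            cycles_l1d_pending += 1
            if c.uops_dispatched == 0:
                # ...AND nothing was dispatched to the execution ports.
                stalls_l1d_pending += 1
    return cycles_l1d_pending, stalls_l1d_pending


trace = [
    Cycle(0, 4),  # no miss pending: counted by neither event
    Cycle(1, 2),  # miss pending, still dispatching: CYCLES_ only
    Cycle(2, 0),  # miss pending, nothing dispatched: both events
    Cycle(1, 0),  # miss pending, nothing dispatched: both events
    Cycle(0, 0),  # idle for some other reason: counted by neither event
]
print(count_events(trace))  # (3, 2)
```

In this model the STALLS_ count can never exceed the CYCLES_ count, and the last cycle shows why the event is only an indication: a cycle with no dispatch and no pending miss is a stall that this event says nothing about.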

I don't think that I have tested this carefully yet, but I expect the STALLS_L1D_PENDING event to systematically under-count.  The problem is that the processor does not know whether the data is in the cache, so it can "dispatch" memory load uops to the execution port(s) multiple times.  If the data is not in the L1 Data Cache, the uop is rejected and retried later.  The counters cannot distinguish between uops that are dispatched and complete vs. uops that are dispatched and rejected, so this event will *not* count cycles in the latter category as stall cycles -- even though most of us would consider that to be a stall cycle for practical purposes.
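The expected under-count can be sketched with the same kind of toy model, again in Python. This is only an illustration of the reasoning above under assumed numbers, not a measurement: the tuple layout and the split between "useful" and "rejected" dispatches are hypothetical, and real hardware does not expose that split to the counter.

```python
def stall_counts(trace):
    """Return (counted, practical) stall cycles for a toy trace.

    Each trace entry is a hypothetical tuple:
      (outstanding_l1d_misses, useful_uops, rejected_load_uops)
    """
    counted = 0    # what a STALLS_L1D_PENDING-style counter would see
    practical = 0  # what most people would call a stall cycle
    for misses, useful, rejected in trace:
        if misses < 1:
            continue  # no L1D load miss outstanding in this cycle
        # The counter only sees total dispatches, useful or rejected.
        if useful + rejected == 0:
            counted += 1
        # For practical purposes, a cycle with no useful work is a stall.
        if useful == 0:
            practical += 1
    return counted, practical


trace = [
    (1, 0, 0),  # miss pending, nothing dispatched: counted as a stall
    (1, 0, 1),  # a load is dispatched but rejected: NOT counted
    (1, 0, 1),  # the retry is rejected again: NOT counted
    (1, 2, 0),  # useful work proceeds under the miss: not a stall
    (0, 0, 0),  # no miss pending: outside the scope of this event
]
print(stall_counts(trace))  # (1, 3)
```

Here the counter reports one stall cycle while three cycles did no useful work, which is the direction of the error described above. How large the gap is on real hardware depends on how often rejected loads are retried, which this sketch does not attempt to model.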
