Number of walk cycles more than number of execution cycles

Hi All,

I am measuring the number of page walk cycles of an application on an Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz machine. However, the number of walk cycles reported is greater than the total number of cycles.

54354529590       dtlb_store_misses.walk_pending:u
427005133679    dtlb_load_misses.walk_pending:u
51905087642      dtlb_store_misses.walk_active:u
249519683387    dtlb_load_misses.walk_active:u
283877210858    cycles:u

Am I using the wrong counters to measure walk cycles?

Or do these walk cycles also include walks triggered by the prefetcher? In that case, how do I measure only the demand walk cycles?

Any hint would be highly appreciated.

Thanks in advance!



3 Replies
Black Belt

Starting with the SKL processor, there are two Page Table Walkers per core (Intel Optimization Reference Manual section 2.3.3, document 248966-043), and it looks like you are seeing both of them in use in most cycles -- averaging 1.5 load miss walks pending plus 0.2 store miss walks pending per cycle over the full execution time.

I don't think I have tested this on SKX, but in the past these performance counter events only counted activity due to demand references -- not activity due to the next-page-prefetcher.  

Based on the definitions of these events in Table 19-6 of the Intel SWDM Volume 3 (document 325384-073), the DTLB_LOAD_MISSES.WALK_ACTIVE event counts cycles in which at least one Page Miss Handler (PMH) is active, while DTLB_LOAD_MISSES.WALK_PENDING increments by the number of PMHs that are active in each cycle. Because WALK_PENDING can increment by 2 in a single cycle, its total can exceed the cycle count. Your results show:

  • In cycles with at least one PMH handling a load miss, there were an average of 1.71 PMHs active handling loads. (load.walk_pending/load.walk_active)
  • In cycles with at least one PMH handling a store miss, there were an average of 1.05 PMHs active handling stores.  (store.walk_pending/store.walk_active)


  • 88% of cycles had at least one PMH handling a load (load.walk_active/cycles)
  • 18% of cycles had at least one PMH handling a store (store.walk_active/cycles)
  • The combination of these two implies that at least about 6% of the cycles (88% + 18% - 100%) had to have one PMH busy handling a load and one PMH busy handling a store
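The ratios quoted in the bullets above can be reproduced directly from the five raw counts in the original post. The following is only a minimal sketch: the function name `walk_metrics` and the dictionary keys are made up for illustration, and the overlap figure assumes every cycle is either a load-walk cycle, a store-walk cycle, both, or idle.

```python
# Raw user-mode counts copied from the original post.
COUNTS = {
    "dtlb_store_misses.walk_pending": 54354529590,
    "dtlb_load_misses.walk_pending": 427005133679,
    "dtlb_store_misses.walk_active": 51905087642,
    "dtlb_load_misses.walk_active": 249519683387,
    "cycles": 283877210858,
}


def walk_metrics(c):
    """Derive per-cycle page-walk ratios from the five raw counts.

    The key names in the returned dict are hypothetical labels,
    not names used by perf or by Intel documentation.
    """
    cycles = c["cycles"]
    ld_pending = c["dtlb_load_misses.walk_pending"]
    st_pending = c["dtlb_store_misses.walk_pending"]
    ld_active = c["dtlb_load_misses.walk_active"]
    st_active = c["dtlb_store_misses.walk_active"]
    return {
        # Average number of walks in flight per cycle, over the whole run.
        "ld_pending_per_cycle": ld_pending / cycles,
        "st_pending_per_cycle": st_pending / cycles,
        # Average number of busy walkers in cycles with at least one walk.
        "ld_walkers_when_active": ld_pending / ld_active,
        "st_walkers_when_active": st_pending / st_active,
        # Fraction of all cycles with at least one walk of each type.
        "ld_active_fraction": ld_active / cycles,
        "st_active_fraction": st_active / cycles,
        # Inclusion-exclusion lower bound on cycles with both a load
        # walk and a store walk in progress.
        "min_ld_st_overlap": max(0.0, (ld_active + st_active) / cycles - 1.0),
    }


if __name__ == "__main__":
    for name, value in walk_metrics(COUNTS).items():
        print(f"{name:24s} {value:.3f}")
```

A pending-to-cycles ratio above 1.0 is therefore not a measurement error: it simply means that, on average, more than one walker was busy per cycle.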


Black Belt

With a little more magic middle-school algebra, I think I derived bounds on the breakdown of activity by cycle.   There are six possible categories of activity and only five data items, so bounds are the most one can hope for....

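Percentage of cycles in each category, for the minimum and the maximum possible overlap of LD and ST TLB misses (the LD/LD and ST/ST rows are fixed by the counts, so they are the same in both cases):

PMH0 activity   PMH1 activity   Minimum overlap   Maximum overlap
LD              LD              62.5%             62.5%
ST              ST              0.9%              0.9%
LD              ST              6.2%              17.4%
LD              (idle)          19.2%             8.0%
(idle)          ST              11.2%             0.0%
(idle)          (idle)          0.0%              11.2%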
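The bounds in the table can be reconstructed from the raw counts. This is a sketch of one way to do the algebra, not necessarily the exact derivation used above. It assumes exactly two PMHs per core, so that WALK_PENDING minus WALK_ACTIVE for one miss type equals the number of cycles in which both walkers were busy with that type. The function name `overlap_bounds` and the category labels are hypothetical.

```python
# Raw user-mode counts copied from the original post.
COUNTS = {
    "dtlb_store_misses.walk_pending": 54354529590,
    "dtlb_load_misses.walk_pending": 427005133679,
    "dtlb_store_misses.walk_active": 51905087642,
    "dtlb_load_misses.walk_active": 249519683387,
    "cycles": 283877210858,
}


def overlap_bounds(c):
    """Return cycle-fraction breakdowns for minimum and maximum LD/ST overlap.

    Assumes two page walkers per core, so each pending count adds
    0, 1, or 2 per cycle for its miss type.
    """
    cycles = c["cycles"]
    ld_active = c["dtlb_load_misses.walk_active"] / cycles
    st_active = c["dtlb_store_misses.walk_active"] / cycles
    # Cycles with both walkers on the same miss type: pending - active.
    ld_ld = c["dtlb_load_misses.walk_pending"] / cycles - ld_active
    st_st = c["dtlb_store_misses.walk_pending"] / cycles - st_active
    # Cycles with exactly one walk of each type in flight.
    ld_single = ld_active - ld_ld
    st_single = st_active - st_st
    # The LD+ST overlap x is the one free variable. Idle time must be
    # non-negative (lower bound) and x cannot exceed either count of
    # single-walk cycles (upper bound).
    x_min = max(0.0, ld_ld + st_st + ld_single + st_single - 1.0)
    x_max = min(ld_single, st_single)

    def breakdown(x):
        ld_only = ld_single - x
        st_only = st_single - x
        idle = 1.0 - (ld_ld + st_st + x + ld_only + st_only)
        return {
            "LD/LD": ld_ld,
            "ST/ST": st_st,
            "LD/ST": x,
            "LD/idle": ld_only,
            "ST/idle": st_only,
            "idle/idle": idle,
        }

    return {"min_overlap": breakdown(x_min), "max_overlap": breakdown(x_max)}


if __name__ == "__main__":
    for case, parts in overlap_bounds(COUNTS).items():
        print(case)
        for label, frac in parts.items():
            print(f"  {label:10s} {100 * frac:5.1f}%")
```

With five measurements and six unknown categories, the LD/ST overlap remains a free parameter, which is why only a range can be reported for the last four rows.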


Is your issue resolved? Can you share an update?