
Calculating prefetches that missed L2

Tim_D_1
Beginner
2,149 Views

Hi,

 

I am currently doing some performance tests on some offload code for Xeon Phi. I have been calculating performance numbers by measuring hardware counters using PAPI, with the calculation methods explained here:

 

https://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-2-understanding

 

However, in the memory bandwidth section (5.4), the guide says to use an event named HWP_L2MISS to count the number of hardware prefetches that missed L2. The event is apparently provided in VTune, but it does not appear in the list of available events in the PMU document here:

 

https://software.intel.com/sites/default/files/forum/278102/intelr-xeon-phitm-pmu-rev1.01.pdf

 

I assume it is a derived metric that VTune works out for you. Does anyone know how it should be calculated? Could I add the number of prefetch0 and prefetch1 requests that missed L2, as provided by the counters L2_DATA_PF1_MISS and L2_DATA_PF2_MISS, or is there more to it?

 

Thanks,

 

Tim

8 Replies
Loc_N_Intel
Employee

Hi Tim,

Let me ask the experts here and get back to you. Thank you.

McCalpinJohn
Honored Contributor III

Although they are not documented in the Intel Xeon Phi Performance Monitoring Units guide (document 327357-001), Intel's VTune includes performance monitor events that appear to be what you are looking for:

Event 0xC3, Umask 0x10: HWP_L2HIT : Hardware Prefetch L2 HIT
Event 0xC4, Umask 0x10: HWP_L2MISS : Hardware Prefetch L2 MISS

The VTune "knc_db.txt" file indicates that all of the events using Umask 0x10 should use counter 0 only, but I don't see that indicated anywhere in the documentation.
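For anyone programming these two events through a tool that takes a raw code rather than a name, here is a minimal sketch of packing the event number and unit mask into one value. It assumes the conventional x86 event-select layout (event code in bits 0-7, unit mask in bits 8-15), which should be checked against the PMU guide before use; the helper name `encode_event_select` is made up for this example.

```python
def encode_event_select(event: int, umask: int) -> int:
    """Pack an event code and unit mask into the low 16 bits of a raw value.

    Assumed layout (not confirmed in this thread): event in bits 0-7,
    unit mask in bits 8-15.
    """
    if not 0 <= event <= 0xFF:
        raise ValueError("event code must fit in 8 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")
    return (umask << 8) | event


# Event numbers and unit masks as quoted above from VTune's knc_db.txt.
HWP_EVENTS = {
    "HWP_L2HIT": (0xC3, 0x10),
    "HWP_L2MISS": (0xC4, 0x10),
}

for name, (event, umask) in HWP_EVENTS.items():
    print(f"{name}: 0x{encode_event_select(event, umask):04X}")
```

The enable, user/kernel and other control bits are left out on purpose, since their positions are not discussed here.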

Sumedh_N_Intel
Employee

Hi Tim, 

HWP_L2MISS is an actual PMU event. I can see it in the list of events in Intel VTune Amplifier XE when I configure a custom analysis.

Thanks, 

Sumedh

 

 

Tim_D_1
Beginner

Thanks for the assistance, guys, especially for mentioning the knc_db.txt file. I found that file in the VTune installation directory and it answered a fair few of my questions.

Although I note that the event John mentioned:

Event 0xC3, Umask 0x10: HWP_L2HIT : Hardware Prefetch L2 HIT

Does not seem available in VTune, or appear in the knc_db.txt file I have.

 

Just a note for anyone else: some of the events available in VTune are not available through PAPI (they are not listed by papi_native_avail), for example:

HWP_L2MISS
L2_STRONGLY_ORDERED_STREAMING_VSTORES_MISS
L2_WEAKLY_ORDERED_STREAMING_VSTORE_MISS
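A quick way to repeat this check for a longer list of events is to save the output of papi_native_avail to a file and look for each name in it. The sketch below only assumes that event names appear in that output as plain text tokens; the function name `unavailable_events` and the sample text are made up for illustration and are not real PAPI output.

```python
import re
from typing import Iterable, List


def unavailable_events(wanted: Iterable[str], avail_text: str) -> List[str]:
    """Return the wanted event names that never appear as a whole token."""
    # Split the listing into identifier-like tokens so that a short name
    # cannot match as a substring of a longer one.
    listed = set(re.findall(r"[A-Za-z0-9_]+", avail_text))
    return [name for name in wanted if name not in listed]


# Hypothetical stand-in for text captured from papi_native_avail.
sample_listing = """
| CPU_CLK_UNHALTED   |
| L2_DATA_PF2_MISS   |
"""

wanted = ["HWP_L2MISS", "CPU_CLK_UNHALTED", "L2_DATA_PF2_MISS"]
print(unavailable_events(wanted, sample_listing))
```

With real output, read the saved file into `avail_text` instead of using the sample string.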

 

Patrick_S_
New Contributor I

Tim D. wrote:

Does not seem available in VTune, or appear in the knc_db.txt file I have.

 

It is available. Just add

-knob event-config=HWP_L2MISS:sa=1000003

to your VTune command line script.

Mostly I use the following command:

amplxe-cl -collect-with runsa-knc -knob event-config=BRANCHES:sa=1000003,BRANCHES_MISPREDICTED:sa=1000003,CPU_CLK_UNHALTED:sa=10000000,DATA_CACHE_LINES_WRITTEN_BACK:sa=1000003,DATA_PAGE_WALK:sa=1000003,EXEC_STAGE_CYCLES:sa=10000000,HWP_L2MISS:sa=1000003,INSTRUCTIONS_EXECUTED:sa=10000000,L2_READ_HIT_E:sa=1000003,L2_READ_HIT_M:sa=1000003,L2_READ_HIT_S:sa=1000003,L2_READ_MISS:sa=1000003,L2_WRITE_HIT:sa=1000003,LONG_DATA_PAGE_WALK:sa=1000003,VPU_INSTRUCTIONS_EXECUTED:sa=1000003
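Since a long event-config string like this is easy to mistype, it can help to generate it from a table of event names and sample-after values. This is only a sketch: the `NAME:sa=VALUE` syntax and the amplxe-cl options are copied from the command above, while the function names are made up for the example, and the target application still has to be appended as in any normal run.

```python
from typing import Dict, List


def build_event_config(events: Dict[str, int]) -> str:
    """Join events into the NAME:sa=VALUE,... form used by the event-config knob."""
    return ",".join(f"{name}:sa={sa}" for name, sa in events.items())


def build_command(events: Dict[str, int]) -> List[str]:
    """Build the amplxe-cl argument list, without the target application."""
    return [
        "amplxe-cl",
        "-collect-with", "runsa-knc",
        "-knob", f"event-config={build_event_config(events)}",
    ]


# A short subset of the events from the command above.
events = {
    "CPU_CLK_UNHALTED": 10000000,
    "HWP_L2MISS": 1000003,
}

print(" ".join(build_command(events)))
```

Keeping the events in a dictionary also makes it harder to list the same event twice.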

Tim_D_1
Beginner

I was referring to the event HWP_L2HIT mentioned by John rather than HWP_L2MISS. I am not actually concerned with monitoring HWP_L2HIT at the moment; I was simply commenting that I did not see this event in the custom analysis event menu, nor in the knc_db file John referenced.

 

Thanks for the example of the command you use, though; this is useful.

Surya_Narayanan_N_

I tried reading HWP_L2HIT and HWP_L2MISS, and both showed "0" on all cores. How can I verify whether this is because the hardware prefetcher (HWP) is turned off?

McCalpinJohn
Honored Contributor III

That looks like the wrong event: the HWP_L2MISS event is Event 0xC4, not 0x03.

I definitely get non-zero counts for HWP_L2HIT. I am not sure if they make sense yet; that will take a lot more experimenting.
