Hi,
I am currently doing some performance tests on some offload code for Xeon Phi. I have been calculating performance numbers by measuring hardware counters using PAPI, with the calculation methods explained here:
However, in the memory bandwidth section (5.4), the guide says to use an event named HWP_L2MISS to count the number of hardware prefetches that missed L2. The event is apparently provided in VTune, although it does not appear in the list of available events in the PMU document here:
https://software.intel.com/sites/default/files/forum/278102/intelr-xeon-phitm-pmu-rev1.01.pdf
I assume it is some derived metric that VTune works out for you, but does anyone know how it should be calculated? Could I add the numbers of prefetch0 and prefetch1 requests that missed L2, as provided by the counters L2_DATA_PF1_MISS and L2_DATA_PF2_MISS, or is there more to it?
Thanks,
Tim
Hi Tim,
Let me ask the experts here and get back to you. Thank you.
Although they are not documented in the Intel Xeon Phi Performance Monitoring Units guide (document 327357-001), Intel's VTune includes performance monitor events that appear to be what you are looking for:
Event 0xC3, Umask 0x10: HWP_L2HIT : Hardware Prefetch L2 HIT
Event 0xC4, Umask 0x10: HWP_L2MISS : Hardware Prefetch L2 MISS
The VTune "knc_db.txt" file indicates that all of the events using Umask 0x10 should use counter 0 only, but I don't see that indicated anywhere in the documentation.
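For anyone who wants to program these events outside VTune, the event number and unit mask usually have to be packed into a single raw code. Here is a minimal sketch, assuming the common x86 layout with the event select in bits 0-7 and the unit mask in bits 8-15. Whether PAPI or the KNC driver accepts exactly this layout is an assumption that nothing in this thread confirms, and `raw_event_code` is a hypothetical helper name.

```python
def raw_event_code(event: int, umask: int) -> int:
    """Pack an event select and unit mask into one raw code.

    Assumed layout (not confirmed for KNC tools): umask in bits 8-15,
    event select in bits 0-7.
    """
    if not 0 <= event <= 0xFF:
        raise ValueError("event select must fit in 8 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")
    return (umask << 8) | event


# Event numbers and unit masks as quoted from VTune in this post.
HWP_L2HIT = raw_event_code(0xC3, 0x10)
HWP_L2MISS = raw_event_code(0xC4, 0x10)

print(hex(HWP_L2HIT))   # 0x10c3
print(hex(HWP_L2MISS))  # 0x10c4
```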
Hi Tim,
HWP_L2MISS is an actual PMU event. I can see it in the list of events in Intel VTune Amplifier XE when I configure a custom analysis.
Thanks,
Sumedh
Thanks for the assistance, guys, especially for mentioning the knc_db.txt file. I found that file in the VTune installation directory and it answered a fair few of my questions.
Although I note that the event John mentioned:
Event 0xC3, Umask 0x10: HWP_L2HIT : Hardware Prefetch L2 HIT
does not seem to be available in VTune, nor does it appear in the knc_db.txt file I have.
Just to note for anyone else: some of the events available in VTune are not available through PAPI (they are not listed by papi_native_avail), for example:
HWP_L2MISS
L2_STRONGLY_ORDERED_STREAMING_VSTORES_MISS
L2_WEAKLY_ORDERED_STREAMING_VSTORE_MISS
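A check like this can be scripted. The sketch below assumes you have captured the text printed by `papi_native_avail` into a string. It treats every run of letters, digits and underscores as a candidate event name, which is a simplification of the real listing format. `missing_events` is a hypothetical helper, and the sample listing is made up for illustration, not real PAPI output.

```python
import re
from typing import Iterable, List


def missing_events(listing: str, wanted: Iterable[str]) -> List[str]:
    """Return the names in `wanted` that never appear as a whole token in `listing`."""
    available = set(re.findall(r"[A-Za-z0-9_]+", listing))
    return [name for name in wanted if name not in available]


# Illustrative stand-in for captured output; on a real system this could come from
# subprocess.run(["papi_native_avail"], capture_output=True, text=True).stdout
sample_listing = """
| CPU_CLK_UNHALTED       |
| INSTRUCTIONS_EXECUTED  |
"""

wanted = [
    "CPU_CLK_UNHALTED",
    "HWP_L2MISS",
    "L2_STRONGLY_ORDERED_STREAMING_VSTORES_MISS",
    "L2_WEAKLY_ORDERED_STREAMING_VSTORE_MISS",
]

for name in missing_events(sample_listing, wanted):
    print("not listed:", name)
```

Matching whole tokens rather than substrings avoids a false match when one event name is a prefix of another.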
Tim D. wrote:
Does not seem available in VTune, or appear in the knc_db.txt file I have.
It is available. Just add
-knob event-config=HWP_L2MISS:sa=1000003
to your VTune command-line script.
Mostly I use the following command:
amplxe-cl -collect-with runsa-knc -knob event-config=BRANCHES:sa=1000003,BRANCHES_MISPREDICTED:sa=1000003,CPU_CLK_UNHALTED:sa=10000000,DATA_CACHE_LINES_WRITTEN_BACK:sa=1000003,DATA_PAGE_WALK:sa=1000003,EXEC_STAGE_CYCLES:sa=10000000,HWP_L2MISS:sa=1000003,INSTRUCTIONS_EXECUTED:sa=10000000,L2_READ_HIT_E:sa=1000003,L2_READ_HIT_M:sa=1000003,L2_READ_HIT_S:sa=1000003,L2_READ_MISS:sa=1000003,L2_WRITE_HIT:sa=1000003,LONG_DATA_PAGE_WALK:sa=1000003,VPU_INSTRUCTIONS_EXECUTED:sa=1000003
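A long event list like this is easy to mistype, so it can help to generate the `event-config` value from a table. The sketch below only assembles the `NAME:sa=N` pairs in the format shown above. `build_event_config` is a hypothetical helper, the event subset is chosen for illustration, and the application to profile is left off, as in the command above.

```python
def build_event_config(sample_after: dict) -> str:
    """Join {event name: sample-after value} pairs into 'NAME:sa=N,NAME:sa=N,...'."""
    return ",".join(f"{name}:sa={sa}" for name, sa in sample_after.items())


# Subset of the events and sample-after values from the command above.
events = {
    "CPU_CLK_UNHALTED": 10000000,
    "INSTRUCTIONS_EXECUTED": 10000000,
    "HWP_L2MISS": 1000003,
}

command = [
    "amplxe-cl",
    "-collect-with", "runsa-knc",
    "-knob", "event-config=" + build_event_config(events),
]

print(" ".join(command))
```

Keeping the command as a list of arguments means it can be passed to `subprocess.run` unchanged once the target application has been appended.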
I was referring to the event HWP_L2HIT mentioned by John rather than HWP_L2MISS. I am not actually concerned with monitoring HWP_L2HIT at the moment; I was simply commenting that I did not see this event in the custom analysis event menu, nor in the knc_db.txt file John referenced.
Thanks for the example of the command you use, though; it is useful.
I tried reading HWP_L2HIT and HWP_L2MISS, and both showed "0" on all cores. How can I verify whether this is due to the hardware prefetcher (HWP) being switched on or off?
That looks like the wrong event: the HWP_L2MISS event is Event 0xC4, not 0x03.
I definitely get non-zero counts for HWP_L2HIT. I am not sure yet whether they make sense; that will take a lot more experimenting.