Is there an Intel API for getting hardware counters from code? I'm talking about something like PAPI where you can start counters at the beginning of a function then stop the counters at the end and read them.
Thanks.
Hi John,
Can you provide some details? What do you need that functionality for?
Hi John:
Yes! It's called Intel® Performance Counter Monitor, and is included in the VTune Amplifier XE package! See the "contrib" sub-directory after installation.
BTW, @John E., I think what Vitaly was getting at is that if you help us understand your need, it may turn out that VTune Amplifier can already address it. Or it may be something we will consider for a future release. So we would appreciate your comments, and we wish you well with whatever tool you decide to use. :)
Thank you Vitaly and MrAnderson for your responses. I asked this question because two people (both with much more experience than I have) told me that this functionality exists in VTune, but I was not able to discover how to use it. We are trying to understand memory usage in a microbenchmark, and it seems to me that querying counters would be simpler, less intrusive and more accurate than the sampling approach. Maybe I'm wrong about that. Are there disadvantages to the PCM-type approach besides the fact that it requires modifying code?
I have compiled and linked my executable with the PCM object files, but it seems I need permissions to execute it. I am running on a shared Linux benchmarking machine. Do I need to talk to the administrator, or is there another way to do this?
Thanks again.
Hi John,
By "memory usage", do you mean how much memory your workload allocates, or do you want to analyze memory bandwidth? For the second one, you can create a custom analysis type based on Advanced Hotspots and select the "Analyze memory bandwidth" option. You should then be able to see memory read/write bandwidth over time on the timeline.
If you create a custom analysis type based on the General Exploration type, you can modify any and all of the "Sample After" values. However, there is an easier way, which is to modify the "sampling interval" for the GE type. Note, though, that increasing the sampling rate introduces more overhead and can therefore make your results less accurate. There is a fine line, and you need to walk it carefully when trying to get "more accurate" results.
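To make that trade-off concrete, here is a small sketch of the arithmetic behind event-based sampling. It assumes the usual model in which one sample is recorded every "Sample After" occurrences of an event; the helper names and the numbers in the usage note are made up for illustration and are not VTune Amplifier output.

```cpp
#include <cstdint>

// Estimated number of events, assuming one sample is recorded every
// `sample_after` occurrences of the event (hypothetical helper).
std::uint64_t estimated_events(std::uint64_t samples, std::uint64_t sample_after) {
    return samples * sample_after;
}

// Expected number of samples (and therefore sampling interrupts) for a given
// event count. Integer division: a partial interval produces no sample.
std::uint64_t expected_samples(std::uint64_t events, std::uint64_t sample_after) {
    return events / sample_after;
}
```

For example, 300,000,000 events with a Sample After value of 2,000,000 yield 150 samples. Dropping the value to 200,000 yields 1,500 samples for the same run, which is ten times as many interrupts perturbing the code being measured.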
The real difference between PCM and VTune Amplifier's EBS is that PCM does not give you samples of *where* the events are occurring. You just get counter values. That can be good or bad, depending on what you want to do with the data. If what you want to measure is the cache misses for a loop, using PCM is probably a good idea. It will have lower overhead (although VTune Amplifier's EBS overhead is low) and you can focus on the specific code you care about. VTune Amplifier's EBS will help you narrow your focus to potential problem areas by showing you where in your application you are experiencing the most cache misses (for example).
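The counting style described above can be sketched as a start/read pattern around a region of code. This is only an illustration of the pattern, not the PCM API: `CounterReader`, `RegionCounter`, `fake_events` and `sum_odds` are hypothetical names, and the "counter" is a plain software variable standing in for a hardware event count.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for "read one hardware counter". In real code this
// would be a call into whichever counter library you use.
using CounterReader = std::function<std::uint64_t()>;

// Reads the counter at construction and reports the delta on demand, i.e. the
// start/read pattern around a region of interest.
class RegionCounter {
public:
    explicit RegionCounter(CounterReader read)
        : read_(std::move(read)), start_(read_()) {}

    // Events counted since construction. No location information is kept.
    std::uint64_t elapsed() const { return read_() - start_; }

private:
    CounterReader read_;   // declared first so it is initialized before start_
    std::uint64_t start_;
};

// Software "event" counter that stands in for a hardware event count.
std::uint64_t fake_events = 0;

// Toy workload: sums the odd elements and counts one "event" per odd element.
long sum_odds(const int* data, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        if (data[i] % 2 != 0) {
            sum += data[i];
            ++fake_events;
        }
    }
    return sum;
}
```

Constructing a `RegionCounter` just before the loop and calling `elapsed()` just after it gives one total for the whole region. That is the counting-style answer: it tells you how many events the region generated, but not which line inside it generated them.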