I'm trying to compare the performance of two implementations of a particular function, in terms of CPU time and floating-point instructions retired. I'd prefer not to use any kind of stochastic sampling; I just want to know how many cycles and how many flops elapsed between point A and point B in my code, where this fragment is executed many times in a single program run.
Unless I'm misreading everything, VTune's sampling is stochastic, whether time-based or event-based. Is there a way to make VTune's sampling _exhaustive_, so that I get the total number of instructions/flops in a function?
I am including VTuneApi calls at the beginning and end of the function to resume and pause data collection.
Really I'm looking for something very much like PAPI (http://icl.cs.utk.edu/papi/), which doesn't support Windows/P4 machines. I'm hoping VTune can deliver this functionality.
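To make the request concrete, here is a minimal sketch of the kind of exhaustive, non-sampled measurement I mean: every execution of the fragment between point A and point B is measured and summed. The names `RegionStats`, `ScopedRegion`, `work`, and `run_demo` are hypothetical and are not part of VTune or PAPI. The sketch uses `std::chrono::steady_clock`, so it only accumulates elapsed wall-clock time; it does not count cycles or flops, which would need access to the hardware performance counters.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical accumulator for one instrumented region (not a VTune or PAPI type).
struct RegionStats {
    std::uint64_t calls = 0;            // number of times the region ran
    std::chrono::nanoseconds total{0};  // elapsed time summed over all runs
};

// RAII guard: records the time at point A on construction and adds the
// elapsed time to the accumulator at point B on destruction.
class ScopedRegion {
public:
    explicit ScopedRegion(RegionStats& stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}

    ~ScopedRegion() {
        auto end = std::chrono::steady_clock::now();
        stats_.total +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(end - start_);
        ++stats_.calls;
    }

    ScopedRegion(const ScopedRegion&) = delete;
    ScopedRegion& operator=(const ScopedRegion&) = delete;

private:
    RegionStats& stats_;
    std::chrono::steady_clock::time_point start_;
};

// Stand-in for the function whose implementations are being compared.
double work(double x) {
    for (int i = 0; i < 100; ++i) {
        x = x * 1.0000001 + 0.5;
    }
    return x;
}

// Runs the region `iterations` times and returns the accumulated totals.
RegionStats run_demo(int iterations) {
    assert(iterations >= 0);
    RegionStats stats;
    volatile double sink = 0.0;  // keeps the compiler from discarding the work
    for (int i = 0; i < iterations; ++i) {
        ScopedRegion guard(stats);  // point A
        sink = work(sink);          // region being measured
    }                               // point B: guard is destroyed each iteration
    return stats;
}
```

Calling `run_demo(1000)` returns a `RegionStats` whose `calls` field is 1000 and whose `total` field is the summed elapsed time. The timer calls themselves add overhead to every measurement, which matters when the fragment is short.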
Thanks...
-Dan
-----------------------------------
Dan Morris
dmorris@cs.stanford.edu
http://cs.stanford.edu/~dmorris
-----------------------------------