Hi, I am trying to measure the number of FLOPS my application achieves on Haswell-EP and Broadwell-EP with VTune 2017 in order to figure out how well I utilize the CPU. I found several resources on the Internet saying that I should sample certain events, for example SIMD_FP_256.PACKED_DOUBLE, but it seems that this event is supported only on certain CPU architectures. Is there a resource that lists all the events I need to sample on Haswell-EP and Broadwell-EP in order to count all FLOPS that the CPU executes?
There are no performance counter events for FP arithmetic instructions on Haswell.
The file https://download.01.org/perfmon/BDW/Broadwell_FP_ARITH_INST_V17.json describes the events on Broadwell. Note that these are floating-point arithmetic *instruction* counts, not floating-point arithmetic *operation* counts, so you will need to scale them using the FlopsMultiplier values provided in the descriptions.
Floating-point arithmetic instruction issue rate is seldom the first-order limiter of performance in recent microprocessors, and the compiler may generate code with significantly different numbers of FP arithmetic operations at different optimization levels if it thinks that this will improve overall performance. This is particularly common for vectorizable divide and square root operations.
But when approached with a bit of caution these counters should be extremely valuable in evaluating the effectiveness of vectorization, differentiating between 32-bit FP and 64-bit FP use, and generating a reasonably stable numerator value to use in metrics such as FP_Ops/second.
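To illustrate the scaling step described above, here is a minimal sketch in Python. It assumes the six Broadwell FP_ARITH_INST_RETIRED events and takes each multiplier to be the number of elements per instruction (for example, four doubles in a 256-bit register); please check both the event names and the multipliers against the FlopsMultiplier values in the file linked above. The function names and the sample counts are hypothetical and only serve the illustration.

```python
# Assumed multipliers: FP operations per retired instruction, equal to the
# number of elements in the vector. Verify against the FlopsMultiplier
# values in the Broadwell event file before relying on them.
FLOPS_PER_INSTRUCTION = {
    "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE": 1,
    "FP_ARITH_INST_RETIRED.SCALAR_SINGLE": 1,
    "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE": 2,
    "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE": 4,
    "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE": 4,
    "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE": 8,
}


def total_flops(event_counts):
    """Scale per-event instruction counts into a floating-point operation count."""
    total = 0
    for event, count in event_counts.items():
        if event not in FLOPS_PER_INSTRUCTION:
            raise KeyError(f"no multiplier known for event {event!r}")
        total += count * FLOPS_PER_INSTRUCTION[event]
    return total


def gflops_per_second(event_counts, elapsed_seconds):
    """Convert scaled counts into GFLOP/s over the measured interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_flops(event_counts) / elapsed_seconds / 1e9


if __name__ == "__main__":
    # Made-up counts, purely for illustration; real values come from the profiler.
    sample_counts = {
        "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE": 1_000_000_000,
        "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE": 2_000_000_000,
    }
    print(total_flops(sample_counts))
    print(gflops_per_second(sample_counts, elapsed_seconds=3.0))
```

Keeping the per-event counts separate until the final sum also makes it easy to see how much of the work is vectorized and how it splits between 32-bit and 64-bit FP.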
In addition to what John has said, I would like to point out that another Intel product, Advisor 2017, does have FLOPS collection, when enabled in the trip counts analysis tab of the project properties.
https://software.intel.com/sites/default/files/managed/6c/d6/release_notes_advisor_xe.pdf
Thanks! Actually I figured out that a full list of events can be found in the installed documentation.
I gave Advisor a try but I am sticking with VTune even if I have to do the math myself.
Alex S. (Intel) wrote: In addition to what John has said, I would like to point out that another Intel product, Advisor 2017, does have FLOPS collection, when enabled in the trip counts analysis tab of the project properties.
https://software.intel.com/sites/default/files/managed/6c/d6/release_notes_advisor_xe.pdf
Hello Alex,
I used Intel Advisor to calculate the FLOPS, but it reports the following:
Elapsed Time: 136.81s
Total CPU time: 245.166
Time in 46 vectorized loops: 37.7451
GFLOPS: 0
GINTOPS: 0.01
Do you know how to resolve this problem?
Thank you.
