Analyzers
Support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

## How to interpret FP_SSE values
199 Views
I've read that one SSE register contains 128 bits, whereas a double uses 64 bits. If I count the SSE operations and get, e.g., 10,000, does this mean that there were 20,000 floating point operations? I am interested in how many FLOPs were executed, so I sum up the events FP_X87 and FP_SSE, but I don't know how to interpret the FP_SSE values.
4 Replies
tim18 replied:
You would have to count both total SSE operations and SSE parallel operations in order to assess how many operations were parallel, if that is your goal. If all your parallel operations are on doubles, then you could calculate your result. If you can safely assume that all the SSE operations were parallel, your assumption follows from that.
Unfortunately, there are cases where an SSE parallel operation is performed on a single operand for consistency or to optimize register renaming. You wouldn't be able to count those in VTune unless you refer to your source code to check whether an operation is serial (including remainder loops for vectorized code) or a truly useful parallel operation in vectorized code.
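The arithmetic described in this reply can be sketched roughly as follows. This is only an illustration: the function name and the sample counts are made up, and it assumes you already have separate counts for total and packed SSE operations. Because a packed operation may do useful work on only one operand, as noted above, the result is best read as an upper bound.

```python
def estimate_flops(sse_total, sse_packed, lanes=2):
    """Estimate FLOPs from SSE operation counts (hypothetical helper).

    sse_total  -- all SSE floating point operations counted
    sse_packed -- the subset that were packed (SIMD) operations
    lanes      -- elements per 128-bit register: 2 for double, 4 for float
    """
    if not 0 <= sse_packed <= sse_total:
        raise ValueError("sse_packed must be between 0 and sse_total")
    sse_scalar = sse_total - sse_packed
    # Each scalar operation is one FLOP; each packed operation is assumed
    # to do useful work in every lane, which makes this an upper bound.
    return sse_scalar + sse_packed * lanes


# Made-up counts: 10,000 SSE operations, 6,000 of them packed doubles.
print(estimate_flops(10_000, 6_000))   # 4,000 scalar + 6,000 * 2 packed
# If every operation were packed double, the count simply doubles.
print(estimate_flops(10_000, 10_000))
```

With no packed operations at all, the estimate equals the raw SSE count, which is the other end of the range.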
benarisse replied:
I am not interested in how many operations were parallel. I would like to know the absolute and relative number of floating point operations executed during a run of a particular program. Therefore I count the events FP_COMP_OPS_EXE.SSE_FP and FP_COMP_OPS_EXE.X87. The sum of these two events should be the absolute number of floating point operations executed. To get the relative number, I divide this sum by the INST_RETIRED.ANY count.
At the moment, I don't care about parallelism - sometimes I start the program with only one thread.
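As a small sketch of the calculation described in this post (the function name and the sample counts are hypothetical, not values from a real run), the absolute and relative figures could be computed like this. Note that if a packed SSE operation is counted only once, the sum is a number of operations executed rather than a number of FLOPs.

```python
def fp_summary(sse_fp, x87, inst_retired):
    """Return (absolute FP operation count, share of retired instructions).

    sse_fp       -- count for FP_COMP_OPS_EXE.SSE_FP
    x87          -- count for FP_COMP_OPS_EXE.X87
    inst_retired -- count for INST_RETIRED.ANY
    """
    if inst_retired <= 0:
        raise ValueError("inst_retired must be positive")
    fp_ops = sse_fp + x87
    return fp_ops, fp_ops / inst_retired


# Made-up counts, for illustration only.
total, share = fp_summary(sse_fp=10_000, x87=2_000, inst_retired=48_000)
print(total, f"{share:.1%}")
```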
tim18 replied:
The word "parallel" is used both for Instruction Level Parallelism or Single Instruction Multiple Data parallel instructions, and for threaded, message passing, and other forms of higher level parallelism. You could get a count of SSE parallel (SIMD) instructions, in case you wish to distinguish them from the SSE serial (single operand) instructions. If you aren't interested, I don't see the point of your previous posts.
benarisse replied:
I got it! Thanks again! 