
FP Single Precision Packed SIMD operations in DP code

schorscherl
Hi,

This is an excerpt from a VTune (for Linux) sampling activity with the events
'Packed Single-precision Floating-point Streaming SIMD Extension Instructions Retired' (SIMD) and 'Clockticks':

Address  Line  SIMD   Clockticks  Source
0x1eef6  219   1842   5006        for ( k=0; k < n; k++ )
         220   0      0           {
0x1efd9  221   64438  12381       sre+=(m1r*m2r+m1i*m2i);
0x1f002  222   13996  4692        sim+=(m1r*m2i-m1i*m2r);
         223   0      0           }
0x1f3cc  224   0      41          rr=sre; ri=sim;

(view in monospace font to make sense of it).

k, n, and j are integers; everything else is a double scalar or a
pointer, so no float is involved. Nevertheless, VTune counts the
above-mentioned packed single-precision events, and they make up a
significant fraction of the overall FLOPs count (measured by another
tool that sums x87, packed and scalar SP, and packed and scalar DP SIMD).
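
For anyone who wants to try this outside the full application, the loop can be sketched as a small standalone function. This is only a minimal sketch: the function name cmac, the array parameters ar, ai, br, bi, and the way m1r, m1i, m2r, m2i are loaded from them are assumptions, since the excerpt above does not show where those values come from.

```c
#include <stddef.h>

/* Hypothetical reconstruction of the profiled loop (source lines 219-224).
 * ar/ai and br/bi are assumed to hold the real and imaginary parts of two
 * double-precision vectors; the original code may load them differently. */
void cmac(const double *ar, const double *ai,
          const double *br, const double *bi,
          int n, double *rr, double *ri)
{
    double sre = 0.0, sim = 0.0;
    int k;

    for (k = 0; k < n; k++) {
        /* Assumed loads: everything here is double, no float anywhere. */
        double m1r = ar[k], m1i = ai[k];
        double m2r = br[k], m2i = bi[k];

        sre += (m1r * m2r + m1i * m2i);
        sim += (m1r * m2i - m1i * m2r);
    }

    *rr = sre;
    *ri = sim;
}
```

Building a function like this with the same compiler and options as the real code, and then looking at the generated assembly for the loop body, should show which instructions are actually emitted for lines 221 and 222.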

So what is going on here?

TIA,
Georg.