
FP Single Precision Packed SIMD operations in DP code

schorscherl
Hi,

This is an excerpt from a VTune (for Linux) sampling activity with the events
'Packed Single-precision Floating-point Streaming SIMD Extension Instructions Retired' and 'Clockticks':

Address   Line   SIMD    Clockticks   Source
0x1eef6   219     1842    5006        for ( k=0; k < n; k++ )
          220        0       0        {
0x1efd9   221    64438   12381            sre+=(m1r*m2r+m1i*m2i);
0x1f002   222    13996    4692            sim+=(m1r*m2i-m1i*m2r);
          223        0       0        }
0x1f3cc   224        0      41        rr=sre; ri=sim;

(view in monospace font to make sense of it).

k, n and j are integers; the rest are double scalars or pointers,
so no float is involved. Nevertheless, VTune counts the above-mentioned
packed single-precision events, and they account for a significant
fraction of the overall FLOPs count (measured by another tool by
summing up x87, packed and scalar SP and DP SIMD).
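
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the loop. The function name conj_dot, the four separate input arrays and the way m1r, m1i, m2r and m2i are loaded are assumptions made for illustration; the real code around lines 219-224 may load them differently. Only the two accumulation statements are taken from the excerpt. Everything floating-point is declared double, so compiling it to assembly (for example with the compiler's -S option) lets one check which instructions are actually generated for the loop.

```c
#include <stddef.h>

/* Hypothetical reconstruction, not the original source.
 * Accumulates sum_k conj(m1[k]) * m2[k] for complex values stored as
 * separate real and imaginary double arrays (an assumed layout). */
void conj_dot(const double *m1re, const double *m1im,
              const double *m2re, const double *m2im,
              int n, double *rr, double *ri)
{
    double sre = 0.0, sim = 0.0;   /* double accumulators, no float */
    int k;

    for (k = 0; k < n; k++) {
        /* assumed loads; the excerpt does not show them */
        double m1r = m1re[k], m1i = m1im[k];
        double m2r = m2re[k], m2i = m2im[k];

        sre += (m1r*m2r + m1i*m2i);   /* line 221 in the excerpt */
        sim += (m1r*m2i - m1i*m2r);   /* line 222 in the excerpt */
    }

    *rr = sre;                        /* line 224 in the excerpt */
    *ri = sim;
}
```

Calling it with m1 = {1+2i, 3+4i} and m2 = {5+6i, 7+8i} gives rr = 70 and ri = -8, which is a quick sanity check that the sketch matches the arithmetic in the excerpt.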

So what is going on here?

TIA,
Georg.