Hello all,
I want to measure the double precision SSE MFlops/s on i7/Nehalem.
What are the right events to do so?
On Core 2 there were the events:
SIMD_COMP_INST_RETIRED_PACKED_DOUBLE and
SIMD_COMP_INST_RETIRED_SCALAR_DOUBLE
from this I could easily compute the MFlops.
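For reference, the Core 2 calculation can be sketched as below. This is a minimal illustration, assuming each packed double instruction performs 2 floating point operations (two doubles per 128-bit SSE register) and each scalar double instruction performs 1. The function name and the counter readings are hypothetical.

```python
def sse_double_mflops(packed_double, scalar_double, seconds):
    """Estimate double precision SSE MFlops/s from two instruction counts.

    packed_double: count of SIMD_COMP_INST_RETIRED_PACKED_DOUBLE
    scalar_double: count of SIMD_COMP_INST_RETIRED_SCALAR_DOUBLE
    seconds:       wall-clock time of the measured region
    """
    # Assumption: a packed instruction works on two doubles, a scalar on one.
    flops = 2 * packed_double + scalar_double
    return flops / seconds / 1e6


# Hypothetical counter readings for a 2 second run.
print(sse_double_mflops(packed_double=400_000_000,
                        scalar_double=200_000_000,
                        seconds=2.0))
```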
On i7 these events are missing. Instead I have:
FP_COMP_OPS_EXE_SSE_FP_PACKED
FP_COMP_OPS_EXE_SSE_FP_SCALAR
and
FP_COMP_OPS_EXE_SSE_SINGLE_PRECISION
FP_COMP_OPS_EXE_SSE_DOUBLE_PRECISION
From this I cannot compute e.g. packed double precision uops.
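To illustrate the problem: the four counts only give a range for the packed double uops, not the exact value. The sketch below assumes that every counted uop is either packed or scalar and either single or double precision, which may not hold exactly on real hardware. The function name and the numbers are hypothetical.

```python
def packed_double_bounds(packed, scalar, single, double):
    """Return (lower, upper) bounds on the number of packed double uops.

    Assumption: packed + scalar and single + double describe the same
    set of uops, split two different ways.
    """
    # Scalar double uops cannot exceed the scalar count, and packed single
    # uops cannot exceed the single count; both give a lower bound.
    lower = max(0, double - scalar, packed - single)
    # Packed double uops cannot exceed either the packed or the double count.
    upper = min(packed, double)
    return lower, upper


# Hypothetical counts: the exact packed double count stays unknown.
print(packed_double_bounds(packed=300, scalar=100, single=150, double=250))
```

Only when the two bounds coincide (for example, when the code uses no single precision at all) do the four events pin down the packed double count.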
There are also the events:
SSEX_UOPS_RETIRED_PACKED_DOUBLE
SSEX_UOPS_RETIRED_SCALAR_DOUBLE
They count more than just the computational instructions, so the
result is not accurate enough.
Am I missing something, or is it simply not possible to measure
e.g. double precision SSE MFlops on i7?
Greetings and thank you for your help,
Jan
Dear Jan,
There are other events on Core i7/Nehalem that you could use to achieve your objective (as I understand it, you want to count packed and scalar floating point operations for double and single precision numbers in order to calculate a FLOPS rate). Please look at:
EventNum: 10H, Umask value: 10H, FP_COMP_OPS_EXE.SSE_FP_PACKED - Counts number of SSE FP packed uops executed.
EventNum: 10H, Umask value: 20H, FP_COMP_OPS_EXE.SSE_FP_SCALAR - Counts number of SSE FP scalar uops executed.
EventNum: 10H, Umask value: 40H, FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION - Counts number of SSE* FP single precision uops executed.
EventNum: 10H, Umask value: 80H, FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION - Counts number of SSE* FP double precision uops executed.
BTW, to count all FLOPS, you may add FP_COMP_OPS_EXE.X87 (EventNum: 10H, Umask value: 01H) event, as some applications (predominantly 32-bit ones) still may be using x87.
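As a small sketch of how these event number and umask pairs can be turned into raw event codes: the event number occupies bits 0-7 and the umask bits 8-15 of the event select register, so the combined code is `(umask << 8) | event`. Tools that accept raw codes (for example Linux perf with its `rUUEE` notation) are an assumption here, not something the events above depend on. The helper name is hypothetical.

```python
# (event number, umask) pairs as listed above.
EVENTS = {
    "FP_COMP_OPS_EXE.X87":                  (0x10, 0x01),
    "FP_COMP_OPS_EXE.SSE_FP_PACKED":        (0x10, 0x10),
    "FP_COMP_OPS_EXE.SSE_FP_SCALAR":        (0x10, 0x20),
    "FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION": (0x10, 0x40),
    "FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION": (0x10, 0x80),
}


def raw_code(event_num, umask):
    """Combine event number (bits 0-7) and umask (bits 8-15)."""
    return (umask << 8) | event_num


for name, (event_num, umask) in EVENTS.items():
    # Printed in the rUUEE form used by tools that take raw event codes.
    print(f"{name}: r{raw_code(event_num, umask):04x}")
```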
A detailed description of the Core i7/Nehalem events is available in Appendix A2, "Performance Monitoring Events for Intel Core i7 Processor Family", of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2. It is available for download from our website, www.intel.com/products/processor/manuals/.
Hope this helps!