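I can compute the GFLOP count from the number of instructions and the instruction type.
For example, I have 17694720000 FMA3 instructions, so I have 17694720000*16 = 283 GFLOP (16 FLOP per instruction: 2 operations per FMA times 8 vector lanes). Great! Intel Advisor gives the same number.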
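The arithmetic above can be sketched as follows. This is a minimal check, assuming each FMA counts as 2 floating-point operations per lane and that the vectors hold 8 lanes (256-bit AVX registers with single-precision elements); the lane count is an assumption inferred from the factor of 16, not something the report states directly.

```python
# Counts taken from the post; per-instruction factors are assumptions.
FMA_COUNT = 17_694_720_000   # FMA instructions reported by the tool
FLOP_PER_FMA = 2             # one multiply + one add per lane
LANES = 8                    # assumed: 256-bit vector / 32-bit floats


def gflop(instr_count: int, flop_per_lane: int, lanes: int) -> float:
    """Return giga floating-point operations for a vector instruction count."""
    return instr_count * flop_per_lane * lanes / 1e9


fma_gflop = gflop(FMA_COUNT, FLOP_PER_FMA, LANES)
print(f"FMA GFLOP: {fma_gflop:.5f}")
```

The result agrees with the Self GFLOP value of 283.11552 in the report, which suggests the remaining AVX instructions contribute no counted floating-point operations in this loop.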
However, how do I get the "Data transfers between CPU and memory sub-system (total traffic, including L1, L2, LLC and DRAM traffic)" number (the one in the bottom right of the report)? It does not match the number of memory load instructions.
Instruction Set | AVX; FMA |
Self time | 3.842s |
AVX | 34% (11796480000) |
FMA | 51% (17694720000) |
x86 | 6% (2027520000) |
Other | 9% (3133440000) |
Statistics for FLOPS And Data Transfers
Self GFLOPS | 808.89193 | Giga Floating-point Operations Per Second Self GFLOPS = Self GFLOP / Self Elapsed Time |
Self AI | 2.18182 | Self AI - Self Arithmetic Intensity - Ratio Of Self Floating-Point Operations To Self L1 Transferred Bytes |
Self GFLOP | 283.11552 | Giga Floating-Point Operations, Not Including GFLOP For Functions Called In The Loop Or Function |
Self Elapsed Time | 0.350s | Elapsed Time Is The Exclusive (Self-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Elapsed Time Is Equal To Self-Time |
Total Elapsed Time | 0.350s | Total Elapsed Time Is The Inclusive (Total-Time-Based) Wall Time From The Beginning To The End Of Loop/Function Execution. For Single-Threaded Applications Total Elapsed Time Is Equal To Total-Time |
Data transfers between CPU and memory sub-system (total traffic, including L1, L2, LLC and DRAM traffic)
In Giga Bytes, Not Including Transfers For Functions Called In The Loop Or Function | 129.76128 | |
In Giga Bytes Per Second | 370.74213 |
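The relationships between the reported numbers can be cross-checked with a short script. This is only a sketch based on the metric descriptions in the table (GFLOPS = GFLOP / elapsed time, AI = GFLOP / transferred bytes); the variable names are mine, and the elapsed time is the rounded 0.350s from the report, so the rates only match approximately.

```python
# Values copied from the report above.
SELF_GFLOP = 283.11552        # giga floating-point operations
SELF_ELAPSED_S = 0.350        # seconds (rounded in the report)
TRANSFERRED_GB = 129.76128    # gigabytes of data transfers
AVX_COUNT = 11_796_480_000    # non-FMA AVX instructions

# Derived metrics, following the definitions given in the table.
self_gflops = SELF_GFLOP / SELF_ELAPSED_S          # reported: 808.89193
self_ai = SELF_GFLOP / TRANSFERRED_GB              # reported: 2.18182
bandwidth_gbs = TRANSFERRED_GB / SELF_ELAPSED_S    # reported: 370.74213

# Observation only: transferred bytes divided by the AVX instruction count.
# This does NOT prove that those instructions are the memory accesses; it
# just shows that traffic is measured in bytes, so it depends on operand
# width and cannot equal a plain instruction count.
bytes_per_avx_instr = TRANSFERRED_GB * 1e9 / AVX_COUNT

print(f"Self GFLOPS ~ {self_gflops:.2f}")
print(f"Self AI     ~ {self_ai:.5f}")
print(f"GB/s        ~ {bandwidth_gbs:.2f}")
print(f"bytes per AVX instruction ~ {bytes_per_avx_instr:.2f}")
```

Since Self AI is described as the ratio to "Self L1 Transferred Bytes", the 129.76128 GB figure is consistent with bytes moved by memory operands rather than a count of load instructions.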