## What is the formula for data transfers between CPU and memory sub-system (total traffic, including L1, L2, LLC and DRAM traffic)?

I can compute GFLOP from the number of instructions and the instruction type.

For example, I have 17694720000 FMA3 instructions, so I get 17694720000 × 2 × 8 = 283 GFLOP. Great! Intel Advisor gives the same number.
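The arithmetic can be reproduced with a short sketch. It assumes each FMA is a 256-bit AVX single-precision instruction, that is, 8 lanes with 2 floating-point operations (multiply and add) per lane. That factor of 16 is an assumption, chosen because it reproduces the 283.11552 GFLOP that Advisor reports.

```python
# Instruction count copied from the Advisor "Dynamic Instruction Mix Summary".
fma_instructions = 17_694_720_000

# Assumption: 256-bit AVX, single precision -> 8 lanes per instruction,
# and each FMA lane counts as 2 FLOP (one multiply, one add).
lanes = 8
flop_per_lane = 2

self_gflop = fma_instructions * lanes * flop_per_lane / 1e9
print(f"Self GFLOP: {self_gflop:.5f}")
```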

However, how do I get the data transfers between CPU and memory sub-system (total traffic, including L1, L2, LLC and DRAM traffic)? (This is the number in the bottom right.) It does not match the number of memory load instructions.

Instruction set: AVX; FMA. Self time: 3.842s.

Dynamic Instruction Mix Summary

- Memory: 34% (11796480000)
  - Vector: 34% (11796480000)
    - AVX: 34% (11796480000)
- Compute: 57% (19722240000)
  - Vector: 51% (17694720000)
    - FMA: 51% (17694720000)
  - Scalar: 6% (2027520000)
    - x86: 6% (2027520000)
- Other: 9% (3133440000)
Statistics for FLOPS And Data Transfers

- Self GFLOPS: 808.89193. Giga floating-point operations per second. Self GFLOPS = Self GFLOP / Self Elapsed Time.
- Self AI: 2.18182. Self arithmetic intensity: ratio of self floating-point operations to self L1 transferred bytes.
- Self GFLOP: 283.11552. Giga floating-point operations, not including GFLOP for functions called in the loop or function.
- Self Elapsed Time: 0.350s. Elapsed time is the exclusive (self-time-based) wall time from the beginning to the end of loop/function execution. For single-threaded applications, elapsed time is equal to self time.
- Total Elapsed Time: 0.350s. Total elapsed time is the inclusive (total-time-based) wall time from the beginning to the end of loop/function execution. For single-threaded applications, total elapsed time is equal to total time.

Data transfers between CPU and memory sub-system (total traffic, including L1, L2, LLC and DRAM traffic)

- In gigabytes, not including transfers for functions called in the loop or function: 129.761
- In gigabytes per second: 370.742
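As a consistency check on how the reported figures relate to each other, here is a small sketch using only the numbers above. It is not Advisor's documented formula. It only shows that the reported AI equals GFLOP divided by gigabytes, that both rates follow from the 0.350s elapsed time, and that the traffic works out to roughly 11 bytes per memory instruction. That last figure may hint that the metric counts bytes accessed per instruction (operand sizes) rather than the number of instructions, but that reading is an assumption.

```python
# Values copied from the Advisor report.
self_gflop = 283.11552
self_gbytes = 129.761
elapsed_s = 0.350
memory_instructions = 11_796_480_000

# Reported Self AI is 2.18182.
ai = self_gflop / self_gbytes

# Reported Self GFLOPS is 808.89193; the small gap comes from the rounded elapsed time.
gflops = self_gflop / elapsed_s

# Reported bandwidth is 370.742 GB/s.
gbytes_per_s = self_gbytes / elapsed_s

# Average bytes moved per memory instruction (derived, not reported by Advisor).
bytes_per_mem_instr = self_gbytes * 1e9 / memory_instructions

print(f"AI: {ai:.4f}")
print(f"GFLOPS: {gflops:.2f}")
print(f"GB/s: {gbytes_per_s:.2f}")
print(f"Bytes per memory instruction: {bytes_per_mem_instr:.2f}")
```

An average of about 11 bytes would be consistent with a mix of access widths, for example some 32-byte vector loads and some 4-byte scalar or broadcast loads, though the report above does not break this down.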

