I have been experimenting with Intel Advisor 2020, in particular with the roofline model. What I can't quite understand is why the peak scalar integer performance (intop/cycle) differs from the theoretical value I would expect, especially since all the other metrics (vector integer performance, floating point, ...) more or less match.
Specifically, Intel Advisor reports a peak scalar integer performance (for add/mul) of around 2.3 integer operations per cycle, while the theoretical value I would expect is 4 intop/cycle, since we have 4 integer ALUs on 4 different ports.
Am I missing something?
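To put numbers on the gap, here is a small Python sketch. The 4 intop/cycle and 2.3 intop/cycle figures are the ones quoted above; the 3.0 GHz clock is a made-up value for illustration, not the frequency of my CPU, and the function names are just hypothetical helpers.

```python
# Figures from the post: 4 scalar integer ALU ports, ~2.3 intop/cycle reported.
THEORETICAL_INTOP_PER_CYCLE = 4.0
MEASURED_INTOP_PER_CYCLE = 2.3


def peak_gintops(ops_per_cycle, freq_ghz):
    """Operations per cycle times clock in GHz gives giga-operations per second."""
    return ops_per_cycle * freq_ghz


def fraction_of_peak(measured, theoretical):
    """Share of the theoretical peak that the measured value reaches."""
    return measured / theoretical


if __name__ == "__main__":
    freq_ghz = 3.0  # hypothetical clock, only for illustration
    share = fraction_of_peak(MEASURED_INTOP_PER_CYCLE, THEORETICAL_INTOP_PER_CYCLE)
    print(f"theoretical roof: {peak_gintops(THEORETICAL_INTOP_PER_CYCLE, freq_ghz):.2f} GINTOP/s")
    print(f"reported roof:    {peak_gintops(MEASURED_INTOP_PER_CYCLE, freq_ghz):.2f} GINTOP/s")
    print(f"share of theoretical peak: {share:.1%}")
```

So the reported scalar integer roof sits at a bit under 60% of what the port count alone would suggest, which is a much larger gap than I see for the other roofs.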
Thanks for noticing this problem! We will investigate the issue. There are no obvious extra hardware limits for scalar integer ops, so our benchmark may be reporting a suboptimal value.
General answer from the engineering team: "Advisor benchmarks are sensitive to CPU usage. Make sure the CPU wasn't actively used by another process during the Advisor collection run. It is difficult to be more specific without the CPU device ID, so please send, for example, the Advisor project."
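A rough way to check that the machine is quiet before starting a collection is to look at the system load average. The sketch below is only an assumption about how one might do that, not something Advisor provides: the 0.05 per-core threshold and the helper names are hypothetical, and `os.getloadavg()` is available only on Unix-like systems.

```python
import os


def looks_idle(load_1min, cores, max_load_per_core=0.05):
    """Return True if the 1-minute load per core is at or below the threshold."""
    return (load_1min / max(cores, 1)) <= max_load_per_core


def current_load_1min():
    """Return the 1-minute load average, or None where it is unavailable."""
    try:
        return os.getloadavg()[0]
    except (AttributeError, OSError):
        # AttributeError: platform without getloadavg (e.g. Windows).
        # OSError: the load average could not be obtained.
        return None


if __name__ == "__main__":
    load = current_load_1min()
    cores = os.cpu_count() or 1
    if load is None:
        print("load average not available on this platform")
    elif looks_idle(load, cores):
        print("machine looks idle; reasonable time to start the collection")
    else:
        print(f"1-minute load {load:.2f} on {cores} cores; other processes may skew the roofs")
```

A low load average does not guarantee an undisturbed run, but a high one is a good hint that the measured roofs may come out lower than the hardware allows.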