Dear experts,
I have a question regarding the "Retire Stalls" hardware-event metric.
Here it says:
This metric is defined as a ratio of the number of cycles when no micro-operations are retired to all cycles. In the absence of performance issues, long latency operations, and dependency chains, retire stalls are insignificant. Otherwise, retire stalls result in a performance penalty. On Intel microarchitecture codename Nehalem, this metric is based on precise events that do not suffer from significant skid.
From the definition, I would expect the ratio never to exceed 1, since the cycles with no retired micro-operations are a subset of all cycles. However, in my Amplifier XE run, I see values as large as 39 in my application.
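To make my reading of the definition concrete, here is a minimal sketch of how I understand the ratio. The function name and the counts are hypothetical, made up only for illustration; this is not how Amplifier XE necessarily computes the metric internally.

```python
def retire_stall_ratio(stall_cycles: int, total_cycles: int) -> float:
    """Fraction of cycles in which no micro-operations were retired.

    Both arguments are hypothetical raw counts, not actual tool output.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return stall_cycles / total_cycles


# Made-up counts: 390 stalled cycles out of 1000 total cycles.
ratio = retire_stall_ratio(stall_cycles=390, total_cycles=1000)
print(ratio)
```

Under this reading, a value of 39 would mean the stall-cycle count is 39 times the total cycle count, which is what confuses me.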
Could someone shed some light on this number? Thanks a lot!