I am trying to count the UOPS dispatched to execution ports on a Haswell platform using the hardware performance counters. The relevant performance event is UOPS_EXECUTED_PORT, with individual umasks for counting executions on PORT_0 through PORT_7. This event works fine for counting UOPS dispatched to individual ports.
However, I would like to count UOPS dispatched to the non-memory ports (Port 0, 1, 5, 6) using a single counter, and I am not sure how to do it. On Nehalem there was a specific umask for this, UOPS_EXECUTED:PORT015, but it is not listed for Haswell. I tried combining the umasks for Port_0 (0x01), Port_1 (0x02), Port_5 (0x20) and Port_6 (0x40) into one event as UOPS_EXECUTED_PORT:0x63, but the value reported by this event doesn't match the sum of the counts on the individual ports.
Does anyone know if it is possible?
The UOPS_EXECUTED_PORT.PORT_* events for Haswell carry the description:
"Cycles which a uop is dispatched on port * in this thread."
Combining the umasks for different ports should count cycles in which a uop is dispatched on any one of the selected ports, not the sum of the uops dispatched on those ports.
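To make the difference concrete, here is a toy model in Python. It is only a sketch of the reading given above (a combined umask counts a cycle once if any selected port dispatched), not a description of the actual hardware. The per-cycle trace is made-up data, and the helper name count_cycles is hypothetical.

```python
# Umasks for the non-memory ports, as given in the question.
PORT_UMASK = {0: 0x01, 1: 0x02, 5: 0x20, 6: 0x40}
COMBINED_UMASK = 0x01 | 0x02 | 0x20 | 0x40  # 0x63


def count_cycles(trace, umask):
    """Count cycles in which at least one port selected by umask dispatched a uop."""
    return sum(1 for busy_ports in trace if busy_ports & umask)


# Made-up trace: one bitmask per cycle, one bit per port that dispatched a uop.
trace = [
    0x03,  # ports 0 and 1
    0x01,  # port 0
    0x00,  # nothing dispatched
    0x62,  # ports 1, 5 and 6
    0x40,  # port 6
]

per_port = {port: count_cycles(trace, mask) for port, mask in PORT_UMASK.items()}
sum_of_ports = sum(per_port.values())
any_port = count_cycles(trace, COMBINED_UMASK)

print(per_port)      # {0: 2, 1: 2, 5: 1, 6: 2}
print(sum_of_ports)  # 7
print(any_port)      # 4
```

In this model the combined event can never exceed the per-port sum, and the two agree only when no two selected ports dispatch in the same cycle.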
Based on this description as a "cycle count" event, it does not appear that you can get the sum with a single event -- you will need to count separately on the four ports and add the results.
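If it helps, here is a minimal Python sketch of the "count separately and add" approach. It assumes 0xA1 is the event select for UOPS_EXECUTED_PORT on Haswell and that the tool accepts raw events in Linux perf's rUUEE form (umask, then event select). Neither detail comes from this thread, so please check them against the event tables for your CPU. The counts dictionary is a hypothetical input format for whatever your tool reports.

```python
# Assumption: 0xA1 is the event select for UOPS_EXECUTED_PORT on Haswell.
EVENT_SELECT = 0xA1

# Umasks for the non-memory ports, as given in the question.
NON_MEMORY_PORT_UMASKS = {0: 0x01, 1: 0x02, 5: 0x20, 6: 0x40}


def raw_event(umask, event=EVENT_SELECT):
    """Format one raw event in perf's rUUEE notation (assumed format)."""
    return f"r{umask:02x}{event:02x}"


def perf_event_list():
    """Build a comma-separated list with one raw event per non-memory port."""
    return ",".join(raw_event(mask) for mask in NON_MEMORY_PORT_UMASKS.values())


def total_non_memory(counts):
    """Add up per-port counts, keyed by port number (hypothetical input format)."""
    return sum(counts[port] for port in NON_MEMORY_PORT_UMASKS)


print(perf_event_list())  # r01a1,r02a1,r20a1,r40a1
```

The resulting string could be passed to something like perf stat -e, which uses four counters in one run. The four reported values are then added with total_non_memory.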
I just noticed that the event description changed after Nehalem. On Nehalem, the relevant event was UOPS_EXECUTED:PORT* and the description was "Counts the number of Uops executed on port *". On both Sandy Bridge (UOPS_DISPATCHED_PORT.PORT_*) and Haswell (UOPS_EXECUTED_PORT.PORT_*), the description changed to cycles in which a uop is dispatched to the port. That was partly why I was confused.
I guess I have to stick to using separate counters for counting these events.