Software Tuning, Performance Optimization & Platform Monitoring
Discussion around monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform monitoring

Performance monitoring of mulss and imul on SMT

proy4

I am trying to understand port utilization on Sandy Bridge while running multiplication instructions.

I am running three versions of a multiplication benchmark. In case 1, both sibling SMT threads run floating-point multiplication (mulss, port 0). In case 2, both sibling SMT threads run integer multiplication (imul, port 1). In case 3, one sibling SMT thread runs mulss (port 0) and the other runs imul (port 1). When I measure the utilization of ports 0 and 1 using UOPS_DISPATCHED_PORT, the utilization of both ports looks similar in case 1 and case 3. I expected port 1 to be more heavily utilized in case 3 than in case 1, since port 1 executes the imul operations.
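To make the comparison concrete, "utilization" of a port can be read as the number of cycles in which the port dispatched a uop, divided by the cycles elapsed. The post does not say which denominator was used, so the sketch below assumes an unhalted cycle count for the same thread. The function names and the layout of `readings` are hypothetical and only serve the illustration.

```python
def port_utilization(port_cycles, total_cycles):
    """Return the fraction of cycles in which the port dispatched a uop.

    port_cycles:  a UOPS_DISPATCHED_PORT count for one port
    total_cycles: cycle count for the same thread (assumed denominator)
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return port_cycles / total_cycles


def summarize(readings):
    """Map each case name to its port 0 and port 1 utilization.

    `readings` uses a hypothetical layout:
    {case_name: {"cycles": int, "port0": int, "port1": int}}
    """
    return {
        case: {
            "port0": port_utilization(r["port0"], r["cycles"]),
            "port1": port_utilization(r["port1"], r["cycles"]),
        }
        for case, r in readings.items()
    }
```

Keeping the readings of each sibling thread in separate entries makes it explicit which thread a given count came from when comparing case 1 against case 3.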

UOPS_DISPATCHED_PORT:PORT_1 measures cycles per thread. Does that mean it observes only one thread and cannot report on the sibling SMT thread?
