Software Tuning, Performance Optimization & Platform Monitoring
Discussion around monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform monitoring

40-bit PMC counters, can they be used for marking start/stops

halivingston
Beginner
265 Views

FIXED_CTR0 has instructions retired being counted. If the counter is only 40-bit (pre-Nehalem), is there a use of this counter without sampling?

To paraphrase, if all I'm doing is RDPMC on a certain code site, and then RDPMC on another code site, this is bound to overflow right?

Or does that not matter if I always subtract the two numbers and multiply it by negative 1?

0 Kudos
1 Solution
Patrick_F_Intel1
Employee
266 Views
Hello halivingston,
There is always the danger of overflowing the counters if they are used in the free-running mode (or sometimes called 'counting mode').
We can estimate how long before the counter will wrap:

Max instructions/sec @ 5 inst/cycle= frequency * 5 inst/cycle
Just for ease of computing say the freq is 2 Ghz then we could get 10 billion (1.0e10)instructions/sec.
A 40 bit counter can count 1,099,511,627,775 (or ~1.1e12) before wrapping.
So, at this example 2GHz frequency and 5 instructions/cycle, we could run about 100 seconds before wrapping.

Of course, some CPUs have a higher frequency but many codes probably don't sustain retiring 5 instructions/cycle.
So... depending on these factors, the wrap around may be longer.
Pat

View solution in original post

4 Replies
Bernard
Black Belt
266 Views
Try to consultIntel Manuals.
Patrick_F_Intel1
Employee
267 Views
Hello halivingston,
There is always the danger of overflowing the counters if they are used in the free-running mode (or sometimes called 'counting mode').
We can estimate how long before the counter will wrap:

Max instructions/sec @ 5 inst/cycle= frequency * 5 inst/cycle
Just for ease of computing say the freq is 2 Ghz then we could get 10 billion (1.0e10)instructions/sec.
A 40 bit counter can count 1,099,511,627,775 (or ~1.1e12) before wrapping.
So, at this example 2GHz frequency and 5 instructions/cycle, we could run about 100 seconds before wrapping.

Of course, some CPUs have a higher frequency but many codes probably don't sustain retiring 5 instructions/cycle.
So... depending on these factors, the wrap around may be longer.
Pat
petminuti
Beginner
266 Views
Isn't there a better answer I am more precise?

http://deportedirecto.es/categories.php?category=Mujer

Homepage
Business in Italy
Work from home
Homepage (5)
Patrick_F_Intel1
Employee
266 Views

In what way would you like the answer to be more precise?

Reply