Hi, all.
My CPU is a Core-architecture part (T7100). I found in the datasheet that there is an event, FP_COMP_OPS_EXE, for monitoring floating-point micro-ops.
I wrote a very simple benchmark, test.c, to test this counter:
int main(void)
{
float i;
i=i+0.01;
}
Then I compiled it with: gcc -o test.out test.c
Then I used perf to monitor it. The command is (0010 is Umask|Event_number): perf stat -e r0010 ./test.out &
And I got this result:
Performance counter stats for './test.out':
1,398 raw 0x10
0.001437684 seconds time elapsed
My question is how to interpret the number 1,398. Strictly speaking, my code contains only one FADD operation. Does that mean the FADD is translated into 1,398 micro-ops, or do I misunderstand the meaning of micro-ops?
Thank you.
Hello Chenjie,
The monitoring utility 'perf' doesn't start monitoring at your main(). It starts monitoring before your program is loaded. So it counts (probably) some uops in perf, some uops due to loading your program, some uops due to initializing everything for your program and then, after all that, the instructions in your program. And then the uops for cleaning up after your program and returning to perf. Since your program includes floating point, Linux may (I'm 99.999% sure) also set up the extended save/restore area that holds the SSE2 state across context switches.
Lastly, you need to look at the disassembly of the binary to see what your program is actually doing. It may or may not be doing what you think... especially since you don't return any value or print anything out... the compiler may (as an optimization) just be executing a return. And since you are using an uninitialized variable 'i', you might be getting exceptions.
You could try inserting a loop to see if there is a base number of uops that always gets executed (say, when the loop count==0) and a number of uops that increases in proportion to the loop count. That would probably provide more insight.
Pat
Dear Patrick, thank you. I get your idea.