Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

Evaluate the cost of instructions in a loop

karimfath
Beginner
553 Views
Hello,
I want to evaluate the cost of read/write operations on a dual-core system. When we ran VTune, it gave us this result:
for(i=0;it;
I don't understand what this means: the for instruction alone, without any read, takes 38.60%.
Thanks
srimks
New Contributor II
I think the complete statement inside the for loop is missing from your post. You can understand this kind of behaviour by referring to the VTune Getting Started guide. Do read the complete document; it is only 15 pages.

karimfath
Beginner
This is the code of the first loop in my program:

"0x9FC","63","","for(i=0;i
"0x9FC","63","","main.omp_fn.0+0x31: movl $00, -8(%ebp)",""
"0xA03","63","","main.omp_fn.0+0x38: movlr 08(%ebp), %eax","7.55%"
"0xA06","63",""," movlr (%eax), %eax","0.35%"
"0xA08","63",""," cmpl %eax, -8(%ebp)","1.39%"
"0xA0B","63",""," jnge main.omp_fn.0+0x89 (0x8048a54)","0.55%"
"0xA6B","63",""," addl $01, -8(%ebp)","31.63%"
"0xA6F","63",""," jmp main.omp_fn.0+0x38 (0x8048a03)","2.09%"

"","64","","{",""
"0xA54","65","","t=0;","16.68%"

"0xA54","65","","main.omp_fn.0+0x89: movlr -8(%ebp), %eax","7.20%"
"0xA57","65",""," shll $02, %eax",""
"0xA5A","65",""," movl %eax, %edx",""
"0xA5C","65",""," movlr 08(%ebp), %eax","0.10%"
"0xA5F","65",""," movlr 04(%eax), %eax","8.19%"
"0xA62","65",""," leal (%edx, %eax, 1), %eax","0.30%"
"0xA65","65",""," movl $00, (%eax)","0.89%"

"","66","","}",""
It is easy to see that the statement t=0; has a cost of 16.68%, and we can find the assembler code that corresponds to it.
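
As a quick check on the figures above, the shares VTune lists for the instructions under source line 65 add up to the 16.68% reported for the line itself. A minimal sketch in C of that arithmetic; the function and array names are made up for illustration, and the shares are stored as hundredths of a percent so that the sum stays exact:

```c
#include <assert.h>
#include <stddef.h>

/* Sum sample shares given in hundredths of a percent (720 means 7.20%). */
long sum_hundredths(const int *shares, size_t count)
{
    assert(shares != NULL);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += shares[i];
    return total;
}

/* Shares listed for the seven instructions under source line 65;
   rows with an empty share column are counted as 0. */
static const int line65_shares[] = { 720, 0, 0, 10, 819, 30, 89 };
```

Calling sum_hundredths(line65_shares, 7) gives 1668, that is 16.68%, so the per-line figure is simply the sum of its instruction rows.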

But in the second loop we found this:

"0x96B","89","","for(i=0;i
"0x96B","89","","main.omp_fn.1+0x2e: movl $00, -8(%ebp)",""
"0x972","89","","main.omp_fn.1+0x35: movlr 08(%ebp), %eax","7.65%"
"0x975","89",""," movlr (%eax), %eax","0.05%"
"0x977","89",""," cmpl %eax, -8(%ebp)","5.51%"
"0x97A","89",""," jnge main.omp_fn.1+0x86 (0x80489c3)","15.74%"
"0x9C3","89","","main.omp_fn.1+0x86: addl $01, -8(%ebp)","9.19%"
"0x9C7","89",""," jmp main.omp_fn.1+0x35 (0x8048972)","1.64%"

"","90","","{",""
"","91","","t;",""
"","92","","}",""

We compiled the program with the -O0 option, using both gcc and icc.
I need to understand this because I want to measure the cost of memory accesses, for both writes and reads.
Thanks for your help, and excuse my bad English.
TimP
Honored Contributor III
The only reliable timing information here is the time spent in the entire loop. The attribution of time events to instructions later than the one which may be most responsible is termed informally "skid." When you consider the out-of-order pipelined nature of the CPU, allowing several instructions to be running at a time, you can understand that VTune can't tell you the cost of an individual instruction. Sometimes it is estimating the time an instruction spends waiting for its operands to become available.
Measuring cost of memory access is a far more complicated task, requiring analysis of cache behavior etc.
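
One way to act on the point about whole-loop timing is to time each loop from outside instead of reading per-instruction percentages. The following is only a sketch, assuming a POSIX system with clock_gettime; the function names and the int array are hypothetical and not taken from the original program. The volatile qualifier is there so the accesses are still performed if the code is built with optimization enabled.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Current time in seconds on a monotonic clock (POSIX clock_gettime). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time one full pass that writes every element; returns seconds. */
double time_write_loop(volatile int *t, size_t n)
{
    assert(t != NULL && n > 0);
    double start = now_seconds();
    for (size_t i = 0; i < n; i++)
        t[i] = 0;
    return now_seconds() - start;
}

/* Time one full pass that reads every element; the sum is returned
   through sum_out so the reads have a visible result. */
double time_read_loop(const volatile int *t, size_t n, long *sum_out)
{
    assert(t != NULL && n > 0 && sum_out != NULL);
    long sum = 0;
    double start = now_seconds();
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    double elapsed = now_seconds() - start;
    *sum_out = sum;
    return elapsed;
}
```

Each function returns the elapsed time for the entire loop, which includes the loop overhead as well as the memory accesses. The numbers also depend on whether the array is already in cache, so repeating each measurement several times and varying the array size is advisable before drawing conclusions about memory cost.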
karimfath
Beginner
Hello tim18,
So can we consider the time of the second loop to be the time needed to read the whole amount of memory?
Thanks