Re: evaluate cost of instructions in loop

karimfath · ‎03-19-2009

Hello,
I want to evaluate the cost of read/write operations on a dual-core system. When we used VTune, it gave us this result:
for(i=0;it;
I don't understand what this means: the for statement alone, without any read, takes 38.60%.
Thanks.

srimks · ‎03-19-2009

I think the complete statement within the for loop is missing. You can understand similar behaviour by referring to the VTune Getting Started guide. Do read the complete document; it is only 15 pages.

~

karimfath · ‎04-24-2009

This is the code of the first loop in my program:

"0x9FC","63","","for(i=0;i
"0x9FC","63","","main.omp_fn.0+0x31: movl $00, -8(%ebp)",""
"0xA03","63","","main.omp_fn.0+0x38: movlr 08(%ebp), %eax","7.55%"
"0xA06","63",""," movlr (%eax), %eax","0.35%"
"0xA08","63",""," cmpl %eax, -8(%ebp)","1.39%"
"0xA0B","63",""," jnge main.omp_fn.0+0x89 (0x8048a54)","0.55%"
"0xA6B","63",""," addl $01, -8(%ebp)","31.63%"
"0xA6F","63",""," jmp main.omp_fn.0+0x38 (0x8048a03)","2.09%"
"","64","","{",""
"0xA54","65","","t=0;","16.68%"
"0xA54","65","","main.omp_fn.0+0x89: movlr -8(%ebp), %eax","7.20%"
"0xA57","65",""," shll $02, %eax",""
"0xA5A","65",""," movl %eax, %edx",""
"0xA5C","65",""," movlr 08(%ebp), %eax","0.10%"
"0xA5F","65",""," movlr 04(%eax), %eax","8.19%"
"0xA62","65",""," leal (%edx, %eax, 1), %eax","0.30%"
"0xA65","65",""," movl $00, (%eax)","0.89%"
"","66","","}",""
It is easy to see that the statement t=0; has a cost of 16.68%, and we can find the equivalent assembly code for this statement.
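As a check on the listing above, the percentage shown on a source line appears to be the sum of the percentages of the instructions listed under it. A minimal sketch in C that redoes this arithmetic; `sum_percent` and the two array names are made up for illustration, and blank cells in the listing are assumed to mean 0:

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-instruction sample percentages listed under one source line. */
double sum_percent(const double *pct, size_t n)
{
    assert(pct != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += pct[i];
    return total;
}

/* Values copied from the first listing; blank cells entered as 0 (assumption). */
static const double line65_pct[] = { 7.20, 0.0, 0.0, 0.10, 8.19, 0.30, 0.89 };
static const double line63_pct[] = { 0.0, 7.55, 0.35, 1.39, 0.55, 31.63, 2.09 };
```

Calling `sum_percent(line65_pct, 7)` gives 16.68 (up to floating-point rounding), which matches the figure on source line 65. The same sum over the instructions of line 63 gives 43.56, so in this listing the loop-control instructions collect more samples than the store itself.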

But in the second loop we found this:

"0x96B","89","","for(i=0;i
"0x96B","89","","main.omp_fn.1+0x2e: movl $00, -8(%ebp)",""
"0x972","89","","main.omp_fn.1+0x35: movlr 08(%ebp), %eax","7.65%"
"0x975","89",""," movlr (%eax), %eax","0.05%"
"0x977","89",""," cmpl %eax, -8(%ebp)","5.51%"
"0x97A","89",""," jnge main.omp_fn.1+0x86 (0x80489c3)","15.74%"
"0x9C3","89","","main.omp_fn.1+0x86: addl $01, -8(%ebp)","9.19%"
"0x9C7","89",""," jmp main.omp_fn.1+0x35 (0x8048972)","1.64%"
"","90","","{",""
"","91","","t;",""
"","92","","}",""
We compiled the program with both gcc and icc, using the -O0 option.
I need to understand this because I want to measure the cost of memory accesses for writing and reading.
Thanks for the help, and excuse my bad English.

TimP · ‎04-24-2009

The only reliable timing information here is the time spent in the entire loop. The attribution of timed events to an instruction later than the one most likely responsible is informally termed "skid." Given the out-of-order, pipelined nature of the CPU, which keeps several instructions in flight at a time, VTune can't tell you the cost of an individual instruction. Sometimes what it reports is an estimate of the time an instruction spends waiting for its operands to become available.
Measuring the cost of memory access is a far more complicated task, requiring analysis of cache behavior, etc.
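To illustrate timing the entire loop rather than individual instructions, here is a minimal sketch in C. It is not the original program: the function names are hypothetical, it is single-threaded (the loops in the listings run inside OpenMP regions), and it assumes a POSIX system that provides `clock_gettime`. The `volatile` qualifier is used so that the compiler has to emit every load and store.

```c
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Seconds from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time one pass that stores 0 to every element of t. */
double time_write_loop(volatile int *t, size_t n)
{
    assert(t != NULL);
    double start = now_seconds();
    for (size_t i = 0; i < n; ++i)
        t[i] = 0;
    return now_seconds() - start;
}

/* Time one pass that loads every element of t.
 * The sum is handed back so the loads have a visible use. */
double time_read_loop(const volatile int *t, size_t n, long *sum_out)
{
    assert(t != NULL);
    long sum = 0;
    double start = now_seconds();
    for (size_t i = 0; i < n; ++i)
        sum += t[i];
    double elapsed = now_seconds() - start;
    if (sum_out != NULL)
        *sum_out = sum;
    return elapsed;
}
```

Each function returns the elapsed time of the whole loop, loop overhead included, so the result is a per-loop figure and not a per-access cost. A single pass over a small array is noisy and mostly measures cache hits; repeating the pass and varying the array size is one way to start seeing cache effects. Note also that the second listing shows no instructions under source line 91, which suggests the compiler emitted no load for that statement, so that loop may be measuring loop overhead only.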

karimfath · ‎04-24-2009

Hello tim18,
So can we consider the time of the second loop to be the time needed to read the whole amount of memory?
Thanks.