[bash]void Fun()
{
    float *x, *a, *b;
    ...
    for(i = 0; i < n; ++i)
        x[i] = a[i] + b[i];
    ...
}
void main()
{
    Fun();
    ...
}[/bash]
The code generated for the loop in Fun() is great:
[bash]00401663 movaps xmm0,xmmword ptr [esi+eax*4]
00401667 movaps xmm1,xmmword ptr [esi+eax*4+10h]
0040166C addps xmm0,xmmword ptr [ecx+eax*4]
00401670 addps xmm1,xmmword ptr [ecx+eax*4+10h]
00401675 movaps xmmword ptr [edi+eax*4],xmm0
00401679 movaps xmmword ptr [edi+eax*4+10h],xmm1
0040167E add eax,8
00401681 cmp eax,dword ptr [ebp-78h]
00401684 jb TestPerf1+2E3h (401663h)
[/bash]
Now, if we add another function in main() before calling Fun():
[bash]void main()
{
    Prepare();
    Fun();
    ...
}
[/bash]
where Prepare() is very complex and Fun() is NOT changed, then the code for the same loop in Fun() is now:
[bash]00412BD0 mov edi,dword ptr [ebp-40h]
00412BD3 movaps xmm0,xmmword ptr [edi+eax*4]
00412BD7 movaps xmm1,xmmword ptr [edi+eax*4+10h]
00412BDC mov edi,dword ptr [ebp-34h]
00412BDF addps xmm0,xmmword ptr [edi+eax*4]
00412BE3 addps xmm1,xmmword ptr [edi+eax*4+10h]
00412BE8 mov edi,dword ptr [ebp-4Ch]
00412BEB movaps xmmword ptr [edi+eax*4],xmm0
00412BEF mov edi,dword ptr [ebp-4Ch]
00412BF2 movaps xmmword ptr [edi+eax*4+10h],xmm1
00412BF7 add eax,8
00412BFA cmp eax,edx
00412BFC jb TestPerf1+2D0h (412BD0h)[/bash]
[bash]00412BE8 mov edi,dword ptr [ebp-4Ch]
00412BEB movaps xmmword ptr [edi+eax*4],xmm0
00412BEF mov edi,dword ptr [ebp-4Ch] <******* edi reloaded
00412BF2 movaps xmmword ptr [edi+eax*4+10h],xmm1
[/bash]
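For anyone who wants to reproduce the comparison, here is a minimal sketch of the loop that can be compiled on its own and inspected in the disassembly. It assumes the loop body is `x[i] = a[i] + b[i]`, which is what the loads, adds, and stores in the listings suggest. The names `add_arrays` and `check_add_arrays` and the `restrict` qualifiers are hypothetical additions, not part of the original code. Whether `restrict` makes the reloads of `edi` go away depends on the compiler and on why it keeps the pointers on the stack, so treat it as something to try rather than a confirmed fix.

```c
#include <stddef.h>

/* Hypothetical stand-in for the loop in Fun(): x[i] = a[i] + b[i].
 * The restrict qualifiers are an assumption added for this sketch; they
 * promise the compiler that the three arrays do not overlap. */
void add_arrays(float *restrict x, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = a[i] + b[i];
}

/* Hypothetical driver: fills two small arrays, runs the loop, and returns
 * the number of elements whose sum is wrong (0 means every sum matched). */
int check_add_arrays(void)
{
    enum { N = 64 };
    float a[N], b[N], x[N];
    int errors = 0;

    for (size_t i = 0; i < N; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }

    add_arrays(x, a, b, N);

    /* All values involved are small integers, so the float sums are exact. */
    for (size_t i = 0; i < N; ++i)
        if (x[i] != 3.0f * (float)i)
            ++errors;

    return errors;
}
```

Calling `check_add_arrays()` from `main()` once with and once without a preceding call to a large function, then comparing the two disassemblies of the loop, mirrors the experiment described in the question.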