My app has some areas that show a very high performance impact from 64K aliasing in VTune. After studying the problem, I put together some test applications to try to understand why I would be seeing 64K aliasing. The code below shows a 64K aliasing performance impact of about 80, which is 40x the recommended upper limit. However, it works only within two 2 KB L1 cache regions that are separated by exactly 8 MB + 32 KB, so there should be no possibility of any 64K aliasing events at all. Can you explain why this is generating so many events?
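#include <windows.h>   // VirtualAlloc, VirtualFree, timeGetTime (link with winmm.lib)
#include <stdio.h>

void Test1()
{
    // One allocation: source at the start, destination 8 MB + 32 KB further on.
    char *a = (char *)VirtualAlloc( 0, 16 * 1024 * 1024 + 0x8000, MEM_COMMIT, PAGE_READWRITE );
    DWORD dwStart = timeGetTime();
    float *src = (float *)a;
    float *dest = (float *)(a + 8 * 1024 * 1024 + 0x8000);
    __asm
    {
        mov esi, src
        mov edi, dest
        mov edx, 819200         ; outer loop count
        add esi, 0x800          ; point at the end of each 2 KB region
        add edi, 0x800
    outer:
        mov ecx, 0x800
        neg ecx                 ; ecx runs from -0x800 up to 0
    inner:
        movaps xmm0, xmmword ptr [esi+ecx]   ; load 16 bytes from src
        movaps xmmword ptr [edi+ecx], xmm0   ; store 16 bytes to dest
        add ecx, 0x10
        jnz inner
        dec edx
        jnz outer
    }
    printf( "Elapsed time = %lu ms\n", timeGetTime() - dwStart );
    VirtualFree( a, 0, MEM_RELEASE );   // size must be 0 with MEM_RELEASE
}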
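To make the address arithmetic behind the question explicit, here is a minimal sketch in plain C. It assumes the usual description of 64K aliasing, namely two different addresses whose low 16 bits match (a distance that is a multiple of 64 KB). The helper names `offset_mod_64k`, `aliases_64k`, and `count_aliasing_pairs` are made up for illustration and are not part of VTune or any Intel API. The sketch only checks the arithmetic; it does not model what the hardware event counter actually measures.

```c
#include <stdint.h>

/* Assumption: 64K aliasing means two distinct addresses share their low 16 bits. */
#define ALIAS_MASK 0xFFFFu

/* Low 16 bits of a byte offset, i.e. the offset modulo 64 KB. */
static uint32_t offset_mod_64k(uint32_t byte_offset)
{
    return byte_offset & ALIAS_MASK;
}

/* Nonzero if two different addresses are a multiple of 64 KB apart. */
static int aliases_64k(uint32_t addr_a, uint32_t addr_b)
{
    return addr_a != addr_b && ((addr_a ^ addr_b) & ALIAS_MASK) == 0;
}

/*
 * Count (load, store) pairs that alias when loads cover [0, span) and
 * stores cover [distance, distance + span), both in 16-byte steps,
 * mirroring the movaps loop in Test1.
 */
static unsigned count_aliasing_pairs(uint32_t distance, uint32_t span)
{
    unsigned count = 0;
    for (uint32_t i = 0; i < span; i += 16) {
        for (uint32_t j = 0; j < span; j += 16) {
            if (aliases_64k(i, distance + j)) {
                ++count;
            }
        }
    }
    return count;
}
```

With the values from Test1, `offset_mod_64k(8 * 1024 * 1024 + 0x8000)` is `0x8000`, and `count_aliasing_pairs(8 * 1024 * 1024 + 0x8000, 0x800)` is 0, so under this definition no load/store pair in the loop is a multiple of 64 KB apart. For comparison, a distance of exactly 8 MB would make every load alias with the store at the same loop offset.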
Message Edited by dneufeld on 10-09-2004 06:29 PM