64K Aliasing Conflict

rubendix1
Beginner
994 Views
Hello,

I have a problem. I don't understand the "64K Aliasing Conflict" described in the "Intel Pentium 4 and Intel Xeon Processor Optimization (Reference manual)". I have also looked in the "VTune Performance Analyzer Help". My problem is this: from the explanations they give, I cannot identify any conflict in the L1 cache. Could anybody help me? According to the VTune Performance Analyzer, I'm having serious problems with "64K Aliasing Conflicts" in my project.

Thanks in advance,

Ruben
8 Replies
bnshah
Beginner
Look for loads and stores that are 64K apart, with a debugger or using printf statements (if possible). Is the program multithreaded? If so, and two worker threads are running the same code at the same time, their stacks may be causing the 64K aliasing conflict. On Windows, use the _alloca function to offset the stack (but make sure it is still aligned on a cache line). 64K aliasing can be fixed by padding so that loads and stores are not a multiple of 64K apart. Hope this helps. If not, please follow up.
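The padding idea above can be sketched in C as follows. This is only an illustration under stated assumptions: 64-byte cache lines, two float arrays read in lockstep, and hypothetical names (buffer_pair, alloc_pair) that do not come from the manual or from VTune. Both arrays are carved out of one allocation, and the gap between them is bumped by one cache line whenever it would otherwise be a multiple of 64K.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LINE_BYTES 64u      /* assumed first-level cache line size */
#define ALIAS_SPAN 65536u   /* 64K */

/* Hypothetical container: block is what must be passed to free(). */
typedef struct {
    void  *block;
    float *a;
    float *b;
} buffer_pair;

/* Carve two n-element float arrays out of one allocation so that
   a[i] and b[i] are never a multiple of 64K apart.
   Returns 0 on success, -1 if malloc fails. */
static int alloc_pair(size_t n, buffer_pair *out)
{
    /* Round each array up to a whole number of cache lines. */
    size_t bytes = (n * sizeof(float) + LINE_BYTES - 1) / LINE_BYTES * LINE_BYTES;

    /* Distance from a to b; pad by one line if it is a multiple of 64K. */
    size_t gap = bytes;
    if (gap % ALIAS_SPAN == 0)
        gap += LINE_BYTES;

    /* One extra line leaves room to align the start by hand. */
    unsigned char *raw = malloc(gap + bytes + LINE_BYTES);
    if (raw == NULL)
        return -1;

    uintptr_t start = ((uintptr_t)raw + LINE_BYTES - 1) & ~(uintptr_t)(LINE_BYTES - 1);
    out->block = raw;
    out->a = (float *)start;
    out->b = (float *)(start + gap);
    return 0;
}
```

Because the gap is a whole number of cache lines but never a multiple of 64K, both arrays stay line-aligned while a[i] and b[i] differ somewhere in bits 15-6. Release the pair with free(pair.block).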
rubendix1
Beginner
Thank you Birju, but that wasn't my problem. My question was: "Why does the 64K aliasing conflict appear?" "Why does the Intel P4 architecture have a 64K aliasing conflict?"
I couldn't understand why the P4 L1 cache could have that conflict.
Now I know why it happens. Thanks, Birju.


rubendix1
dgutson
Beginner

Hi, my question is quite similar. VTune is telling me that I'm having some 64k aliasing conflicts, but I failed to detect them by printing the pointer values.

It's only one thread, one function.

This function works with a number of buffers; I printed out the starting positions of them (and knowing their sizes) but I couldn't find -analytically- where the 64k modulus distance occurs.

Supposing I have two buffers accessed in the same function:

buffer1: starts at [addr1], size [len1]

buffer2: starts at [addr2], size [len2]

Is there a formula or condition that determines whether I'll get 64K aliasing conflicts or not?
To be clear about what I'm asking for, I'll show you the 'prototype' of my question:
bool willIget64kConflict(const void* addr1, size_t len1, const void* addr2, size_t len2);
I don't expect anyone to come up with the function itself; I just need the idea, algorithm, or pseudocode. I seem to be doing something wrong, since I cannot find the conflict with the addresses I'm using.
Please let me know.
Thanks!
Daniel.
David_A_Intel1
Employee

The manual explains that if bits 15-6 of the two memory addresses being accessed are the same, then a 64K aliasing conflict will occur. By adding 64 to one of the addresses, you make the value in bits 15-6 different and thereby remove the conflict. That is, if you are traversing the buffers in lockstep, in order from beginning to end, offsetting the start of one of the buffers will remove the conflicts. Otherwise, you will experience a conflict on every single memory access.
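The bits 15-6 condition can be sketched in C roughly as follows. This is not code from the manual or from VTune. It assumes 64-byte lines and a comparison on bits 15-6 only, and the function names are made up for illustration. The first function tests a single pair of addresses. The second asks whether any cache line of one buffer shares bits 15-6 with any cache line of the other, treating the 1024 possible values of that field as positions on a ring.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte lines and a compare on bits 15:6 only. */
#define LINE_SHIFT 6u
#define LINE_SLOTS 1024u   /* 2^(16-6) distinct values of bits 15:6 */

/* True if the two addresses have identical bits 15:6. */
static bool same_bits_15_6(const void *a, const void *b)
{
    return (((uintptr_t)a ^ (uintptr_t)b) & (uintptr_t)0xFFC0) == 0;
}

/* True if some byte of buffer 1 and some byte of buffer 2 share
   bits 15:6, i.e. an aliasing pair of accesses is possible. */
static bool ranges_can_alias_64k(const void *addr1, size_t len1,
                                 const void *addr2, size_t len2)
{
    if (len1 == 0 || len2 == 0)
        return false;

    /* First cache line index and number of lines touched per buffer. */
    uintptr_t first1 = (uintptr_t)addr1 >> LINE_SHIFT;
    uintptr_t first2 = (uintptr_t)addr2 >> LINE_SHIFT;
    uintptr_t count1 = (((uintptr_t)addr1 + len1 - 1) >> LINE_SHIFT) - first1 + 1;
    uintptr_t count2 = (((uintptr_t)addr2 + len2 - 1) >> LINE_SHIFT) - first2 + 1;

    /* A buffer covering 1024 or more lines hits every value of bits 15:6. */
    if (count1 >= LINE_SLOTS || count2 >= LINE_SLOTS)
        return true;

    /* Otherwise each buffer is an arc on a ring of 1024 slots;
       two arcs overlap if either start falls inside the other arc. */
    uintptr_t s1 = first1 % LINE_SLOTS;
    uintptr_t s2 = first2 % LINE_SLOTS;
    return ((s2 - s1) % LINE_SLOTS) < count1 ||
           ((s1 - s2) % LINE_SLOTS) < count2;
}
```

A true result only says that an aliasing pair of addresses exists. Whether a conflict actually occurs depends on which accesses are in flight together, and buffers that physically overlap also return true. For two buffers walked in lockstep, checking the start addresses with same_bits_15_6 should be enough.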

Message Edited by dlanders on 04-27-2004 04:02 PM

dgutson
Beginner

Hi, thanks for your answer.

I thought the bits under consideration were 15-0 rather than 15-6, which is why I ANDed my pointer values with 0xFFFF. In fact, my check was (given ptr1 and ptr2):

bool conflict = ((unsigned int) ptr1 & 0xFFFF) == ((unsigned int) ptr2 & 0xFFFF);

and it turned out that NONE of the addresses accessed by the function satisfied this condition, which is why I could not find where the 64K aliasing conflict occurs.
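For comparison, here is a small C sketch of the two masks side by side. The helper names are hypothetical, and the 15-6 rule is taken from the reply above rather than verified independently. It shows how a test on all 16 low bits can miss a pair of addresses that still agree in bits 15-6.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte-exact test on bits 15:0 (the 0xFFFF mask). */
static bool low16_equal(uintptr_t a, uintptr_t b)
{
    return (a & 0xFFFF) == (b & 0xFFFF);
}

/* Line-granular test on bits 15:6 only (mask 0xFFC0),
   ignoring the 6 offset bits inside a 64-byte line. */
static bool bits_15_6_equal(uintptr_t a, uintptr_t b)
{
    return (a & 0xFFC0) == (b & 0xFFC0);
}
```

For example, the made-up addresses 0x00410040 and 0x00520050 fail the 0xFFFF test because their low 16 bits are 0x0040 and 0x0050, yet they pass the 0xFFC0 test because both reduce to 0x0040.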

Note that the function is written in ASM; the stack is used _only_ at the beginning to obtain one parameter, and ESP is never accessed again.

I would really appreciate your help.

Regards,

Daniel.

fern
Beginner
I've got a question here.
Since each way of the cache is only 2K, which is 32 sets, we only need 5 bits to address the sets.
Then why do all 16-5 bits need to agree in order to have 64K aliasing?
I guess the core of the question is: how many and which bits are used for set addressing?
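The arithmetic behind this question can be written out as a short C sketch. It assumes the figures given in this thread (2K per way, 64-byte lines), and the names are illustrative only. It shows that the set index needs just bits 10-6 while the alias comparison uses the wider field of bits 15-6. It does not explain why the hardware compares the wider field.

```c
#include <stdint.h>

enum {
    LINE_BYTES = 64,    /* bytes per cache line (bits 5:0 are the offset) */
    WAY_BYTES  = 2048   /* bytes per way, as stated in the post above */
};

/* Number of sets: 2048 / 64 = 32, so 5 index bits. */
static unsigned sets_per_way(void)
{
    return WAY_BYTES / LINE_BYTES;
}

/* Set index: the 5 bits directly above the line offset (bits 10:6). */
static unsigned set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % sets_per_way());
}

/* Field compared for 64K aliasing: the 10 bits 15:6. */
static unsigned alias_field(uintptr_t addr)
{
    return (unsigned)((addr >> 6) & 0x3FF);
}
```

Under these assumptions, two addresses 2K apart (such as 0x10000 and 0x10800) map to the same set but differ in the alias field, while two addresses 64K apart (such as 0x10000 and 0x20000) match in both.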
isn-removed200717

Why "if bits 15-6 of the two memory addresses being accessed are the same, then a 64K aliasing conflict will occur"? Can anybody give a detailed description?

Thanks

David_A_Intel1
Employee

Hi FERN and zhen_heng:

All I can do is point you to the reference manual: http://www.intel.com/design/pentium4/manuals/248966.htm

On page 105 (2-43), the 64K Aliasing Conflict is described, including which bits apply to which Pentium 4 and Xeon processor models.

The line size is 64 bytes, which explains why bits 0-5 are ignored. From page 2-41, "Note that first-level cache lines are 64 bytes. Thus the least significant 6 bits are not considered in alias comparisons."

Also, the definition of the data conflict may be enlightening, "Data conflict can only have one instance of the data in the first-level cache at a time. If a reference (load or store) occurs with its linear address matching a data conflict condition with another reference (load or store) which is under way, then the second reference cannot begin until the first one is kicked out of the cache. On Pentium 4 and Intel Xeon processors with CPUID signature of family encoding 15, model encoding of 0, 1 or 2, the data conflict condition applies to addresses having identical value in bits 15:6 (also referred to as 64K aliasing conflict)."

Now, why this condition causes a conflict is an architectural issue that I cannot explain. I believe what matters is that if you detect a high number of conflicts with the analyzer, you could improve your code's performance by mitigating the conflict.

Anyone else have any ideas/answers?

Message Edited by DaveA on 06-21-2004 03:09 PM
