Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

stride misalignment on purpose

BMart1
New Contributor II

Hi,

I remember reading a whitepaper, which I can no longer find, that argued against over-aligning, especially on Pentium 4. The image stride should be a multiple of 64 but not of 128, so that columns don't fight over the same cache lines; otherwise most of the cache goes unused. For example, if a 32KB cache is 4-way associative with 64-byte lines and every row of a column maps to the same set, only 4 x 64 = 256 bytes of it are used.
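The arithmetic above can be checked with a small model. This is only a sketch under simplifying assumptions: the buffer starts on a set boundary, the set index is taken straight from the address bits (no virtual/physical distinction), and only the first byte of each row is touched. The function names are made up for illustration and are not part of IPP or iw.

```python
def sets_touched(stride_bytes, rows, cache_bytes=32 * 1024, ways=4, line_bytes=64):
    """Count the distinct cache sets hit when reading one column of an image.

    Assumes (for illustration only) that row 0 starts at address 0 and that
    the set index is simply (address // line_bytes) % num_sets.
    """
    num_sets = cache_bytes // (ways * line_bytes)  # 128 sets for the example cache
    return len({(r * stride_bytes // line_bytes) % num_sets for r in range(rows)})


def column_footprint(stride_bytes, rows, cache_bytes=32 * 1024, ways=4, line_bytes=64):
    """Bytes of cache that can hold the column at once under the same model."""
    usable_lines = sets_touched(stride_bytes, rows, cache_bytes, ways, line_bytes) * ways
    return min(rows, usable_lines) * line_bytes


if __name__ == "__main__":
    for stride in (8192, 4096, 128, 64, 8192 + 64):
        print(stride, sets_touched(stride, 128), column_footprint(stride, 128))
```

In this model an 8192-byte stride lands every row in one set (256 usable bytes), a 128-byte stride reaches half of the sets, and any odd multiple of 64 reaches all of them.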

iw doesn't take this precaution. Maybe it should? Any pointers to read more on this and why Pentium 4 had it worse than other CPUs?
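To make the precaution concrete, here is a minimal sketch of how a stride could be padded to an odd multiple of the cache line size. The helper `padded_stride` is hypothetical; it is not an iw or IPP function, and the 64-byte line size is an assumption.

```python
def padded_stride(width_bytes, line_bytes=64):
    """Smallest stride >= width_bytes that is an odd multiple of line_bytes.

    Hypothetical helper for illustration; not part of iw or IPP.
    """
    lines = -(-width_bytes // line_bytes)  # ceiling division
    if lines % 2 == 0:
        lines += 1  # bump even line counts so the stride is not a multiple of 128
    return lines * line_bytes


if __name__ == "__main__":
    for width in (1000, 1920, 8192):
        print(width, padded_stride(width))
```

The cost is at most one extra cache line per row, and the rows stay 64-byte aligned.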

Bruno

Ying_H_Intel
Employee

Hi Bruno,

Do you mean cache bank conflicts? The architecture and the related software techniques are discussed in the Intel 64 and IA-32 Architectures Optimization Reference Manual:

https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

See section 3.6.8.

https://software.intel.com/en-us/forums/software-tuning-performance-optimization-platform-monitoring/topic/280663

Best Regards,

Ying
