The BlockS structure in MemoryAllocator.cpp should not be padded on PPC64. I added _ARCH_PPC64 to the padding check, which got rid of the assert, but I believe a more complete solution would be for the architecture defines in TypeDefinitions.h to include a define for PPC64 as well.
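To illustrate the kind of change I mean, here is a minimal sketch, not the actual TBB source. The names SKETCH_ARCH_64, kSketchHeaderSize and SketchBlock are made up for this example, and the field layout is only a placeholder. The idea is that a single 64-bit define includes _ARCH_PPC64, so the padding is applied only where pointers are 4 bytes wide.

```cpp
#include <cstddef>

// Hypothetical stand-in for an architecture define such as the ones in
// TypeDefinitions.h. The macro name is an assumption for this sketch.
// __LP64__ is added as a generic catch-all where the compiler provides it.
#if defined(_WIN64) || defined(__x86_64__) || defined(__ia64__) || \
    defined(_ARCH_PPC64) || defined(__powerpc64__) || \
    defined(__aarch64__) || defined(__LP64__)
#define SKETCH_ARCH_64 1
#else
#define SKETCH_ARCH_64 0
#endif

// Assumed target size of the header for this sketch.
constexpr std::size_t kSketchHeaderSize = 64;

struct SketchBlock {
    // Placeholder for eight pointer-sized header fields:
    // 64 bytes with 8-byte pointers, 32 bytes with 4-byte pointers.
    void* fields[8];
#if !SKETCH_ARCH_64
    // Pad only on 32-bit targets so the header reaches the target size.
    char padding[kSketchHeaderSize - 8 * sizeof(void*)];
#endif
};

// Compile-time version of the check that previously fired at run time.
static_assert(sizeof(SketchBlock) == kSketchHeaderSize,
              "block header must be exactly kSketchHeaderSize bytes");
```

If a 64-bit target is missing from the list, the struct gets padded when it should not be, which is the situation I hit on PPC64.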
Thanks for such a great product.
/* verified by initMemoryManager() */
(Added) cache_aligned_allocator uses a value twice that of CACHE_LINE_SIZE (128 vs. 64), which might need to be reconciled or explained somehow (I haven't looked any further yet). It also seems strange to assume that all hardware agrees on the same value.
Yes, I am going to do padding in a similar way though not exactly the same.
The cache_aligned_allocator sets the constant to 128 for the sake of platforms where cache lines are 128 bytes (e.g. the Intel Itanium processor). Again, you are right that ideally it should be determined at runtime for each HW (and I hope it will be reworked this way); but as most current CPUs (well, at least Intel processors) have either 64 or 128 bytes in a cache line, the larger of the two was chosen for simplicity.
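For what a runtime query could look like, here is a rough sketch, not what the library does today. It assumes Linux with a C library that exposes _SC_LEVEL1_DCACHE_LINESIZE through sysconf, and it falls back to the conservative 128 elsewhere or when the query reports nothing useful. The names sketchCacheLineSize and kFallbackLineSize are invented for the example.

```cpp
#include <cstddef>
#if defined(__linux__)
#include <unistd.h>  // sysconf
#endif

// Conservative fallback: the larger of the two common line sizes.
constexpr std::size_t kFallbackLineSize = 128;

// Returns the L1 data cache line size if the OS reports one,
// otherwise the fallback value.
inline std::size_t sketchCacheLineSize() {
#if defined(__linux__) && defined(_SC_LEVEL1_DCACHE_LINESIZE)
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (n > 0) {
        return static_cast<std::size_t>(n);
    }
#endif
    // Unknown platform, or the query returned 0 or -1.
    return kFallbackLineSize;
}
```

The catch is that a runtime value cannot be used for compile-time layout (array sizes, alignment specifiers), which is one reason a fixed constant is simpler.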
In the scalable allocator, it was considered less important; but my current view is that good cache behavior matters more than smaller memory overhead, so I will rework it soon. The block header will be 128 bytes; the fields changed by "foreign" threads will reside in the second half, while the fields changed only by the owning thread will reside in the first half. I expect this will improve performance on the hot path for systems with 64 bytes per cache line, and on the Itanium processor it will at least eliminate the situation where the block header shares its cache line with data.
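Roughly, the planned layout could look like the sketch below. This is only an illustration of the idea, not the final code: the type names and the individual fields are placeholders, and in the real allocator the field touched by foreign threads would need atomic access.

```cpp
#include <cstddef>
#include <cstdint>

// Fields written only by the thread that owns the block (placeholders).
struct SketchOwnerFields {
    void* next;
    void* previous;
    void* freeList;
    void* bumpPtr;
    std::uint32_t objectSize;
    std::uint32_t allocatedCount;
};

// Fields written by other ("foreign") threads, e.g. when they free
// an object that belongs to this block (placeholder).
struct SketchForeignFields {
    void* publicFreeList;
};

// 128-byte header: owner fields in the first 64 bytes,
// foreign fields in the second 64 bytes.
struct alignas(128) SketchBlockHeader {
    alignas(64) SketchOwnerFields owner;
    alignas(64) SketchForeignFields foreign;
};

static_assert(sizeof(SketchBlockHeader) == 128,
              "header must occupy exactly 128 bytes");
static_assert(offsetof(SketchBlockHeader, foreign) == 64,
              "foreign fields must start in the second half");
```

With 64-byte lines the two halves land on separate cache lines, so foreign frees do not invalidate the line the owner uses on its hot path. With 128-byte lines the whole header fills one line by itself, so it no longer shares a line with user data.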