In my experience, the standard Linux memory allocators dramatically affect the scalability of parallel programs. This could become a barrier on the way to scalable threaded applications for future many-core CPUs. I believe using scalable allocators is a must for most applications, but it is pretty hard to introduce them explicitly into a large amount of legacy code. Have you thought about standardizing scalable memory allocators and driving OS developers to make such allocators part of the operating system?
I used TBB to parallelize my existing serial application. While I was really delighted by the simplicity of implementing parallel patterns with TBB, I faced the problem of replacing the standard memory allocators with scalable ones in a large amount of existing code (which I call "legacy") to enable good speedups. The simplest and most efficient solution for me was to replace the standard new/delete operators, and I used LD_PRELOAD to enable the replacement in all my dynamic libraries. But I believe this is a "hack".
That is why I am thinking about OS-level support for scalable allocators, which would probably free the programmer from having to care about them at all. I mean that if multithreading is our future, it would be natural to have scalable OS functions. Does it make sense?
Well, there are plans afoot, if you have the patience to wait. The C++0x proposal has a section on multi-tasking memory allocation to support the native threading facilities planned for that environment. This seems like the most natural way to migrate to OS-level support, and as a standard it has the promise of driving similar improvements into proprietary OSes as well.