According to the TBB Tutorial.pdf, the TBB memory allocator is designed for parallel programming and provides scalable_allocator and cache_aligned_allocator. I use the TBB memory allocator through the automatic replacement method (setting the LD_PRELOAD environment variable). What surprises me is that it speeds up my serial program by 80%. BTW, my serial program consumes about 2.6 GB of memory.
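To make the effect easier to reproduce outside my program, here is a minimal sketch of an allocation-heavy serial loop. It is not my real code: the function names `churn_small_blocks` and `time_churn_ms` are made up for illustration, and it only assumes that the speedup comes from many small allocations and frees, which I have not verified. It uses the standard library only, so the same binary can be timed with and without the preloaded allocator.

```cpp
#include <chrono>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical helper: repeatedly builds and destroys a list of strings.
// Each element costs one list node plus (typically) one string buffer,
// so the run time is dominated by the memory allocator.
// Returns the number of elements created so the work is observable.
std::size_t churn_small_blocks(std::size_t rounds, std::size_t nodes_per_round) {
    std::size_t total = 0;
    for (std::size_t r = 0; r < rounds; ++r) {
        std::list<std::string> items;
        for (std::size_t i = 0; i < nodes_per_round; ++i) {
            // 40 characters is longer than the usual small-string buffer,
            // so the string is expected to allocate on the heap as well.
            items.emplace_back(40, 'x');
        }
        total += items.size();
    }  // the list and all its nodes are freed at the end of each round
    return total;
}

// Hypothetical helper: wall-clock time of one churn run, in milliseconds.
double time_churn_ms(std::size_t rounds, std::size_t nodes_per_round) {
    auto start = std::chrono::steady_clock::now();
    volatile std::size_t sink = churn_small_blocks(rounds, nodes_per_round);
    (void)sink;  // keep the result alive so the loop is not optimized away
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_churn_ms` from `main` and printing the result, then running the program once normally and once with LD_PRELOAD pointing at the TBB proxy library (usually named something like libtbbmalloc_proxy.so.2, but the exact name and path depend on the installation), should show whether the allocator alone accounts for the difference.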
So can anyone tell me what the TBB memory allocator does to allocate memory so efficiently compared with the STL memory allocator? Thanks.