Scalable hash map algorithm

Dmitry_Vyukov · ‎09-11-2008

I think this can interesting for this forum too.
I've posted implementation of highly scalable concurrent hash map on TBB forum:
http://software.intel.com/en-us/forums//topic/60494
(there is a C++ source attached)

Implementation uses a bunch of advanced synchronization techniques, like timestamp validation, deferred memory reclamation and cache conscious data layout. As a result this implementation beats TBB's concurrent_hash_map by a factor of 50 on read-mostly workload on Intel Core 2 Quad (Q6600), and by a factor of 20 on modest write workload. The cost of read transaction (find operation) is about 30 cycles, which is basically equal to that of single-threaded hash map. Algorithm also have perfect linear scalability on read-mostly workload so on greater number of cores/processors it will have even higher performance difference with traditional lock-based synchronization (on 8 cores I expect >100x).

Such synchronization techniques, I believe, is the way to deal with future many-core systems.