concurrent_unordered_map: good hash functionsj for lists of 32-bit integers, character strings, etc
I am disappointed with the lack of documentation regarding the hashing for concurrent_unordered_map. In particular, as a drop in replacement for boost's concurrent_unordered_map, they have completely different hash behavior.
the concurrent_unordered_map uses a skip-order list. Essentially the data structure is a linked list of all your data, and the "buckets" are pointers into specific entries of this list. To be honest, even after reading this paper late last night, I'm still a bit confused as to the particular invariants and implementations, but, the biggest important difference, is that the paper seems to use a table size of size 2^k. Consider (by example), taking any key (unsigned int) and reducing it mod 2^k. For example, if k = 1, then only the least significant bit of the key matters for determining a bucket. i.e. the lowest k significant bits determine the bucket. So a good hash here should enforce randomness on those k bits.
I'll be posting more to this soon with more explanation as I have it. I hope that those particularly well informed, could post more on this matter.
One question I have, is, if I know in advance the total size of this hash table, how should I choose the buckets as to minimize space and search time?