How to use hash code to add node into concurrent_hash_map directly? (to improve concurrent_hash_map's performace which use a string key)

ahfuqq_com · ‎05-18-2009

Hi, Greate TBB Engineers,
when I use concurrent_hash_map with a string or char[] key, in some condition, I wish there are some new interface in concurrent_hash_map that I can use pre-computed hash code of string to improve performance.
it is known that string will use hasher function to convert ascii to a 32 bits hash code. If there are greate number of strings, every time when I insert or find a string, I have to run such a hasher.
I have a idea that if I pre-compute hash code for every string first, and when I need to insert or find, I use hash code and key as input parameters of concurrent_hash_map.
eg: hash.insert(my_accessor, "my_string", hasher("my_string"));
//insert will use hash code to insert this node, if two different strings have same hash code,
// insert will use chain to link two nodes.

Am I find a right way to improve performace? If I wrong, please tell me the left way. If that idea is possible, could u tell me how to change code?

Thanks!
And wish god forgive I, me can't speak English.

From a China guy Ahfu.

Anton_M_Intel · ‎05-18-2009

Quoting - ahfuqq.com

Hi, Greate TBB Engineers,
when I use concurrent_hash_map with a string or char[] key, in some condition, I wish there are some new interface in concurrent_hash_map that I can use pre-computed hash code of string to improve performance.
it is known that string will use hasher function to convert ascii to a 32 bits hash code. If there are greate number of strings, every time when I insert or find a string, I have to run such a hasher.
I have a idea that if I pre-compute hash code for every string first, and when I need to insert or find, I use hash code and key as input parameters of concurrent_hash_map.
eg: hash.insert(my_accessor, "my_string", hasher("my_string"));
//insert will use hash code to insert this node, if two different strings have same hash code,
// insert will use chain to link two nodes.

Am I find a right way to improve performace? If I wrong, please tell me the left way. If that idea is possible, could u tell me how to change code?

Thanks!
And wish god forgive I, me can't speak English.

From a China guy Ahfu.

Hi,

Yes, it's right way to optimize your hash table. But you can actually do it without any additions to the interface. Just define a proxy class that stores both keyand hash values.
E.g. (i didn't check the code):

class cached_hash_key {
public:
const key_type key;
const size_t hash;
cached_hash_key( key_type k )
: key(k), hash(tbb::tbb_hash_compare::hash(k))
{}
};
class cached_hash_compare {
public:
bool equal( const cached_hash_key& j, const cached_hash_key& k ) const {
return j.hash==k.hash && j.key==k.key;
}
size_t hash( const cached_hash_key& k ) const {
return k.hash;
}
};

tbb::concurrent_hash_map my_table;