Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.
2466 Discussions

How to use hash code to add node into concurrent_hash_map directly? (to improve concurrent_hash_map's performace which use a string key)

ahfuqq_com
Beginner
206 Views
Hi, Greate TBB Engineers,
when I use concurrent_hash_map with a string or char[] key, in some condition, I wish there are some new interface in concurrent_hash_map that I can use pre-computed hash code of string to improve performance.
it is known that string will use hasher function to convert ascii to a 32 bits hash code. If there are greate number of strings, every time when I insert or find a string, I have to run such a hasher.
I have a idea that if I pre-compute hash code for every string first, and when I need to insert or find, I use hash code and key as input parameters of concurrent_hash_map.
eg: hash.insert(my_accessor, "my_string", hasher("my_string"));
//insert will use hash code to insert this node, if two different strings have same hash code,
// insert will use chain to link two nodes.

Am I find a right way to improve performace? If I wrong, please tell me the left way. If that idea is possible, could u tell me how to change code?

Thanks!
And wish god forgive I, me can't speak English.

From a China guy Ahfu.
0 Kudos
1 Reply
Anton_M_Intel
Employee
206 Views
Quoting - ahfuqq.com
Hi, Greate TBB Engineers,
when I use concurrent_hash_map with a string or char[] key, in some condition, I wish there are some new interface in concurrent_hash_map that I can use pre-computed hash code of string to improve performance.
it is known that string will use hasher function to convert ascii to a 32 bits hash code. If there are greate number of strings, every time when I insert or find a string, I have to run such a hasher.
I have a idea that if I pre-compute hash code for every string first, and when I need to insert or find, I use hash code and key as input parameters of concurrent_hash_map.
eg: hash.insert(my_accessor, "my_string", hasher("my_string"));
//insert will use hash code to insert this node, if two different strings have same hash code,
// insert will use chain to link two nodes.

Am I find a right way to improve performace? If I wrong, please tell me the left way. If that idea is possible, could u tell me how to change code?

Thanks!
And wish god forgive I, me can't speak English.

From a China guy Ahfu.

Hi,

Yes, it's right way to optimize your hash table. But you can actually do it without any additions to the interface. Just define a proxy class that stores both keyand hash values.
E.g. (i didn't check the code):

class cached_hash_key {
public:
const key_type key;
const size_t hash;
cached_hash_key( key_type k )
: key(k), hash(tbb::tbb_hash_compare::hash(k))
{}
};
class cached_hash_compare {
public:
bool equal( const cached_hash_key& j, const cached_hash_key& k ) const {
return j.hash==k.hash && j.key==k.key;
}
size_t hash( const cached_hash_key& k ) const {
return k.hash;
}
};

tbb::concurrent_hash_map my_table;
0 Kudos
Reply