Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Making my own hash array.

kio_marv
Beginner
Hi,
I'm trying to use tbb::concurrent_hash_map to implement a wrapper class in order to use it in my programs. So far, I have done this:

[cpp]#include "tbb/concurrent_hash_map.h" #include template struct HashCompare { static size_t hash( const K& key ) { return tbb::tbb_hasher(key); } static size_t hash( K* key ) { return reinterpret_cast( key/sizeof(key) ); } static bool equal( const K& key1, const K& key2 ) { return ( key1 == key2 ); } }; template class HashArray { private: static unsigned int initial_hash_size; tbb::concurrent_hash_map > h; const HashArray& operator=( const HashArray& S ); public: HashArray(): h(initial_hash_size) {} HashArray( const HashArray& S ) : h( S.h ) {} ~HashArray() {} const E& operator[]( const K& key ) const{ typename tbb::concurrent_hash_map >::const_accessor a; if ( !h.find(a,key) ) { h.insert(a,key); } return a->second; } E& operator[](const K& key){ typename tbb::concurrent_hash_map >::accessor a; if ( !h.find(a,key) ) { h.insert(a,key); } return a->second; } int IsDefined(const K key) const { if (h.count(key)!=0) return 1; else return 0; } void UnDefine(const K key) { typename tbb::concurrent_hash_map >::accessor a; if( h.find(a,key) ) { h.erase(a); } } void GatherStats(std::ostream& O) const { typename tbb::concurrent_hash_map >::const_iterator it( h.begin() ); while ( it != h.end() ) { O << ((*it).second).size(); ++it; } } void Clear(void) { h.clear(); } void Resize(void) { h.clear(); } }; template unsigned int HashArray::initial_hash_size=1000;[/cpp] I have tested this code in a simple program and it works well. However, I would like to hear what you think about it. What I fear is that it's bad code and it might fail in a large program.

Thanks!
3 Replies
kio_marv
Beginner
Hello? Anyone?
Anton_M_Intel
Employee
Hi,
Sorry, but this is a completely wrong approach. The bad code is marked in the comments below.
Quoting kio.marv
I'm trying to use tbb::concurrent_hash_map to implement a wrapper class in order to use it in my programs.
Usually it is a bad idea to wrap concurrent_hash_map as an STL-like serial container; it leads to fatal threading errors.
    static size_t hash( K* key )
Why a pointer? Shouldn't it be a reference instead?
      return reinterpret_cast<size_t>( key ) / sizeof( key );
Are you aware that the actual key here is the pointer, not the key value? E.g., two instances of the same value will be considered different keys.
    static bool equal( 
        const K& key1, 
        const K& key2 
    ) {
        return ( key1 == key2 );
    }
However, here, you are comparing the values of keys, not pointers.
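For what it's worth, a consistent comparator hashes and compares the same thing, i.e., the key value itself. A minimal sketch (ValueHashCompare is just an illustrative name, not a TBB type):
[cpp]// Illustrative sketch: hash and equal both operate on the key value,
// so two equal values always hash to the same bucket.
template<typename K>
struct ValueHashCompare {
    static size_t hash( const K& key ) {
        return tbb::tbb_hasher( key );
    }
    static bool equal( const K& key1, const K& key2 ) {
        return key1 == key2;
    }
};[/cpp]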
      const E& operator[](
          const K& key
      ) const{
        typename tbb::concurrent_hash_map<K, E, HashCompare<K> >::const_accessor a;
        if ( !h.find(a,key) ) {
            h.insert(a,key);
        } 
        return a->second;
      }
find() is excessive because insert() already does the same job: it looks the key up and creates the element only if it is missing. And the method is not thread-safe: you return a reference that is no longer protected by an accessor, while another thread may call erase() on the same element.
      E& operator[](const K& key){
Concurrent modifications are also not protected.
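The safe pattern is to do any read or write while the accessor is still held, for example (SetValue is only an illustrative free function, not part of TBB or of your class):
[cpp]// Illustrative sketch: the accessor keeps the element locked while we use it.
template<typename K, typename E>
void SetValue( tbb::concurrent_hash_map<K, E, HashCompare<K> >& h,
               const K& key, const E& value )
{
    typename tbb::concurrent_hash_map<K, E, HashCompare<K> >::accessor a;
    h.insert( a, key );   // finds the element or creates it, and locks it
    a->second = value;    // modify while the lock is held
}                         // the accessor releases the element here[/cpp]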
      void UnDefine(const K key) {
        typename tbb::concurrent_hash_map<K, E, HashCompare<K> >::accessor a;
        if( h.find(a,key) ) {
          h.erase(a);
        }
      }
Well, this is the right way to prevent other threads from working with a concurrently erased instance. But I guess that was not the intention... because without correctly used accessors elsewhere, find() is excessive here as well.
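If accessors are not held across the other operations anyway, the whole method can be reduced to a single call; concurrent_hash_map::erase(key) returns whether an element was actually removed:
[cpp]// Illustrative sketch: erase by key in one call.
void UnDefine( const K key ) {
    h.erase( key );   // returns true if an element was removed
}[/cpp]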
I have tested this code in a simple program and it works well. However, I would like to hear what you think about it. What I fear is that it's bad code and it might fail in a large program.
You are right.
SergeyKostrov
Valued Contributor II
Hi everybody,

Quoting kio.marv
...I have tested this code in a simple program and it works well. However, I would like to hear what you think about it. What I fear is that it's bad code and it might fail in a large program...

Here are a couple of questions:

- Did you try to evaluate the performance of your code and compare it with the performance of the hash classes of the STL or Boost libraries?

- Did you do stress testing with data sets as large as possible?

- Could you submit a Test-Case? A rough sketch of what such a test could look like follows below.
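The sketch assumes the HashArray class from the original post and simply hammers a small set of keys from many threads; N, the key range, and the value types are arbitrary choices:
[cpp]#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

int main() {
    HashArray<int, int> ha;             // wrapper class from the original post
    const int N = 1000000;
    tbb::parallel_for( tbb::blocked_range<int>( 0, N ),
        [&ha]( const tbb::blocked_range<int>& r ) {
            for ( int i = r.begin(); i != r.end(); ++i )
                ha[i % 1000] = i;       // many threads hit the same keys
        } );
    return 0;
}[/cpp]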

Best regards,
Sergey