Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

How to use parallel_reduce for vector

FlorentD
Beginner

Hi,

I need to analyze N nodes that will generate M new nodes using parallel_reduce:

struct MyReduction {
    MyClassPtr* _this = nullptr;
    const std::vector<Node*>& bestNodes;
    std::atomic<size_t> threadId{0};

    std::vector<Node*> value;

    MyReduction(MyClassPtr* _t, const std::vector<Node*>& n) : _this(_t), bestNodes(n) {}
    MyReduction(MyReduction& s, tbb::split) : _this(s._this), bestNodes(s.bestNodes) {
        threadId = 0;
        value = std::vector<Node*>();
    }

    void operator()( const tbb::blocked_range<size_t>& r ) {
        const auto lThreadId = threadId++; // atomic

        auto tmp = value;
        const auto tab = _this->_analyseNodes(bestNodes, r.begin(), r.end(), lThreadId);
        tmp.insert(tmp.end(), tab.cbegin(), tab.cend());

        value = tmp;
    }

    void join(MyReduction& rhs) {
        value.insert(value.end(), rhs.value.begin(), rhs.value.end());
    }
};

And the call:

MyReduction result(this, bestNodes);
tbb::parallel_reduce(tbb::blocked_range<size_t>(0, bestNodes.size()), result);

The result after the reduction, i.e. the variable "result.value", is not correct. I suspect a data race inside operator(), because the vector never comes out with the same size twice.

If I add a mutex inside the operator() function, it works:

void operator()( const tbb::blocked_range<size_t>& r ) {
    const auto lThreadId = threadId++;

    static std::mutex mutex;
    mutex.lock();
    auto tmp = value;
    const auto tab = _this->_analyseNodes(bestNodes, r.begin(), r.end(), lThreadId);
    tmp.insert(tmp.end(), tab.cbegin(), tab.cend());
    value = tmp;

    mutex.unlock();
}

What is wrong with my reduction?

Note that if I use the lambda version of parallel_reduce, it works perfectly, but I want to use the imperative form to avoid temporaries and copies.
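For reference, the body contract that the imperative form relies on (a fresh, empty accumulator on split; merge on join) can be exercised with the standard library alone. This is only a sketch: Range and Split below are hypothetical stand-ins for tbb::blocked_range&lt;size_t&gt; and tbb::split, and doubling each element stands in for the _analyseNodes call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-ins for tbb::blocked_range<size_t> and tbb::split so the
// imperative-form body contract can be exercised without TBB itself
// (these names are placeholders, not the real TBB types).
struct Split {};
struct Range {
    std::size_t lo, hi;
    std::size_t begin() const { return lo; }
    std::size_t end() const { return hi; }
};

// A race-free body in the imperative style: each body owns its own
// 'value'; operator() only appends to it, and join() merges the
// right-hand body's results back. No state is shared across bodies.
struct CollectBody {
    const std::vector<int>& input;
    std::vector<int> value;  // per-body accumulator

    explicit CollectBody(const std::vector<int>& in) : input(in) {}
    CollectBody(CollectBody& s, Split) : input(s.input) {}  // fresh, empty accumulator

    void operator()(const Range& r) {
        for (std::size_t i = r.begin(); i != r.end(); ++i)
            value.push_back(input[i] * 2);  // stand-in for _analyseNodes
    }
    void join(CollectBody& rhs) {
        value.insert(value.end(), rhs.value.begin(), rhs.value.end());
    }
};
```

Driving the body sequentially (split, run both halves, join) shows that the final 'value' is just the concatenation of the sub-range results, with no shared mutable state between bodies.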

Thanks
3 Replies
NoorjahanSk_Intel
Moderator

Hi,

Thanks for reaching out to us.

One way to remove data races from the code is to synchronize with mutexes, which are designed to simplify writing race-free code.

Please refer to the link below for more details:

https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Mutual_Exclusion.html
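As a minimal sketch of that advice, the snippet below uses std::mutex with an RAII std::lock_guard instead of manual lock()/unlock(), so the lock is released even if the analysis throws. append_chunk and the values it produces are hypothetical stand-ins for the _analyseNodes call, not code from the question's project.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Shared accumulator guarded by a mutex. std::lock_guard releases the
// lock automatically on scope exit, even if an exception is thrown.
std::mutex m;
std::vector<int> shared_value;

// Stand-in for one task: compute a private chunk first (outside the
// lock), then append it to the shared vector under the lock.
void append_chunk(int base) {
    std::vector<int> tab = {base, base + 1};  // hypothetical per-range results
    std::lock_guard<std::mutex> lock(m);
    shared_value.insert(shared_value.end(), tab.begin(), tab.end());
}
```

Keeping the computation outside the critical section and only appending under the lock keeps contention low; the final size is deterministic even though the chunk order is not.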

>>What is wrong with my reduction?

Could you please help us reproduce the issue by providing a complete reproducer and the expected results?

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel
Moderator

Hi,

We haven't heard back from you. Could you please provide an update on your issue?

Thanks & Regards,

Noorjahan.


NoorjahanSk_Intel
Moderator

Hi,

We have not heard back from you, so we will close this inquiry now. If you need further assistance, please post a new question.

Thanks & Regards,

Noorjahan.

