Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Exceptions inside enumerable_thread_specific::local()

John_F_1

enumerable_thread_specific::local() lazily copy-constructs the thread-local storage. What should happen if the copy constructor of the thread-local object tries to allocate too much memory? In that case a std::bad_alloc is thrown, and the members that were already constructed during member initialization are destructed as the stack unwinds. When the enumerable_thread_specific is later destructed, it manually calls the destructor on the object whose constructor threw, which can result in double frees. For example, suppose we modify the example given in the documentation as follows (heavily contrived, just to demonstrate what can happen):

#include <cstdio>
#include <utility>
#include <atomic>
#include <iostream>
#include <new>        // std::bad_alloc
#include "tbb/enumerable_thread_specific.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

struct Bar
{
    ~Bar()
    {
        std::cout << "Bar::~Bar() called on " << this << std::endl;
    }
    int bar1;
};

struct ExceptionThrower
{
    ExceptionThrower() : m_pair(std::make_pair(0,0)) 
    {
        MaybeThrowException();
    }
    ExceptionThrower(const ExceptionThrower& other) : m_pair(other.m_pair)
    {
        MaybeThrowException();
    }
    ~ExceptionThrower()
    {
    }

    void MaybeThrowException()
    {
        int currCount = ++s_count;
        if (currCount == 8)
        {
            std::cout << "Throwing bad_alloc" << std::endl;
            throw std::bad_alloc();
        }
    }
    static std::atomic<int> s_count;
    std::pair<int,int> m_pair;
};

struct Foo
{
    Foo() : m_bar(), m_exceptionThrower() { }
    Foo(const Foo& other) : m_exceptionThrower(other.m_exceptionThrower)
    {
    }
    ~Foo()
    {
    }
    Bar m_bar;
    ExceptionThrower m_exceptionThrower;
};


std::atomic<int> ExceptionThrower::s_count(0);  // direct-initialization; std::atomic is not copyable
typedef tbb::enumerable_thread_specific< Foo > CounterType;
Foo exemplar;
CounterType MyCounters(exemplar);

struct Body {
    void operator()(const tbb::blocked_range<int> &r) const {
        CounterType::reference my_counter = MyCounters.local();
        ++my_counter.m_exceptionThrower.m_pair.first;
        for (int i = r.begin(); i != r.end(); ++i)
            ++my_counter.m_exceptionThrower.m_pair.second;
    }
};

int main() {
    tbb::parallel_for( tbb::blocked_range<int>(0, 100000000), Body());
}

The output is:

Throwing bad_alloc
Bar::~Bar() called on 000000000082BC80
Bar::~Bar() called on 0000000000883FF0
Bar::~Bar() called on 0000000000887000
Bar::~Bar() called on 0000000000887080
Bar::~Bar() called on 0000000000857800
Bar::~Bar() called on 0000000000857880
Bar::~Bar() called on 000000000082BC00
Bar::~Bar() called on 000000000082BC80

As you can see, the Bar at 000000000082BC80 was destructed twice. This appears to be a consequence of enumerable_thread_specific using placement new. What is the right way to handle this? It looks like a flaw in the design of enumerable_thread_specific. I've temporarily worked around it with a wrapper class that holds a unique_ptr and calls unique_ptr::reset(nullptr) in its destructor, which ensures that the destructor of my object runs only once.
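A minimal sketch of the wrapper workaround described above. The name OnceDestroyed and its interface are hypothetical, not taken from the original post or from TBB. It assumes T is default- and copy-constructible, and it relies on a destroyed unique_ptr leaving a null pointer behind in its storage. That holds on common implementations, but running a destructor a second time is still undefined behavior as far as the language is concerned, so this only limits the damage in practice.

```cpp
#include <memory>

// Hypothetical wrapper: owns a T through a unique_ptr so that a repeated
// destructor call on the same storage finds a null pointer and does nothing.
template <typename T>
class OnceDestroyed {
public:
    OnceDestroyed() : m_ptr(new T()) {}

    // m_ptr is first initialized to null. If T's copy constructor throws,
    // only that null unique_ptr is unwound, so no T is destroyed here.
    OnceDestroyed(const OnceDestroyed& other) : m_ptr() {
        m_ptr.reset(new T(*other.m_ptr));
    }

    OnceDestroyed& operator=(const OnceDestroyed&) = delete;

    // Explicit reset, as in the post: the owned T is deleted at most once
    // and the stored pointer is null afterwards.
    ~OnceDestroyed() { m_ptr.reset(nullptr); }

    T& get() { return *m_ptr; }
    const T& get() const { return *m_ptr; }

private:
    std::unique_ptr<T> m_ptr;
};

// Small payload type used only for this illustration.
struct Counter {
    int value = 0;
};

// Copies an exemplar, modifies the copy, and returns the difference,
// showing that each wrapper owns its own T.
int run_wrapper_demo() {
    OnceDestroyed<Counter> exemplar;
    exemplar.get().value = 41;
    OnceDestroyed<Counter> copy(exemplar);
    ++copy.get().value;
    return copy.get().value - exemplar.get().value;
}
```

With the example above, this would presumably be used as tbb::enumerable_thread_specific< OnceDestroyed<Foo> >, with accesses going through get().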

2 Replies
John_F_1

Of course, to see the above output, you will probably have to handle the exception:

    try {
        tbb::parallel_for( tbb::blocked_range<int>(0, 100000000), Body());
    } catch (const std::exception& e) {
        std::cout << e.what() << std::endl;
    }

 

Christophe_H_Intel

Hello, John,

I believe the problem is this: when the constructor invoked through placement new throws, any already-constructed subobjects are destroyed, but the storage has already been allocated and the concurrent_vector backing the ETS has already had its size increased, so the slot exists without a live object in it. The destructor for the ETS then calls the destructor for every slot in the concurrent_vector, including the one that was never successfully constructed.

Because the slots are allocated asynchronously, we cannot simply decrement the size of the concurrent_vector; instead, we must record whether each slot was successfully constructed.
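The idea of remembering whether a slot was constructed can be sketched as follows. This is an illustrative, single-threaded sketch and not the actual TBB implementation: TrackedSlot, Probe, and run_slot_demo are hypothetical names, and a real fix inside the ETS would also have to account for concurrent access to the slots.

```cpp
#include <new>
#include <utility>

// Hypothetical slot: raw storage plus a pointer that stays null until the
// placement-new construction has completed without throwing.
template <typename T>
class TrackedSlot {
public:
    TrackedSlot() : m_object(nullptr) {}
    TrackedSlot(const TrackedSlot&) = delete;
    TrackedSlot& operator=(const TrackedSlot&) = delete;

    // Destroy the element only if its constructor actually finished.
    ~TrackedSlot() {
        if (m_object) m_object->~T();
    }

    template <typename... Args>
    T& construct(Args&&... args) {
        T* p = ::new (static_cast<void*>(m_storage)) T(std::forward<Args>(args)...);
        m_object = p;  // reached only if T's constructor did not throw
        return *p;
    }

    bool constructed() const { return m_object != nullptr; }

private:
    alignas(T) unsigned char m_storage[sizeof(T)];
    T* m_object;
};

// Test type: counts live objects and can be told to throw from its constructor.
struct Probe {
    int& live;
    Probe(int& counter, bool fail) : live(counter) {
        if (fail) throw std::bad_alloc();
        ++live;
    }
    ~Probe() { --live; }
};

// Constructs one slot successfully and lets a second construction fail,
// then returns the live-object count after both slots are destroyed.
int run_slot_demo() {
    int live = 0;
    {
        TrackedSlot<Probe> ok, failed;
        ok.construct(live, false);
        try {
            failed.construct(live, true);
        } catch (const std::bad_alloc&) {
            // failed.constructed() is still false at this point
        }
    }
    return live;  // 0: exactly one destructor ran for the one live object
}
```

The design choice here is to keep the slot itself in place, matching the point that the container size cannot simply be decremented, and to make the destructor conditional on the recorded state.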

Thank you very much for reporting the problem.

Regards,
Chris
