Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

TBB on linux segfaulting

Carey
Beginner
2,828 Views
I have a cross platform program I've written that uses either TBB or OpenMP (not at the same time!). The code is in every respect cross platform, but I initially developed it in Visual Studio.

The program is a framework for developing n-body models.

On Windows it works perfectly: it compiles with no errors or warnings and runs as expected. On Linux (Ubuntu 64-bit) it also compiles cleanly, with just one warning that wouldn't affect the TBB code, but when I run it, it segfaults as soon as it reaches the first parallel_for body class.
I tried using OpenMP instead, and the code ran perfectly. Note that the OpenMP and TBB code do not interfere with each other; using them at the same time isn't possible.

Since the parallel code is in my fourth-order Runge-Kutta integrator, I commented out the first parallel_for block (replacing it with equivalent serial code) to see whether that block had a bug in it, but it just segfaults on the next parallel_for it encounters instead. From this, and the fact that the same code runs perfectly on Windows, I deduce that it is the act of using TBB that causes the segfault, not the code itself.

On Linux I've compiled it using GCC, with the project managed by Code::Blocks. I've linked against tbb and initialised it with a call to
[cpp]tbb::task_scheduler_init init;[/cpp]

Have I missed something about using TBB on Linux?

I haven't posted lots of code because I rather think the issue is one of setting up TBB rather than a problem in my code. I can do so, though; the project is rather large, so it would be a fair bit to go through.
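
For context, the setup is essentially this (a stripped-down sketch with placeholder names, not my actual project code), compiled with g++ and linked with -ltbb:

[cpp]#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

// Placeholder body class standing in for one of my integrator's parallel_for bodies.
struct ScaleBody {
    double* data;
    void operator()( const tbb::blocked_range<int>& r ) const {
        for( int i = r.begin(); i != r.end(); ++i )
            data[i] *= 2.0;               // each index is written by exactly one task
    }
};

int main() {
    tbb::task_scheduler_init init;        // kept alive for the whole program
    double values[1024] = { 0.0 };
    ScaleBody body;
    body.data = values;
    tbb::parallel_for( tbb::blocked_range<int>( 0, 1024 ), body );
    return 0;
}[/cpp]
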
22 Replies
Carey
Beginner
424 Views

Certainly TBB is designed to let you write:

[cpp]int main( int argc, char* argv[] ) {
    tbb::task_scheduler_init init;
    ...do rest of program logic...
}[/cpp]

Indeed, that is normally the way I use it, and we have many unit tests that use it in a similar way. If it segfaulted in this usage model on any of our many test platforms, we would have noticed.

I can conjecture a way in which broken code with races might appear to have less chance of segfaulting if the task_scheduler_init is constructed at a more inner scope. Suppose the broken code has a latent race that, if exposed, causes a segfault. When TBB constructs a task_scheduler_init object, it does not wait for the worker threads to get started. If a task_scheduler_init object exists for only a very short time, the parallel code may finish before the worker threads get a chance to help, so the race (and the consequent segfault) never happens.
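
As a toy illustration of that scoping effect (not code from this thread), compare a scheduler that lives only around one short parallel region with one that lives for the whole program:

[cpp]#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

struct StepBody {
    double* data;
    explicit StepBody( double* d ) : data(d) {}
    void operator()( const tbb::blocked_range<int>& r ) const {
        for( int i = r.begin(); i != r.end(); ++i )
            data[i] += 1.0;
    }
};

// Narrow scope: the scheduler exists only around one short region, so the
// calling thread may finish the work before any worker joins in, and a
// latent race can stay hidden.
void step_once( double* data, int n ) {
    tbb::task_scheduler_init init;
    tbb::parallel_for( tbb::blocked_range<int>( 0, n ), StepBody( data ) );
}

// Wide scope: the workers are up for the whole run and actually share the
// later iterations, so the same race is much more likely to surface.
int main() {
    tbb::task_scheduler_init init;
    double data[1024] = { 0.0 };
    for( int step = 0; step < 1000; ++step )
        tbb::parallel_for( tbb::blocked_range<int>( 0, 1024 ), StepBody( data ) );
    return 0;
}[/cpp]
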
While I'm not 100% sure there is no problem with the code (who ever can be?), I've tested it with OpenMP and had no problems. The problem also doesn't appear on Windows with TBB 2.2; it only happens when I use an earlier version of TBB on Linux and therefore have to create the task_scheduler_init object myself. Of course, that might just mean that not having to create the task_scheduler_init object is hiding the error.

I can feel a headache coming on..sigh..

I don't *think* it should generate a race condition: each particle class in my n-body model only handles its own movement, and there are no writes to other particles, so each particle can run in its own thread without the need for critical sections.

I also made sure that the members of a particle class that are accessed by other particles during integration (to find its position for the gravitational force calculation) are copies, not the members the particle class itself is using, so there is no reading of class members that are also being written to.

For example, a particle class holds its own x-axis position in a member particle->x and modifies it during integration, but before parallel integration happens it copies this to particle->x_ext, and that is what other particles read.

There are simultaneous reads, but I'm not aware of that being an issue.

I've removed any possibility of two particles trying to read and write the same class members at the same time; I just can't see how a race condition could emerge.
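
Roughly, the pattern is this (a simplified sketch; only the x/x_ext member names are from my real code, the rest is illustrative):

[cpp]#include "tbb/blocked_range.h"

struct Particle {
    double x;       // working position, written only by the owning particle
    double x_ext;   // published copy, read by the other particles
};

// Serial phase: every particle publishes its current position before the
// parallel integration step starts.
void publish_positions( Particle** particles, int n ) {
    for( int i = 0; i < n; ++i )
        particles[i]->x_ext = particles[i]->x;
}

// Parallel phase: particle i writes only its own x and reads only the other
// particles' x_ext, so there are concurrent reads but no read/write overlap.
struct IntegrateBody {
    Particle** particles;
    int count;
    void operator()( const tbb::blocked_range<int>& r ) const {
        for( int i = r.begin(); i != r.end(); ++i ) {
            double accel = 0.0;
            for( int j = 0; j < count; ++j )
                if( j != i )
                    accel += particles[j]->x_ext - particles[i]->x_ext;  // toy force term
            particles[i]->x += accel * 1.0e-6;                           // toy update
        }
    }
};[/cpp]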

There is one thing, though: to get my array of particles into the TBB parallel_for I pass a pointer to the array. This is an array of pointers to objects, not an STL vector or anything nice like that, because those are too slow.

Might it be that using the array this way is causing the issue?

I can probably rewrite it so I don't have to do it this way. I'm not quite sure how just yet; I still can't use STL.
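
In case it helps, the hand-off looks roughly like this (placeholder names, not my real classes):

[cpp]#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

struct Item { double value; };                // stand-in for my particle class

// Body functor that works directly on a raw array of pointers, no STL involved.
struct ItemBody {
    Item** items;                             // pointer to the array of Item*
    void operator()( const tbb::blocked_range<int>& r ) const {
        for( int i = r.begin(); i != r.end(); ++i )
            items[i]->value += 1.0;           // toy work on the i-th object
    }
};

void process_items( Item** items, int n ) {
    ItemBody body;
    body.items = items;                       // the body copies just this pointer
    // The array and the objects it points to must stay alive and unmoved
    // until parallel_for returns.
    tbb::parallel_for( tbb::blocked_range<int>( 0, n ), body );
}[/cpp]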


edit: I had my TBB code in a separate header that I linked into the class that called it, with wrapper class members that called the TBB code. Very convoluted and probably silly. I've just moved the TBB code into the class that calls it, rather than keeping it in that header, and done away with the wrappers.

As a result I'm no longer getting the thread_monitor::launch: _beginthreadex failed issue, at least not in ten runs of the program, whereas before it happened every time, about three-quarters of the way through the run.
That's promising. I'll boot into Linux and see if the segfault problem is also gone.

Carey
Beginner
424 Views

I do believe I now know what the problem was.

I had my TBB code badly organised (see the edit on the previous post). There weren't any race conditions, but it seems my attempt to keep all my TBB code in a separate header was a bad plan. I can't see why, since it ran on the latest TBB, so obviously that can cope, but the older version in Ubuntu couldn't, so there must be something about that approach which causes issues.

A little reorganisation of the code sorted it out: I got rid of that header, put the TBB code inside the class that was calling it, and used the correct approach for tbb::task_scheduler_init:
[cpp]int main( int argc, char* argv[] ) {
    tbb::task_scheduler_init init;
    ...do rest of program logic...
}[/cpp]

Now it works as it should. I've run the program 20+ times, and the immediate segfault isn't happening; it just runs as expected.

Time for a cup of tea and a moment of relief.

