Solved: Re: Trying to run a pipeline in its own background thread

turks · ‎01-22-2009

I'm trying to have a separate thread run the pipeline which I've created in the current thread.
The reason is that after I start it, control returns immediately and the pipeline runs concurrently with my current thread.

If one doesn't do this, the code has to wait for the pipeline to run to completion.
The current thread wants to be still creating input for the pipeline.
The pipeline's first filter actually does a pop from a concurrent_queue of data to process.
That data was and continues to be pushed into that concurrent_queue by the current process.
That queue part works well and even makes use of Intel TBB's scalable malloc/free.
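
The producer/consumer arrangement described above can be sketched without TBB. The following is only an illustration under stated assumptions: `WorkQueue` and `produce_while_consuming` are hypothetical names, a mutex-guarded `std::queue` stands in for tbb::concurrent_queue, and a plain `std::thread` loop stands in for the pipeline's first filter.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for tbb::concurrent_queue: a mutex-guarded queue
// with a "closed" flag so the consumer can tell when input has ended.
class WorkQueue {
public:
    void push(int item) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push(item);
        }
        m_ready.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_closed = true;
        }
        m_ready.notify_all();
    }
    // Blocks until an item arrives; returns false once closed and drained.
    bool pop(int& item) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ready.wait(lock, [this] { return !m_items.empty() || m_closed; });
        if (m_items.empty()) return false;
        item = m_items.front();
        m_items.pop();
        return true;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<int> m_items;
    bool m_closed = false;
};

// Starts a background consumer, then keeps producing from the calling
// thread. Returns the sum of everything the consumer processed.
long produce_while_consuming(int item_count) {
    assert(item_count >= 0);
    WorkQueue queue;
    long sum = 0;
    std::thread consumer([&queue, &sum] {
        int item = 0;
        while (queue.pop(item)) sum += item;  // stands in for the pipeline
    });
    for (int i = 1; i <= item_count; ++i) queue.push(i);  // still producing
    queue.close();
    consumer.join();  // after join, the consumer's writes to sum are visible
    return sum;
}
```

Calling `produce_while_consuming(100)` pushes the values 1 through 100 while the consumer thread is already running and returns their sum. The explicit `close()` is a design choice of this sketch; the original post does not say how the real pipeline learns that input has ended.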

Is this a reasonable thing to try to do or was tbb::pipeline perhaps not designed to work in a background thread as it creates its own subthreads?

What happens is that the pipeline run() returns immediately without any of its filter code getting executed.
When the run is done in the same thread that built the pipeline, breakpoints in the overloaded operator() parts of each filter show that the code then does get executed.

Alexey-Kukanov · ‎01-22-2009

I think the setup you described should work.
Can you possibly share some code that exhibits the issue?

turks · ‎01-22-2009

Gladly.
The files are attached (3 small ones). The position of the pipeline run call that works is commented out.
I hope you're right.

I included the header of the Merger class only as an FYI extra. This is part of a large project that includes much legacy, third party, and our own C code with latest sections in C++. I've identified the main area to benefit from multi-threading and have "lifted out the C code" and made the part I want parallel into the Merger class. Thus there are many externals; they are effectively not changing at this point so I am thinking it is ok for them not to be thread safe data.
The many member variables of the Merger class ARE thread-safe with multiple instances of the Merger class.
The execution before and after Merging is all serial, but takes less time anyway.
The tbb pipeline concept is thus perfect.
I'm aiming to get the best of both worlds - have the huge bulk of the existing code remain unchanged yet have the performance scale with the number of CPUs.
This is my second brick wall.
Your help is appreciated.
Thanks.

Alexey-Kukanov · ‎01-23-2009

I see no attachment unfortunately.

RafSchietekat · ‎01-23-2009

Does the new thread have its own task_scheduler_init?

turks · ‎01-23-2009

Oops. They appear to have been uploaded but not attached.
Here they are again, attached.

Raf, yes, there is a scheduler init in the thread function.
Tnx

Mitch

RafSchietekat · ‎01-23-2009

You don't make clear exactly how you switch between the two versions, but it seems potentially problematic that the filters are automatic variables in BuildAndStartRunningPipeline(), because the pipeline does not copy them or anything. What happens if you make them member variables instead, or keep pointers to filters allocated on the heap?

Alexey-Kukanov · ‎01-23-2009

Raf is right; since filters are automatic variables in the function, they are destroyed when the function returns, and their destructors remove each from the pipeline. So when the latter runs, it is basically empty and thus exits immediately.
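
The effect described above can be reproduced in miniature. This is a sketch only: `MiniPipeline`, `MiniFilter`, and `CountingFilter` are hypothetical stand-ins, not the TBB classes. They copy just the two behaviours mentioned in this thread: the pipeline stores references to its filters rather than copies, and a filter's destructor removes it from the pipeline.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class MiniPipeline;

// Hypothetical filter base: its destructor unregisters the filter from
// the pipeline it was added to, as described in the post above.
class MiniFilter {
public:
    virtual ~MiniFilter();
    virtual void process() = 0;
private:
    friend class MiniPipeline;
    MiniPipeline* m_owner = nullptr;
};

// Hypothetical pipeline: keeps pointers only, never copies of the filters.
class MiniPipeline {
public:
    void add_filter(MiniFilter& f) {
        f.m_owner = this;
        m_filters.push_back(&f);
    }
    void remove_filter(MiniFilter* f) {
        m_filters.erase(std::remove(m_filters.begin(), m_filters.end(), f),
                        m_filters.end());
    }
    // Runs each registered filter once and returns how many ran.
    int run() {
        for (MiniFilter* f : m_filters) f->process();
        return static_cast<int>(m_filters.size());
    }
private:
    std::vector<MiniFilter*> m_filters;
};

MiniFilter::~MiniFilter() {
    if (m_owner) m_owner->remove_filter(this);
}

struct CountingFilter : MiniFilter {
    void process() override { ++calls; }
    int calls = 0;
};

// Mirrors the problematic build function: the filters are automatic
// variables, so they are destroyed (and unregistered) on return.
void build_with_automatic_filters(MiniPipeline& pipeline) {
    CountingFilter input, transform, output;
    pipeline.add_filter(input);
    pipeline.add_filter(transform);
    pipeline.add_filter(output);
}

// Builds the pipeline in one function and runs it afterwards.
int filters_run_after_build() {
    MiniPipeline pipeline;
    build_with_automatic_filters(pipeline);
    return pipeline.run();  // the pipeline is empty by now
}
```

In this single-threaded sketch `filters_run_after_build()` returns 0 every time, because all three destructors have finished before `run()` is called. With a real background thread the ordering is not guaranteed.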

RafSchietekat · ‎01-23-2009

"So when the latter runs, it is basically empty and thus exits immediately." If you're lucky: race issues (no, not those...) might produce worse outcomes than that. Pipeline administration has to be separate from running it.

Alexey-Kukanov · ‎01-24-2009

/*Raf's last post #8 is meant; can't reply directly to it*/
Of course you are right, Raf. I think that in most cases the execution will luckily complete, because starting a thread costs more than executing three virtual destructors, but there definitely is a race and the consequences could be more severe. And in a real application the relative costs might change.

turks · ‎01-26-2009

Yes! That was it, Raf.

I had to first declare pointers to the filters as member variables, with
MyInputFilter *m_input_filter;
MyTransformFilter *m_transform_filter;
MyOutputFilter *m_output_filter;

and in order to get that to compile I had to first declare,
class MyInputFilter;
class MyTransformFilter;
class MyOutputFilter;

Now the pipeline must get built with:
m_pPipeline = new tbb::pipeline;
m_input_filter = new MyInputFilter(this);
m_pPipeline->add_filter(*m_input_filter);
...

And now when the subthread executes the "pPipeline->run()",
the code in the filters DOES get called.
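
The ownership pattern behind this fix can be sketched as follows. This is a minimal illustration, not the attached Pipeliner class: `MiniPipeline`, `MiniFilter`, `AddOne`, `Double`, and this `Pipeliner` are hypothetical stand-ins, `std::thread` replaces whatever thread API the real code uses, and `std::unique_ptr` members replace the raw `new` calls shown above. The point is only that the filters live as long as the object that owns the pipeline, so they still exist when the background thread calls run.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical filter interface, not tbb::filter.
struct MiniFilter {
    virtual ~MiniFilter() = default;
    virtual int apply(int value) = 0;
};

// Hypothetical pipeline: stores pointers to filters it does not own.
class MiniPipeline {
public:
    void add_filter(MiniFilter& f) { m_filters.push_back(&f); }
    int run(int input) {
        int value = input;
        for (MiniFilter* f : m_filters) value = f->apply(value);
        return value;
    }
private:
    std::vector<MiniFilter*> m_filters;
};

struct AddOne : MiniFilter { int apply(int v) override { return v + 1; } };
struct Double : MiniFilter { int apply(int v) override { return v * 2; } };

// Owns the pipeline AND its filters as members, so the filters outlive
// every call to run(), including one made from a background thread.
class Pipeliner {
public:
    Pipeliner() : m_add(new AddOne), m_double(new Double) {
        m_pipeline.add_filter(*m_add);
        m_pipeline.add_filter(*m_double);
    }
    ~Pipeliner() { wait(); }  // never destroy filters while the thread runs

    // Returns immediately; the pipeline runs on the worker thread.
    void start(int input) {
        assert(!m_worker.joinable());  // one run at a time in this sketch
        m_worker = std::thread([this, input] {
            m_result = m_pipeline.run(input);
        });
    }
    // Blocks until the background run has finished, then returns its result.
    int wait() {
        if (m_worker.joinable()) m_worker.join();
        return m_result;
    }
private:
    MiniPipeline m_pipeline;
    std::unique_ptr<AddOne> m_add;
    std::unique_ptr<Double> m_double;
    std::thread m_worker;
    int m_result = 0;
};

// Builds the pipeline, starts it in the background, and collects the result.
int run_in_background(int input) {
    Pipeliner pipeliner;
    pipeliner.start(input);
    // The calling thread is free to do other work here.
    return pipeliner.wait();
}
```

For example, `run_in_background(3)` applies both filters on the worker thread and returns 8. Joining in the destructor is one way to keep "pipeline administration" separate from running it: the filters cannot be torn down while a run is still in flight.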

Thank you, Raf (and Alexey for corroborating).

In case anyone else is interested, my now-working version of the Pipeliner class, which can run the pipeline in a background thread, is attached.

Life is good.