tbb_thread

uj · ‎12-07-2009

A while ago I used TBB tasks to implement "long-running" tasks. Due to the way tasks work, such a task will block one core of the CPU as long as it's running. That's not very good, so I'm now going to use tbb_threads instead of tasks.

The tbb_threads will be waiting for information coming from the main program. Previously I used a concurrent_queue for this communication. I have a thin producer-consumer wrapper around the concurrent_queue which works in two modes: a quick-response mode for rapid bursts of information, and a slow-response mode for long waits while nothing happens. In quick mode the wrapper uses try_pop as long as the queue isn't empty, but it switches to waiting on a mutex when the queue is empty. This part of my solution seems to be working well.
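A minimal sketch of a two-mode wrapper of this kind, not the poster's actual code. To stay free of a TBB dependency it assumes a std::queue guarded by a std::mutex in place of concurrent_queue, and it uses a condition variable for the slow mode rather than waiting on a bare mutex. The names TwoModeQueue, wait_pop, consume_until_negative and two_mode_demo are hypothetical, as is the use of a negative value as a shutdown signal.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the wrapper described above. A std::queue guarded
// by a mutex replaces tbb::concurrent_queue so the sketch needs no TBB.
template <typename T>
class TwoModeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();  // wake a consumer sleeping in slow mode
    }

    // Quick mode: return immediately, reporting whether an item was taken.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    // Slow mode: sleep (no spinning) until an item arrives.
    void wait_pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        out = std::move(items_.front());
        items_.pop();
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

// Consumer loop: drain bursts with try_pop, fall back to wait_pop when idle.
// A negative value is an assumed shutdown signal.
inline std::vector<int> consume_until_negative(TwoModeQueue<int>& q) {
    std::vector<int> seen;
    int item = 0;
    for (;;) {
        if (!q.try_pop(item)) q.wait_pop(item);
        if (item < 0) return seen;
        seen.push_back(item);
    }
}

// Demo: one consumer thread, the calling thread acts as the producer.
inline std::vector<int> two_mode_demo() {
    TwoModeQueue<int> q;
    std::vector<int> seen;
    std::thread consumer([&] { seen = consume_until_negative(q); });
    for (int i = 1; i <= 3; ++i) q.push(i);
    q.push(-1);
    consumer.join();
    return seen;
}
```

Calling two_mode_demo() from main runs one burst through the queue. The consumer thread sleeps inside wait_pop whenever it finds the queue empty, so it does not occupy a core while idle.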

Now my question is, which is the best way to communicate between the main program and a tbb_thread? Is it a good idea to reuse my concurrent_queue wrapper described above or is there something wrong with this approach? Maybe concurrent_queue isn't a good choice? Will it work at all? Maybe there's a much simpler or smarter or more obvious way to do it?

I really need some good advice here. Thank you.

RafSchietekat · ‎12-07-2009

I think that the mutex is probably redundant at best (because concurrent_queue already has one), and you haven't shown any code to see whether you're using it correctly, because it can go wrong in at least two ways (sabotaging performance and/or correctness).

Also note that blocking on a concurrent_queue from within a task is still blocking, just like blocking on I/O. I would try to find a way to run such a task only on demand.

Dmitry_Vyukov · ‎12-07-2009

Quoting - uj

Now my question is, which is the best way to communicate between the main program and a tbb_thread? Is it a good idea to reuse my concurrent_queue wrapper described above or is there something wrong with this approach? Maybe concurrent_queue isn't a good choice? Will it work at all? Maybe there's a much simpler or smarter or more obvious way to do it?

In general, concurrent_queue should work. I do not see anything wrong with this.
And once again in general, asynchronous message passing is considered a good approach with regard to design and simplicity.

Regarding simpler/smarter, it depends on details and requirements. For example, in some situations a simple atomic variable may be used. The main thread waits for it to become zero, then puts a data object into it. The worker thread waits for it to become non-zero, takes the data, then sets it back to zero.
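The atomic-variable handoff just described could look roughly like this. It is a sketch under assumptions: std::atomic stands in for tbb::atomic, there is exactly one producer and one consumer, zero marks the empty slot, and -1 is an arbitrary stop value. The names post, take and mailbox_demo are made up for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Single-slot mailbox: 0 means "empty"; any non-zero value is a payload.
// Assumes exactly one producer and one consumer.

// Main thread: wait until the slot is empty, then publish a value.
inline void post(std::atomic<int>& slot, int value) {
    while (slot.load(std::memory_order_acquire) != 0)
        std::this_thread::yield();
    slot.store(value, std::memory_order_release);
}

// Worker thread: wait until the slot is filled, take the value, reset to 0.
inline int take(std::atomic<int>& slot) {
    int value;
    while ((value = slot.load(std::memory_order_acquire)) == 0)
        std::this_thread::yield();
    slot.store(0, std::memory_order_release);
    return value;
}

// Demo: the worker collects values until it receives the stop value -1.
inline std::vector<int> mailbox_demo() {
    std::atomic<int> slot(0);
    std::vector<int> received;
    std::thread worker([&] {
        for (int v = take(slot); v != -1; v = take(slot))
            received.push_back(v);
    });
    for (int i = 1; i <= 3; ++i) post(slot, i);
    post(slot, -1);
    worker.join();
    return received;
}
```

Both sides poll the slot, so this only suits short waits. During a long idle period the polling thread keeps a core busy, which is the very problem the original question is trying to avoid.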

Dmitry_Vyukov · ‎12-07-2009

Quoting - Raf Schietekat

Also note that blocking on a concurrent_queue from within a task is still blocking, just like blocking on I/O. I would try to find a way to run such a task only on demand.

Btw, one way to handle this is to use asynchronous continuations. I.e., while the queue is not empty, the consumer (the long-running operation) just dequeues elements as usual, and the producer (the main thread) just enqueues elements as usual.
However, if the consumer task finds the queue empty, it creates a continuation for the current operation and finishes (returns from task::execute()). Then, when the producer submits the next element to the queue, it also spawns a task to continue the preempted operation. This way you can get around blocking in a worker thread.
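A rough sketch of this drain-then-finish pattern, under stated assumptions: a std::thread started per drain run stands in for spawning a TBB task, there is a single producer, and the "continuation" is simply a fresh drain run rather than saved operation state. The names OnDemandConsumer, submit, drain, finish and on_demand_demo are hypothetical.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class OnDemandConsumer {
public:
    explicit OnDemandConsumer(std::function<void(int)> handler)
        : handler_(std::move(handler)) {}

    ~OnDemandConsumer() { finish(); }

    // Producer side (single producer assumed): enqueue the item and start
    // a drain run only if none is currently active.
    void submit(int item) {
        bool start = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(item);
            if (!draining_) {
                draining_ = true;
                start = true;
            }
        }
        // Stand-in for spawning a task that resumes the consumer.
        if (start) runs_.emplace_back([this] { drain(); });
    }

    // Join every drain run started so far.
    void finish() {
        for (std::thread& t : runs_) t.join();
        runs_.clear();
    }

private:
    // Consumer side: process items until the queue is empty, then return
    // instead of blocking.
    void drain() {
        for (;;) {
            int item;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (items_.empty()) {
                    draining_ = false;  // next submit() starts a new run
                    return;
                }
                item = items_.front();
                items_.pop();
            }
            handler_(item);
        }
    }

    std::function<void(int)> handler_;
    std::mutex mutex_;
    std::queue<int> items_;
    bool draining_ = false;
    std::vector<std::thread> runs_;
};

// Demo: sum the numbers 1..100 through the consumer.
inline int on_demand_demo() {
    int sum = 0;
    OnDemandConsumer consumer([&sum](int x) { sum += x; });
    for (int i = 1; i <= 100; ++i) consumer.submit(i);
    consumer.finish();
    return sum;
}
```

The draining_ flag is only read and written under the mutex, so at most one drain run is active at a time and an item can never be left behind without a run to pick it up.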

RafSchietekat · ‎12-07-2009

"Btw, one way to handle this is to use asynchronous continuations."
I've had itchy fingers about just such a proposal for a while, but I was waiting for something else to be integrated first... Do you mean anything specific with "continuations", or could it be just any old task?

Dmitry_Vyukov · ‎12-07-2009

Quoting - Raf Schietekat

"Btw, one way to handle this is to use asynchronous continuations."
I've had itchy fingers about just such a proposal for a while, but I was waiting for something else to be integrated first... Do you mean anything specific with "continuations", or could it be just any old task?

I think it may be any task (new or old).
You can see one example of continuations, and of reusing the current task as its own continuation, in examples/task/tree_sum/OptimizedParallelSumTree.cpp

RafSchietekat · ‎12-07-2009

"I think it may be any task (new or old)."
Sorry, "any old task" was just an expression, meaning "without any particular requirement".

I myself am still confused by your use of the word "continuation", though. For me this means the technique where a parent task terminates after setting up a new task to wait for the child tasks to terminate. But here the task would be spawned by an unknown task.

uj · ‎12-07-2009

Quoting - Raf Schietekat

I think that the mutex is probably redundant at best (because concurrent_queue already has one), and you haven't shown any code to see whether you're using it correctly, because it can go wrong in at least two ways (sabotaging performance and/or correctness).

Also note that blocking on a concurrent_queue from within a task is still blocking, just like blocking on I/O. I would try to find a way to run such a task only on demand.

I already have a version of my producer-consumer queue that doesn't do any long sleeping. Instead it quits when the queue is emptied. I'm using TBB tasks now but it should be easy to use tbb_thread for the same purpose. This will increase overhead so probably a thread-pool would be good but I'll wait with that until the next C++ version is here. It's included there right? Still, I would rather use a TBB-only solution for all concurrency. I may be mistaken, but I have the impression the overall performance would be better because of better integration.

Regarding your comment about concurrent_thread. Are you indicating that it already works the way my producer-consumer queue works? That it doesn't spin but truly sleeps during periods of inactivity? I'm fairly certain I must have investigated this before making my own consumer-producer queue. Otherwise I wouldn't have bothered with that in the first place. Has concurrent_thread changed recently in this respect?

Dmitry_Vyukov · ‎12-07-2009

Quoting - Raf Schietekat

"I think it may be any task (new or old)."
Sorry, "any old task" was just an expression, meaning "without any particular requirement".

I myself am still confused by your use of the word "continuation", though. For me this means the technique where a parent task terminates after setting up a new task to wait for the child tasks to terminate. But here the task would be spawned by an unknown task.

I use the term continuation in a more general sense. The case where the last child task reschedules the parent task is just a special case of continuation. For example, if you use asynchronous IO, you save the operation context in a continuation object and issue the IO request; the OS's IO completion mechanism then somehow allows you to resume the operation via the continuation.

uj · ‎12-07-2009

Quoting - Dmitriy Vyukov

Btw, one way to handle this is to use asynchronous continuations. I.e., while the queue is not empty, the consumer (the long-running operation) just dequeues elements as usual, and the producer (the main thread) just enqueues elements as usual.
However, if the consumer task finds the queue empty, it creates a continuation for the current operation and finishes (returns from task::execute()). Then, when the producer submits the next element to the queue, it also spawns a task to continue the preempted operation. This way you can get around blocking in a worker thread.

When I discovered that a TBB task running forever "uses up" one core all the time, I changed my consumer-producer thread to exactly what you suggest. In principle one task handles all requests in the queue as long as new ones keep coming in steadily, but when the queue becomes empty it doesn't wait but quits.

The above works fine for my 3D window. Each task handles a few display updates when the user does something like moving the mouse to rotate the window content. Each update can take anywhere from almost nothing up to maybe a second or so, depending on how much there is to display. Then, when the user just watches the window, no tasks are active.

But I also have a number-crunching simulation which is active during long periods of time. Here each task will live much longer. I want this to be concurrent in order not to interfere with the GUI.

So even though no tasks are blocking anymore (they quit when there's nothing to do), I feel maybe tbb_thread is a better choice than TBB task for the situations I've described? Or maybe I should use TBB task for the GUI window and tbb_thread for the simulation.

ARCH_R_Intel · ‎12-07-2009

Quoting - uj

When I discovered that a TBB task running forever "uses up" one core all the time, I changed my consumer-producer thread to exactly what you suggest. In principle one task handles all requests in the queue as long as new ones keep coming in steadily, but when the queue becomes empty it doesn't wait but quits.

The above works fine for my 3D window. Each task handles a few display updates when the user does something like moving the mouse to rotate the window content. Each update can take anywhere from almost nothing up to maybe a second or so, depending on how much there is to display. Then, when the user just watches the window, no tasks are active.

But I also have a number-crunching simulation which is active during long periods of time. Here each task will live much longer. I want this to be concurrent in order not to interfere with the GUI.

So even though no tasks are blocking anymore (they quit when there's nothing to do), I feel maybe tbb_thread is a better choice than TBB task for the situations I've described? Or maybe I should use TBB task for the GUI window and tbb_thread for the simulation.

With the current TBB, I'd be inclined to have the GUI thread be a tbb_thread (or the main thread) and have it never spawn TBB tasks directly, but instead put its requests for heavy lifting onto a concurrent_bounded_queue. Set up another tbb_thread to service the concurrent_bounded_queue. The reason for using concurrent_bounded_queue here is that it supports blocking while the queue is empty. The service thread can then call TBB parallel algorithms or use TBB tasks without causing the GUI thread to stall. Notifications from the service thread back to the GUI thread could be done by injecting messages into the GUI thread's event loop. (The latter idea comes from http://www.ece.auckland.ac.nz/~sinnen/articles/Giacaman2009ptp.pdf )
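The arrangement described above might be sketched as follows. This is an illustration under assumptions, not TBB code: a small BlockingQueue built on std::condition_variable stands in for concurrent_bounded_queue, std::thread stands in for tbb_thread, and a second queue plays the role of the GUI event loop. The names BlockingQueue, Job, service_loop and gui_demo are hypothetical, and an empty Job is an assumed shutdown marker.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Stand-in for tbb::concurrent_bounded_queue: pop() blocks while empty.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

using Job = std::function<int()>;  // heavy work requested by the GUI thread

// Service thread body: sleeps in pop() while idle, runs each job, and posts
// the result back. An empty Job is an assumed shutdown marker.
inline void service_loop(BlockingQueue<Job>& requests,
                         BlockingQueue<int>& gui_events) {
    for (;;) {
        Job job = requests.pop();
        if (!job) return;
        gui_events.push(job());  // "inject" a message for the GUI thread
    }
}

// Demo: the calling thread plays the GUI thread.
inline int gui_demo() {
    BlockingQueue<Job> requests;
    BlockingQueue<int> gui_events;
    std::thread service(service_loop, std::ref(requests), std::ref(gui_events));

    requests.push([] { return 6 * 7; });
    requests.push([] { return 10; });

    int first = gui_events.pop();
    int second = gui_events.pop();

    requests.push(Job());  // ask the service thread to stop
    service.join();
    return first + second;
}
```

A real GUI thread would not block in pop() as the demo does. It would pick up results as ordinary events in its own loop, which is the point of routing notifications through the event queue.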

We are investigating extending TBB in a way that will let the GUI thread spawn a task (and not wait for it) and guarantee eventual execution of the task. That will simplify programming by eliminating the need for the explicit extra thread and concurrent_bounded_queue in the scheme above.

RafSchietekat · ‎12-07-2009

#4 "I've had itchy fingers about just such a proposal for a while, but I was waiting for something else to be integrated first..."
On second thought, the idea seems only deceptively simple, like task-based futures, so don't hold your breath just yet.

#7 "This will increase overhead so probably a thread-pool would be good but I'll wait with that until the next C++ version is here. It's included there right?"
There is not a single occurrence of "pool" anywhere in N3000 (dated 2009-11-09).

#7 "Regarding your comment about concurrent_thread. Are you indicating that it already works the way my producer-consumer queue works? That it doesn't spin but truly sleeps during periods of inactivity? I'm fairly certain I must have investigated this before making my own consumer-producer queue. Otherwise I wouldn't have bothered with that in the first place. Has concurrent_thread changed recently in this respect?"
Look for __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE, which appeared well over a year and a half ago.

#8 "I use the term continuation in a more general sense."
OK.

Dmitry_Vyukov · ‎12-08-2009

Quoting - Raf Schietekat

#7 "This will increase overhead so probably a thread-pool would be good but I'll wait with that until the next C++ version is here. It's included there right?"
There is not a single occurrence of "pool" anywhere in N3000 (dated 2009-11-09).

Search for "std::async". There is support for lightweight tasking in C++0x, which evidently will be implemented by means of a thread pool.
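For reference, a minimal std::async sketch as the facility was eventually standardized in C++11. The standard does not require a thread pool: with std::launch::async the call behaves as if it ran on a new thread, and any pooling is left to the implementation. The names sum_range and async_demo are illustrative only.

```cpp
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Work to run asynchronously: sum all elements of a vector.
inline long long sum_range(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0LL);
}

// Launch the work, then collect the result through the returned future.
inline long long async_demo() {
    std::vector<int> data(1000);
    std::iota(data.begin(), data.end(), 1);  // fill with 1, 2, ..., 1000

    std::future<long long> result =
        std::async(std::launch::async, sum_range, std::cref(data));

    // The caller is free to do other work here; get() blocks until done.
    return result.get();
}
```

The future returned by std::async is the only handle to the result, and get() both waits for completion and returns the value.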

RafSchietekat · ‎12-08-2009

Quoting - Dmitriy Vyukov

Search for "std::async". There is support for lightweight tasking in C++0x, which evidently will be implemented by means of a thread pool.

Oh my, are they standardising or innovating? If this is going to be C++0x instead of C++1x (hexadecimal of course), how confident should we be that the result won't be disappointment?

uj · ‎12-08-2009

>> @ Raf Schietekat #11
>> Look for __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE, which appeared well over a year and a half ago.

I've looked at it but I don't quite know what to do with it. It seems to be an undocumented build directive that's not even recommended for use:

http://software.intel.com/en-us/forums/showthread.php?t=63522

RafSchietekat · ‎12-08-2009

"I've looked at it but I don't quite know what to do with it."
Well, if it's #define'd as 1, I would presume that at some point a change has been made to avoid busy-waiting in concurrent_queue, which seems to be what you want. The implementation seems to have been changing over time, but from what I remember from glancing at the code earlier it spins a number of times and then blocks.

"It seems to be an undocumented build directive that's not even recommended for use"
That may mean that the previous version is not maintained. At some point the dead code will probably be eliminated.

ARCH_R_Intel · ‎12-08-2009

Yes, __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE is an artifact of our development process, not a build option. We'll remove the code for !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE -- it's unmaintained clutter now.

uj · ‎12-08-2009

Quoting - Raf Schietekat

"I've looked at it but I don't quite know what to do with it."
Well, if it's #define'd as 1, I would presume that at some point a change has been made to avoid busy-waiting in concurrent_queue, which seems to be what you want. The implementation seems to have been changing over time, but from what I remember from glancing at the code earlier it spins a number of times and then blocks.

"It seems to be an undocumented build directive that's not even recommended for use"
That may mean that the previous version is not maintained. At some point the dead code will probably be eliminated.

Well, you know I'm not a TBB expert. I came here for honest answers.

RafSchietekat · ‎12-09-2009

Quoting - uj

Well, you know I'm not a TBB expert. I came here for honest answers.

And how am I to interpret this?