As this is my first message here, let me start by praising TBB and saying how happy I am to have found this package! It has made life a lot easier for me already, and I've only just scratched the surface.
I am working on an application that is a bit peculiar in that it both wants a fluid user interface and does a lot of background processing. It is a photo viewer/organiser, which you can take a look at at http://www.gpuviewer.com, should you wish to know more.
Here is my problem:
- in the foreground, the app is displaying photos (which requires intensive computation to decompress JPG/RAW)
- in the background, the app needs to walk your hard drive and do various tasks.
The background work is ideally suited to TBB: splitting work into small tasks is easy, and organising them in "for" loops, pipelines and the like won't be a problem.
Now, what (I think) I need to do, when the UI needs a photo decompressed, is to pause the current flow of tasks, run the decompression job as a task, and then resume what was previously going on.
My guess is that I might be able to do this at the TBB scheduler level, somewhere in the logic where a thread decides where to go next when it is done with a task.
Ok, so here are my questions:
- has anybody faced a similar case before?
- does what I write above make any sense?
- does anybody have any suggestions?
Thanks in advance
TBB tasks are non-preemptive, i.e. once a task has started executing, it will run to completion before the thread running it can start another task.
The way to set a sort of priority on a TBB task is to change its depth. There are a couple of methods in class task to do that. Increasing a task's depth makes it be taken for execution earlier, but it also makes it (and its descendants) be considered for stealing later than tasks with smaller depth. So it might not work ideally for you.
In the recent developer updates, there is a prototype of new functionality that allows cancelling execution of a particular group of tasks. Soon we will release an updated version of it that should be closer to the final one. If your background tasks can be organized so that their execution can be safely cancelled and a new invocation of those tasks can then resume from where it stopped, you might consider this technique. In case of more questions about it, please ask in a couple of weeks.
I'm starting to see the problem with using depth as priority... I haven't quite got my head around task stealing yet. I guess I should dive deeper into the innards of the scheduler and then come back to this; a couple of weeks may well have passed by then ;)
I created a "background work" thread that waits on a semaphore (semTodo), runs a tbb::pipeline when there is work to do, and loops back to waiting on the semaphore, until the app is closed.
When there is something to do, I run a tbb::pipeline with two stages:
- one serial job-popping filter
- one parallel job-running filter
When the second filter finishes a job, it signals a second semaphore (semDone).
The first filter actually waits on both semTodo and semDone: when semTodo fires, it returns the job for the next stage; when semDone fires, it checks whether the pipeline is empty, returning NULL if so (thus closing the pipeline). [note: of course, before the pipeline is run, semTodo is signalled again, otherwise we'd lose count of one job]
I have set the pipeline's in-flight-count parameter to a relatively high value (32), so as to minimize time spent in the first stage (when there is a lot to do).
It works fine and exhibits the following:
- when there is nothing to do, the thread sleeps outside of TBB (i.e. TBB is idle, and totally available to the main app thread)
- when there is work to do, it is spread over all cores, very little time being wasted in the first stage
- any job request sent while pipeline is active will be honoured as soon as possible
- this plays well even if there is just one core
When other threads are using TBB too, I have the impression that things get done efficiently and that the contention involved is pretty reasonable for a UI with background tasks running.
I'm not sure this is the best approach to do background work with TBB, but it proved a good one for me.
It sounds like a perfectly reasonable approach, and similar to ideas I've seen before to activate TBB only when there's work to do to maximize the advantage of spin locks without leaving things spinning when there's no work to do.
Have you collected any data on scalability?
I reckon that things scale well to 4 cores, as the app felt noticeably more responsive when I added the TBB code above. I hear it scales less well to 8 cores, and that doesn't surprise me all that much: photos have to come from the disk, and that's always going to be my main bottleneck...
Perhaps I was overlooking some details, but I thought I understood J. Muffat's explanation. At the core is the use of a pair of filters: the serial input filter gets called when the "background work" thread initiates the pipeline and passes the lump of decompression work along to the parallel filter, probably as a continuation task. While that thread occupies itself with work, the TBB task scheduler dispatches another pool thread to the serial input task. If semTodo indicates the availability of more work, the new thread repeats the process, stacking up parallel threads doing decompression work. If at the dispatch of any input task it should discover nothing waiting at semTodo, it could wait for semDone, or even spin, alternately checking both as suggested in the text above. Checking both would provide a bit of hysteresis for covering bursty incoming work. When both are empty, the current input filter returns NULL and the pipeline shuts down.
There may be some additional details to bulletproof the process. I'd imagine semDone to be more of an atomic reference count than a full-blown semaphore, probably incremented by the input filter and decremented by the processing filter. And there may be some additional details to ensure the background work thread doesn't interfere with semTodo while the sequence of input filters is active, but I do not see any holes in the basic process. Perhaps I'm missing something as well?
The pipeline should be closed immediately when input is exhausted (that overhead is less significant than sacrificing a physical thread to waiting), notifying the background thread of the fact so that it can spawn a new pipeline when new input arrives (assuming this really is suitable work for a pipeline and not a misuse of it; otherwise parallel_while/parallel_do would be more appropriate). In general, spawned tasks should not carry the actual work items (they would execute in the opposite order, wouldn't they?); they should keep executing until work is exhausted (to reduce overhead) and, if possible, multiply according to the work available (I guess parallel_while/parallel_do can take care of that). So far so good, but they should also never wait for more work to arrive. Maybe real-time applications would be willing to sacrifice a physical thread for lower latency, I don't know, but it seems like a high price to pay and should be well considered, and I don't even know whether this would be a suitable context for TBB?