Intel TBB pipeline vs. producer/consumer pipeline with blocking I/O
Hi, I've just discovered Intel TBB. My application needs a processing pipeline composed of filters, which looks a lot like the pipeline feature of the TBB library. At one end of the pipeline I have a feeder that blocks waiting on I/O. At the other end I have another blocking I/O stage, which sends the pipeline's results over the network. I was first thinking of a producer/consumer architecture: one thread for input I/O, a queue, n threads for computing, n queues, and one thread for output I/O. The idea is to avoid being limited too much by I/O performance (typically there will be a minimum of 3 threads and a maximum of ~5). With TBB the concept is the same but, if I understood correctly, only one buffer is managed through the pipeline. So it would not be efficient in my case, would it? (In fact I would only use serial filters...) Is there perhaps a better way to use the TBB library for my case? (A mix of pthreads + the Intel concurrent queue?)
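For reference, the queue-based design described above could look roughly like the sketch below. It is a minimal illustration using only the C++ standard library, not TBB. `BoundedQueue` and `run_three_stage` are hypothetical names, the "I/O" is replaced by reading from and writing to vectors, and there is a single compute thread so that output order is trivially preserved.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking queue with a fixed capacity.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full.
    void push(T value) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(value));
        not_empty_.notify_one();
    }

    // Blocks while the queue is empty; returns false once closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();
        return true;
    }

    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
    std::deque<T> items_;
    std::size_t capacity_;
    bool closed_ = false;
};

// Input thread -> queue -> compute thread -> queue -> output thread.
std::vector<int> run_three_stage(const std::vector<int>& input) {
    BoundedQueue<int> to_compute(4), to_output(4);
    std::vector<int> sent;  // stand-in for the network link

    std::thread reader([&] {
        for (int v : input) to_compute.push(v);  // stand-in for blocking input
        to_compute.close();
    });
    std::thread worker([&] {
        int v;
        while (to_compute.pop(v)) to_output.push(v * 2);  // stand-in for a filter
        to_output.close();
    });
    std::thread writer([&] {
        int v;
        while (to_output.pop(v)) sent.push_back(v);
    });

    reader.join();
    worker.join();
    writer.join();
    return sent;
}
```

Calling `run_three_stage({1, 2, 3})` pushes each value through both queues. With more than one compute thread, the output stage would need some way to restore the original order.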
Your approach to the use of Intel TBB pipelines is perfect. In fact, I've been writing about a similar application in a series of blog articles that begins with this one. To control the number of buffers, TBB provides a token count that is supplied as a parameter to the pipeline.run() call. It is up to your code to actually manage the buffers, but the token count controls how many items are in flight in the pipeline at the same time (i.e., how many times the serial input filter will be called before any of the returned buffers are passed to other stages of the pipeline).
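To make the effect of the token count concrete, here is a rough model written with the C++ standard library only. It is not how TBB implements tokens, and `TokenPool` and `peak_in_flight` are hypothetical names. The input stage must take a token before it starts a new item, and the last stage gives the token back, so the number of items in flight never exceeds the token count. The sketch also starts one thread per item for simplicity, which a real pipeline would not do.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counting pool: each token is one slot in the pipeline.
class TokenPool {
public:
    explicit TokenPool(std::size_t tokens) : available_(tokens) {
        assert(tokens > 0);
    }

    // Blocks until a token is free, then takes it.
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return available_ > 0; });
        --available_;
    }

    // Returns a token to the pool and wakes one waiter.
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++available_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t available_;
};

// Pushes `items` work items through with at most `tokens` in flight and
// returns the largest number of items observed in flight at once.
std::size_t peak_in_flight(std::size_t items, std::size_t tokens) {
    TokenPool pool(tokens);
    std::atomic<std::size_t> in_flight{0};
    std::atomic<std::size_t> peak{0};
    std::vector<std::thread> workers;

    for (std::size_t i = 0; i < items; ++i) {
        pool.acquire();  // the input stage waits here when all tokens are in use
        std::size_t now = ++in_flight;
        std::size_t prev = peak.load();
        while (prev < now && !peak.compare_exchange_weak(prev, now)) {
        }
        workers.emplace_back([&] {
            // Stand-in for the work done by the later stages.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            --in_flight;
            pool.release();  // the last stage hands the token back
        });
    }
    for (auto& w : workers) w.join();
    return peak.load();
}
```

For example, `peak_in_flight(20, 4)` never reports more than 4, however the threads are scheduled. A larger token count allows more overlap between the I/O stages and the compute stages, at the cost of more buffers alive at once.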
Thank you Robert, I better understand how the TBB pipeline works, thanks to your blog pictures. So in fact there is only one buffer per pipeline, but when a buffer has been processed a new token can be created, and then a new thread with a new buffer will do the next job. Clever. But what I don't see is how all the data is handled at the end... For example, say I have a final filter that sends the result over a TCP/IP link, and two pipelines, 0 and 1, working in parallel, where 0 holds the very first data of my processing.
How can I be sure that pipeline 0 will send its result before pipeline 1 (which was instantiated later but can finish earlier)?
Does the TBB library handle this? If so, that's cool, and quite an interesting way of computing because, as you said in your blog: "pipeline data represents a greater weight of memory movement costs than the code needed to process it". So now I understand why my first approach, a producer/consumer scheme with lots of queues, can be inefficient if a lot of memory is processed...
Thanks a lot :) I think I'll use TBB after all ;)
Well, the diagrams were meant to represent one pipeline with multiple tokens flowing down it, so perhaps my diagram needs a little more work to be clear. Multiple buffers can be moving down the pipe simultaneously (or should I say the pipeline is moving past the buffers?), just as there are multiple tokens to represent available slots in the pipeline. Each stage (filter) of the pipeline can be declared as parallel or serial. Parallel filters may get tokens (buffers) in arbitrary order; serial filters get tokens (buffers) in canonical sequential order. Using a serial filter as the last pipeline stage should therefore avoid the reordering problem you describe. You will still need to deal with the buffers. The most obvious solution would be to queue them for reuse at the head of the pipeline.
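To illustrate what the in-order guarantee of a serial final stage amounts to, here is a small standard-library model. It only shows the observable effect, it is not TBB's implementation, and `InOrderSink` and `demo_out_of_order` are hypothetical names. Each result carries the sequence number it was given at the input stage, and results that arrive early are held back until every earlier one has been sent.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical final stage that emits results strictly in sequence order.
class InOrderSink {
public:
    // Accepts a finished result; sends it and any held-back successors
    // as soon as all earlier sequence numbers have been sent.
    void submit(std::size_t seq, std::string payload) {
        assert(seq >= next_);  // a sequence number is never submitted twice
        pending_.emplace(seq, std::move(payload));
        auto it = pending_.find(next_);
        while (it != pending_.end()) {
            sent_.push_back(std::move(it->second));  // stand-in for the TCP send
            pending_.erase(it);
            ++next_;
            it = pending_.find(next_);
        }
    }

    const std::vector<std::string>& sent() const { return sent_; }

private:
    std::size_t next_ = 0;                        // next sequence number to send
    std::map<std::size_t, std::string> pending_;  // results that arrived early
    std::vector<std::string> sent_;
};

// Results 1 and 2 finish before result 0, yet they are sent in order 0, 1, 2.
std::vector<std::string> demo_out_of_order() {
    InOrderSink sink;
    sink.submit(1, "second");
    sink.submit(2, "third");
    sink.submit(0, "first");
    return sink.sent();
}
```

With a serial last filter you should not need to write anything like this yourself; the model is only meant to show why a later item that finishes early cannot overtake an earlier one.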