1. Is there a penalty for using Boost shared pointers with the scalable allocator? I know that shared pointers require an extra level of indirection, but I am curious whether the atomic ref count in shared_ptr has an impact. I am using this in a pipeline, by the way.
2. Is there anything similar to the Boost threadpool library in the TBB world (see http://threadpool.sourceforge.net/)? I have used this library with TBB without any problems, but am not sure if it is the fastest solution. Would it be better to reimplement this with TBB?
There is no special penalty for exactly this combination... probably you meant to ask a different question...
Nope. Storing a boost::shared_ptr is exactly the same as storing a raw pointer.
However, boost::shared_ptr requires an additional memory allocation for the counter, and that is a problem. If you need performance, you should consider switching to boost::intrusive_ptr; shared_ptr is a blunt tool, best suited to prototyping.
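To illustrate the intrusive alternative mentioned above: the object carries its own reference count, so no separate counter allocation is needed. This is the idea behind boost::intrusive_ptr, hand-rolled here as a minimal sketch with std::atomic (the names RefCounted, intrusive_add_ref, and intrusive_release are my own, not Boost's API):

```cpp
#include <atomic>

// The object embeds its own reference count, so creating it costs a
// single allocation; no separate control block is required.
struct RefCounted {
    std::atomic<int> refs{0};
    virtual ~RefCounted() = default;
};

inline void intrusive_add_ref(RefCounted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

// Returns true when the last reference was dropped and the object deleted.
inline bool intrusive_release(RefCounted* p) {
    if (p->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete p;
        return true;
    }
    return false;
}
```

boost::intrusive_ptr wraps exactly this pattern: it calls user-provided intrusive_ptr_add_ref/intrusive_ptr_release free functions, so the smart pointer itself is the size of a raw pointer.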
The atomic ref count can potentially have a huge performance impact (basically total destruction of performance and scalability), or it can have no impact at all. It depends on how often the count is modified and how many threads share the same counter.
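The contention point above can be demonstrated without Boost; this sketch uses std::shared_ptr, whose counter is atomic in the same way (count_copies is a name invented for this example). Every copy of the pointer performs an atomic increment and decrement on one shared counter, so many threads copying the same pointer serialize on that cache line:

```cpp
#include <memory>
#include <thread>
#include <vector>

// Each thread repeatedly copies the same shared_ptr; every copy and
// destruction is an atomic operation on the single shared ref count.
long count_copies(int nthreads, int copies_per_thread) {
    auto p = std::make_shared<int>(42);  // make_shared also merges the
                                         // object and counter allocations
    std::vector<std::thread> ts;
    for (int t = 0; t < nthreads; ++t)
        ts.emplace_back([&] {
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<int> local = p;  // atomic increment
                (void)local;                     // atomic decrement on scope exit
            }
        });
    for (auto& t : ts) t.join();
    return p.use_count();  // back to 1: only the original reference remains
}
```

If instead each pipeline token owns its pointer and copies are rare, the atomic counter is touched rarely and costs next to nothing; that is the "it depends" above.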
Of course! The thread pool is one of the key components of the TBB scheduler. TBB is a thread pool, from a certain point of view.
If your tasks don't execute blocking operations, then you can use TBB as-is. Otherwise you have to set the number of threads in the TBB thread pool to something like number_of_processors * 32 or so (this requires tweaking); however, it's better to eliminate blocking from tasks.
Would it be better? It depends on the size of your tasks.
You want to instantiate your own thread pool? In TBB you use tbb::task_scheduler_init to construct an object which contains a thread pool. TBB thread pool threads are as useful for long-running or blocking tasks as any other thread; however, assigning such threads to such tasks takes them out of the mix for dealing with other scheduled tasks. Is it that you want multiple thread pools? Doing so runs the risk of overcommitment, which can lead to thrashing and other impediments to performance. If you need a couple of threads in the pool for blocking, you could always bump up the number of threads created by the task_scheduler_init, knowing that when those extra threads are not blocking, they may be competing for resources with other threads.
Can you describe your need for an explicit thread pool in more detail?
I think there is a way to do this in TBB. Use a single pipeline that has only a single stage. The stage should be parallel and process an entire file. (TBB 2.1 made some fixes so that a parallel input stage really runs in parallel.)
The input to the pipeline should be the list of files. Set the max_number_of_live_tokens parameter to the thread limit that you want. That will give you FIFO processing and limit the number of files being processed at any moment.
If the list of files is itself a serial stream, then use an initial serial stage to pop file names from the list, and feed each name into a subsequent parallel stage that processes the file.
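The "max live tokens" idea described above can be illustrated without TBB itself. This is a minimal std::thread sketch, not the TBB pipeline API: a counting-style limiter ensures at most max_tokens files are in flight, while names are taken from the list in FIFO order (TokenLimiter and process_files are names invented for this example):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Limits the number of "live tokens": acquire() blocks while all
// tokens are in use, release() returns one to the pool.
class TokenLimiter {
    std::mutex m_;
    std::condition_variable cv_;
    int tokens_;
public:
    explicit TokenLimiter(int max_tokens) : tokens_(max_tokens) {}
    void acquire() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return tokens_ > 0; });
        --tokens_;
    }
    void release() {
        { std::lock_guard<std::mutex> lk(m_); ++tokens_; }
        cv_.notify_one();
    }
};

// Pops file names in FIFO order; at most max_tokens files are being
// processed at any moment. Returns the number of files dispatched.
int process_files(const std::vector<std::string>& files, int max_tokens,
                  const std::function<void(const std::string&)>& work) {
    TokenLimiter limiter(max_tokens);
    std::vector<std::thread> ts;
    for (const auto& f : files) {
        limiter.acquire();  // blocks once max_tokens files are live
        ts.emplace_back([&, f] { work(f); limiter.release(); });
    }
    for (auto& t : ts) t.join();
    return static_cast<int>(files.size());
}
```

In real TBB code the limiter and dispatch loop are replaced by tbb::pipeline's own token accounting; the sketch only shows the throttling behavior you get from the max_number_of_live_tokens argument.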
Quoting - tbbnovice
Thanks. Let me explain what I want to do. I am working on something like the capitalize-words-in-chunk example in the book, except I have multiple files to read from (there is no need to merge the files, so the pipelines are completely independent). However, the first stage can block on I/O because it reads from a file, so I want to instantiate each pipeline in its own thread (earlier I was thinking of using a parallel_for, but because these are long-running, blocking tasks, I heard on this forum that I should not do that).
It might be contrary to earlier advice you heard (including possibly my own), but for the described task I'd first try parallel_for over the files (using the simple partitioner, which is the default, and a grain size of 1 file) for outer-level parallelism, and a pipeline at the inner level. Well, I must add: unless you really need simultaneous progress with processing each file.
Then if you see that the system is undersubscribed due to blocking I/O, try oversubscribing it a little by initializing TBB to use more worker threads. Ask for the default number of threads (there is a static method of class task_scheduler_init), add somewhat more, or possibly multiply by some factor (depending on what the typical load is expected to be; e.g. for 50% load, try multiplying by two). You will likely end up being oversubscribed for some periods (not too bad, especially if each file is processed independently of the others), possibly undersubscribed for some other periods, and just fine at lucky periods :)
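The sizing arithmetic above (e.g. multiply by two for 50% blocking load) can be sketched with the standard library; in real TBB code you would query task_scheduler_init::default_num_threads() instead of std::thread::hardware_concurrency(), and the function name here is invented for illustration:

```cpp
#include <thread>

// Rough sizing rule: if threads are expected to be blocked on I/O some
// fraction of the time, scale the worker count by 1 / (1 - fraction),
// so 0.5 (50% blocked) gives a 2x oversubscription. Tune empirically.
unsigned workers_for_blocking_fraction(double blocked_fraction) {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // hardware_concurrency may report 0 if unknown
    double factor = 1.0 / (1.0 - blocked_fraction);
    return static_cast<unsigned>(hw * factor + 0.5);
}
```

This is only a starting point, matching the "requires tweaking" caveat earlier in the thread; the right count depends on how bursty the blocking actually is.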
I think dealing with a thread pool will have essentially the same effect, but possibly with more effort. By starting every pipeline in its own thread, you effectively add one more master thread working with TBB, and that has almost the same effect as adding one more TBB worker thread, because every master will run a TBB scheduler and so potentially participate in completing the work of others. If you use a thread pool together with TBB initialized by default, you oversubscribe the system, just as if you had added more TBB workers. And so on.
I also like Arch's idea of using a pipeline at the outer level. Still, if the number of files to process can vary, as well as the amount of work in each file (in particular, if there are just a few files with uneven work in each), I'd add inner-level parallelism, i.e. process each file with another (inner) pipeline started from the parallel filter of the outer pipeline.
To answer the question about a boost-like thread pool in TBB: for the moment, there is no such thing.