The multi-threaded program uses only TBB to implement its parallelism.
Suppose it will be running on a four-core CPU. It involves a large amount of low-speed I/O and some heavy computation. How can we assign one core (any one of the four) to perform all the I/O and the remaining cores to do the computation?
If you are absolutely sure this is impossible, please confirm that it is impossible. Otherwise, please explain in detail how to do it, and describe which component of TBB, such as the pipeline or task objects, should be used to solve this problem.
Thank you very much for your help
Intel Threading Building Blocks takes a specific approach to applying parallelism to application code: as far as possible, it abstracts the notion of processors into a generic resource, a pool whose size may vary from execution to execution and grow over time. That pool is better left to be scheduled dynamically, depending on the resources available and the moment-by-moment parallelism exposed in the application. Placing constraints such as designing for a specific number of cores, or requiring that one of those cores does all the I/O, can seriously limit Intel TBB's ability to do that dynamic scheduling and limit the scaling your application might be able to achieve.
Consider, for example, the Intel TBB goal to maximize local processing to make best use of data already in cache. Going to a single processing element for all the I/O guarantees that the computational PEs will have to pull data into their local caches from memory, whereas in an organization where each PE participates in I/O for computation that it might do, there's a chance some of those data may still be in a local cache (depending on how the I/O is managed and executed) and be available for computation faster than if they had to be pulled in from memory.
The parameters for the task you've outlined above are sketchy enough that it's hard to be more specific about a design. Is the I/O one input and one output file? Or multiple files? Is the I/O simple streaming through data, or is there some random access I/O required on one or more open files? Are the data ordered (requiring, for example, First In-First Out sequential processing) or can they be processed in any order? Is there some hardware reason why all I/O should be done by a single core or is that just an expectation of the current design?
On computational processing of streaming I/O, TBB has demonstrated good scaling in several applications, using among other things constructs such as the pipeline. Some results I blogged about last year show very good scaling when each PE repeatedly reads, processes and writes its own portion in sequence. So while it may be difficult to squeeze TBB into the shape you have in mind for your current design, starting from TBB's strengths may lead you to a different, more efficient and more scalable design.
While there are ways in TBB to approach the design you possibly have in mind (with one thread on a single core doing all I/O and other threads doing computation), I agree with Robert that such a design should be considered a last resort, to be used only if no better (more scalable and hardware-agnostic) design can be found.
If you provided just a high-level view of the problem and how you approach it, someone might be able to help with design ideas. Even suggesting how to do the separate I/O thread right would require some additional information, such as how heavily you expect this thread to load the core it runs on, in order to understand whether it makes sense to share this core with a computational thread.
Yes, exchanging data items via concurrent_queue, as Raf suggested, is one possible design if you believe you need a separate thread. On the other hand, tbb::pipeline used as outlined by Robert in his blog might be an alternative, unless your I/O has some context associated with a single thread. The I/O thread can either be borrowed from the TBB pool by spawning a long-running task, as Raf suggested, or it can be explicitly created. For the latter, TBB now provides a tbb_thread wrapper class whose interface follows std::thread, proposed for C++0x, as closely as is possible and practical at the moment (we would not even have bothered with this class if std::thread were ubiquitous).
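To make the "separate I/O thread plus queue" idea a bit more concrete, here is a minimal sketch using only the C++ standard library. It is not TBB code: std::thread stands in for tbb_thread, and the small BlockingQueue class is a hypothetical stand-in for concurrent_queue. The names process_block and run_with_io_thread, and the fabricated "input" data, are also made up for illustration. Note that nothing here pins any thread to a core; placement is left to the O.S.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <numeric>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a concurrent queue: a minimal blocking queue.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }
    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item is available; returns false once closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Hypothetical "heavy computation": sum of squares of one block.
long long process_block(const std::vector<int>& block) {
    long long sum = 0;
    for (int v : block) sum += static_cast<long long>(v) * v;
    return sum;
}

// One thread plays the I/O role and feeds blocks to the compute workers.
long long run_with_io_thread(int block_count, int block_size, unsigned workers) {
    assert(workers > 0);
    BlockingQueue<std::vector<int>> input;
    std::mutex total_mutex;
    long long total = 0;

    // I/O thread: it only fabricates blocks here; real code would read a slow device.
    std::thread io([&] {
        for (int b = 0; b < block_count; ++b) {
            std::vector<int> block(static_cast<std::size_t>(block_size));
            std::iota(block.begin(), block.end(), b * block_size);
            input.push(std::move(block));
        }
        input.close();
    });

    // Compute workers: each drains the queue and accumulates a local result.
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            std::vector<int> block;
            long long local = 0;
            while (input.pop(block)) local += process_block(block);
            std::lock_guard<std::mutex> lock(total_mutex);
            total += local;
        });
    }
    io.join();
    for (auto& t : pool) t.join();
    return total;
}
```

On a four-core machine one might call run_with_io_thread(block_count, block_size, 3), leaving the fourth core free for the I/O thread most of the time. Whether that beats a pipeline where every thread does its own reads and writes depends on the answers to Robert's questions above.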
Raf_Schietekat: The trick is to initialise the task scheduler with one more thread than is physically available, for which there is no TBB-pure interface (you should hard-code it or get the number in an O.S.-specific way).
Raf_Schietekat: Leave it to the TBB and O.S. schedulers to decide what executes where (maybe the I/O code will stick to a specific core, maybe it will get juggled around a bit, but it probably doesn't really matter).
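As a rough illustration of the "one more thread than is physically available" count, here is a small sketch. It assumes a compiler that provides std::thread::hardware_concurrency() (standard C++11, so not something the posts above could rely on), and the function names and the fallback value of 4 are hypothetical. The returned number is what you would then hand to the task scheduler when initialising it; that call is left out here because it depends on the TBB version in use.

```cpp
#include <cassert>
#include <thread>

// Returns the number of hardware threads plus one extra slot for an I/O task
// that is expected to spend most of its time blocked.
// `detected` is the value reported by std::thread::hardware_concurrency(),
// where 0 means "unknown"; `fallback` is a hypothetical default for that case.
int oversubscribed_thread_count(unsigned detected, unsigned fallback = 4) {
    assert(fallback > 0);
    unsigned cores = detected != 0 ? detected : fallback;
    return static_cast<int>(cores) + 1;
}

// Convenience wrapper that queries the machine the program is running on.
int oversubscribed_thread_count_for_this_machine() {
    return oversubscribed_thread_count(std::thread::hardware_concurrency());
}
```

On the four-core machine from the question this yields 5: four threads for computation and one that mostly waits on I/O.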