I have two applications which use TBB for parallelism. For security and stability reasons these two applications need to run in separate processes. Both are quite CPU intensive.
The problem is that if I just let TBB initialise with the default thread count in each process, it will oversubscribe my system. One strategy I have tried is to set a fixed number of TBB threads for each process (e.g. 3 for one and 5 for the other). This works reasonably well. However, it is suboptimal because the workload balance between the two processes is not constant.
What kind of strategies can I use to improve on this? E.g. is there a way I can measure the TBB workload and dynamically shift threads between the processes using some kind of inter-process communication?
If you can periodically pop out of all parallel regions (back to the outermost serial loop) in both programs, then you can use any inter-process communication mechanism (e.g. shared file, shared library global variable, ...) to indicate the number of such processes. Then re-init your TBB thread pool with an appropriate number of threads.
If you cannot do this, then at some convenient point you can schedule an independent TBB task that sleeps for some short period of time (giving up its CPU time). Launch only as many such tasks as the number of threads you wish to yield to the other process. Note, do not use sched_yield as you may not get the effect you want.
Both are cooperative schemes.
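To make the first scheme concrete, here is a minimal sketch of how each process might work out its thread count once it knows how many cooperating processes are active. The names thread_budget and detected_hw_threads are hypothetical, and the even split is only an assumption; how the process count is shared (file, shared memory, ...) is left out. The result would then be handed to whatever thread-limit mechanism your TBB version provides when you re-init the pool at the serial point.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: split the hardware threads evenly among the
// cooperating processes. Always returns at least 1 so no process starves.
// An active_processes value of 0 is treated as 1 (only this process).
unsigned thread_budget(unsigned hw_threads, unsigned active_processes) {
    const unsigned procs = std::max(1u, active_processes);
    return std::max(1u, hw_threads / procs);
}

// hardware_concurrency() may return 0 when the count is unknown,
// so fall back to 1 in that case.
unsigned detected_hw_threads() {
    return std::max(1u, std::thread::hardware_concurrency());
}
```

For example, on an 8-thread machine with two active processes each would get thread_budget(8, 2), i.e. 4 threads. An uneven split (weighting by measured load) would need a different formula; this sketch does not attempt that.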
It would be nice to be able to dynamically increment/decrement the number of threads TBB uses.
However, those suggestions will probably work for us. The next question, though, is how to decide the thread balance between processes. Any ideas for heuristics we can use to detect when it is time to get or release threads from/to a process?
A simple technique may be all that is necessary.
In the outer loop of the program and/or wherever you might want to permit the (effective) thread pool to change sizes, you could periodically spawn a task that does something like: (pseudo code)
    t0 = yourFavoriteTimer();
    Sleep(0); // not sched_yield
    t1 = yourFavoriteTimer();
    if (t1 - t0 > aThreshold)
        Sleep(yourYieldTime); // competing for CPU time, wait a bit
That is just a start. Each app may have different thresholds and yield times (or lack thereof). The above, though, should get you started.
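A portable variant of the pseudo code above might look like the following sketch. It is not a drop-in equivalent: Sleep(0) is a Windows call, so this version assumes a short non-zero probe sleep via std::this_thread::sleep_for and measures how far the wake-up overshoots the requested time. The function name back_off_if_contended and all three parameters are hypothetical, and suitable values would have to be tuned per application. The body would run inside the task you spawn periodically.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;
using Micros = std::chrono::microseconds;

// Hypothetical helper: sleep for a short probe interval and measure how
// late the wake-up is. A large overshoot is taken as a sign that other
// work is competing for the CPU, in which case this thread backs off
// for yield_time. Returns true if it backed off.
bool back_off_if_contended(Micros probe, Micros threshold, Micros yield_time) {
    assert(probe.count() >= 0 && yield_time.count() >= 0);

    const auto t0 = Clock::now();
    std::this_thread::sleep_for(probe);   // assumption: stands in for Sleep(0)
    const auto t1 = Clock::now();

    // Time spent beyond the requested probe interval.
    const auto overshoot = std::chrono::duration_cast<Micros>(t1 - t0) - probe;

    if (overshoot > threshold) {
        std::this_thread::sleep_for(yield_time);  // competing for CPU time, wait a bit
        return true;
    }
    return false;
}
```

A call such as back_off_if_contended(Micros(100), Micros(500), Micros(5000)) would probe for 100 microseconds and give up 5 ms if the wake-up was more than 500 microseconds late. Those numbers are placeholders only.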