Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

TBB: Using task_scheduler_observer to set worker thread's OS scheduling priority

Tim_Day
Beginner

I'm looking at TBB's task_arena and task_scheduler_observer.

The documentation for task_scheduler_observer sketches out a nice example of it being used to set thread affinity on worker threads to lock an arena's threads onto a subset of cores.

I'm curious whether this class and a similar pattern could practically be used to set the OS scheduling priority for an arena.  On my N-core hardware, I'd like to create one arena with N normal worker threads and another arena with N threads at a lower OS scheduling priority.  The catch with scheduler priority is that generally you can only lower it (unless running as root, but assume not), and it's not clear to me to what extent TBB worker threads move between arenas, which would defeat the object of keeping all the low-priority threads in one arena.  The task_scheduler_observer docs mention returning false from on_scheduler_leaving() to keep a thread in an arena, but they also mention that it may not be called if a thread is moved, e.g. for rebalancing.  On the other hand, in the affinity example, threads with an affinity mask set migrating out of the pinned arena into the general pool would seem undesirable too, yet the docs don't mention it as an issue, so maybe it's not a problem in practice for some reason?

BTW I've used two thread pools (homegrown ones) with different OS scheduler priorities in interactive applications before and found it useful for separating and prioritizing the tasks which are crucial to maintaining application responsiveness from those which aren't.

Tim_Day
Beginner

Been playing around with this a bit.

The observer's on_scheduler_leaving() turned out to be a bit of a distraction as it only had a brief window of existence and was removed in a recent TBB update... so there's no way of locking a thread into an arena.

However, it does seem entirely possible to control the OS scheduling priority of an arena's workers, provided you take care to lower it in the observer's on_scheduler_entry and restore it in on_scheduler_exit.  On Linux that means having a process which can adjust RLIMIT_NICE.  That doesn't necessarily mean running as root, which is what I've been doing; apparently using setcap to enable CAP_SYS_RESOURCE on an executable will do the same job.  I believe Windows also needs some shenanigans before priority can be increased.
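A minimal sketch of the Linux side of that lower-on-entry, restore-on-exit pattern, assuming nice values are applied per thread via setpriority(). The names current_tid, set_thread_nice, get_thread_nice and background_priority_hooks are hypothetical and not part of TBB; in real use the two hooks would presumably be called from the on_scheduler_entry and on_scheduler_exit overrides of a class derived from tbb::task_scheduler_observer that observes the background arena.

```cpp
#include <sys/resource.h>   // setpriority, getpriority, PRIO_PROCESS
#include <sys/syscall.h>    // SYS_gettid
#include <unistd.h>         // syscall
#include <cassert>
#include <cerrno>

// Linux-specific assumption: a nice value applies to a single thread, and the
// thread is addressed by its kernel thread id (not its pthread_t).
inline id_t current_tid() {
    return static_cast<id_t>(syscall(SYS_gettid));
}

// Sets the calling thread's nice value; returns true on success.
// Lowering priority (a higher nice value) is always allowed. Raising it again
// fails unless the process is permitted to do so (see RLIMIT_NICE).
inline bool set_thread_nice(int nice_value) {
    assert(nice_value >= -20 && nice_value <= 19);
    return setpriority(PRIO_PROCESS, current_tid(), nice_value) == 0;
}

// Reads the calling thread's nice value. -1 is a legal result, so errno is
// cleared first to let callers tell a real error apart.
inline int get_thread_nice() {
    errno = 0;
    return getpriority(PRIO_PROCESS, current_tid());
}

// Hypothetical hooks, meant to be called from an observer's
// on_scheduler_entry / on_scheduler_exit for the background arena.
struct background_priority_hooks {
    int background_nice = 19;  // assumed target for background workers

    // Remember the thread's current nice value, then lower its priority.
    void on_entry() const {
        saved_nice() = get_thread_nice();
        set_thread_nice(background_nice);
    }

    // Restore the saved value; returns false if the OS refuses the raise.
    bool on_exit() const {
        return set_thread_nice(saved_nice());
    }

private:
    // One saved value per thread, since each thread enters and leaves
    // the arena independently.
    static int& saved_nice() {
        thread_local int value = 0;
        return value;
    }
};
```

The restore step is the part that needs the extra privilege: without it, on_exit() returns false and the thread stays at the lowered priority when it goes back to the shared pool.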

Probably the biggest practical issue I see is that if you init TBB for 2xN threads (N = number of cores) and create an N-thread "background" arena (OS scheduler priority lowered), you'll find stuff running on 2N threads when the background arena isn't in use.  So you end up creating another N-thread arena for the foreground (normal OS scheduling priority) and have to remember to execute everything else in it.  Apart from that it all behaves as expected: when there are tasks for the foreground, the background ones get out of their way, and when only one of foreground or background is active, it gets all the cores without oversubscription.

Anton_M_Intel
Employee

Thanks for sharing the use-case! Your findings are good.

Let me suggest an approach that avoids an explicit arena for the foreground:

#include <cstdio>
#include <tbb/parallel_for.h>
#include <tbb/task_arena.h>
#include <tbb/task_scheduler_init.h>

int main() {
    int P = tbb::task_scheduler_init::default_num_threads();
    tbb::task_scheduler_init tbb_scope(2*P);  // must be the 1st access to TBB
    tbb::task_arena background_arena(P);      // background (low-priority) arena, 1 master
    background_arena.initialize();            // forces arena allocation
    tbb_scope.terminate();                    // removes the implicit arena of unwanted size
    // but keeps the oversubscription due to the additional reference from task_arena
    tbb_scope.initialize(P);                  // the arena for normal work
    // .. add observer or so
    // Keep the rest of main() code intact, e.g.
    tbb::parallel_for(1, 100, [](int){ std::puts("Hello world"); });
}

 

Tim_Day
Beginner

Thanks Anton, that's a very useful tip!  The code will be much cleaner if TBB can continue to be used as normal for work that isn't deliberately "background"-ed. My assumption until I saw the above was that the terminate() would have cleaned up all the TBB worker threads; I didn't appreciate that the ones in the arena were "protected".

Anton_M_Intel
Employee

Tim Day wrote:
 My assumption until I saw the above was that the terminate() would have cleaned up all the TBB worker threads; I didn't appreciate the ones in the arena were "protected". 

It's not that threads belong to an arena. There is a single shared thread pool in TBB (the "Market"), and it is protected by a reference counter. So once you have created the second arena, it is safe to destroy the first one without losing the desired properties of the thread pool, because the other arena still holds a reference to it.

Tim_Day
Beginner

(Sorry, just realized now there is actually a completely separate forum for TBB).

Anyway, been playing with this "background" arena of low priority worker threads some more...

On Linux it works exactly as I'd expect it to, even running the low-priority workers at "nice 19".  The background tasks pretty much get out of the way when there are better things to do, and run freely when there's nothing in the normal-priority "implicit" arena.

On Windows it generally behaves well, except that sometimes (quite infrequently) the long-running background tasks, and only the low-priority ones, seem to be "shut out".  These tasks are signalled to stop from another thread (just by a bool flag), but they don't execute the test in their loop that tells them to finish until several seconds (highly variable) after they'd have been expected to notice it.  This happens even though the same "stop now" abort flag also stops all the normal-priority threads, so there shouldn't be other tasks executing.  I need to dig a bit more before I can show a minimal reproducing example, but my current guess is that on Windows something is spinning at normal priority and blocking the low-priority threads from running.  (OK, I don't have nearly as much experience with the Windows priority system as I do with Unix/Linux, so I may just be expecting too much from it.  The issue seems to occur with both THREAD_PRIORITY_LOWEST and THREAD_MODE_BACKGROUND_BEGIN/END on the low-priority arena threads.)
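For reference, a minimal sketch of the stop-flag loop described above, assuming the flag is a std::atomic<bool> (the post doesn't say how the bool is declared). With a plain bool shared between threads, the compiler is allowed to hoist the read out of the loop, so that is worth ruling out before blaming the scheduler; this is not a claim that it explains the delay seen here. run_background_work, stop_flag_example and max_steps are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical long-running background loop: performs up to max_steps units
// of work and polls the shared stop flag before each one.
inline std::size_t run_background_work(const std::atomic<bool>& stop,
                                       std::size_t max_steps) {
    std::size_t done = 0;
    while (done < max_steps && !stop.load(std::memory_order_acquire)) {
        ++done;  // stand-in for one unit of real work
    }
    return done;
}

// Usage example; the "other thread" setting the flag is simulated inline.
inline void stop_flag_example() {
    std::atomic<bool> stop{false};
    assert(run_background_work(stop, 5) == 5);  // flag clear: all steps run
    stop.store(true, std::memory_order_release);
    assert(run_background_work(stop, 5) == 0);  // flag set: exits before any work
}
```

The atomic only guarantees that the store becomes visible to the loop; how soon a starved low-priority thread gets scheduled to perform the check is still up to the OS.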

 
