Problem: We are trying to use the tbb::task_scheduler_observer class extensions (a preview feature) to pin a couple of worker threads to a specific tbb::task_arena, after changing the affinity and priority of those worker threads as they enter this dedicated real-time arena. The mechanism works well as long as we do not raise the trapped worker threads to real-time priority. But when we change their priority to real time to solve our latency issue (note that we only do so for the couple of worker threads trapped in that specific real-time arena), the system freezes. I presume the system freezes because those worker threads, now at real-time priority, are constantly trying to leave the arena via tbb::task_scheduler_observer::on_scheduler_leaving(), where we always return false as in the sample code.
Question: Is there a way to dedicate worker threads to a specific arena so that they are trapped and can never leave, without them constantly trying to leave? Like Jesse Pinkman in Breaking Bad.
Actual problem we are trying to solve here: We are currently optimizing the performance of our application to achieve outstanding real-time playback. We already leverage Intel Threading Building Blocks in various parts of the application, but in this particular case we have a latency issue that cannot be solved by the priority scheme associated with task_group_context. We need a couple of dedicated worker threads at real-time priority; otherwise they compete with the numerous other threads working in the background on what will be displayed in the future (decoding threads, processing threads, etc.). In that specific scenario, the future can wait. Those dedicated worker threads don't have much to do, but they have to react fast or the playback could drop a frame, something we don't want. We are running on high-performance dual-socket workstations. We have the power; it is the low latency we need.
Many thanks in advance,
The overall idea of isolating real-time-priority work in a dedicated task arena makes sense to me. However, trapping threads in that arena looks much less "nice", and I'm not surprised that it did not work well. The weak point is that trapping a thread inside an arena means spin-waiting, as worker threads do not sleep while in an arena. This is by design: the thread provider is external to the task scheduler and could even be a third-party component (such as MS ConcRT), and it may have its own plans for idling threads, so instead of sleeping on a semaphore, worker threads return to the provider. Trapped threads therefore busy-wait, and obviously if those threads have real-time priority the system has little chance to do anything useful.
Also, I should mention that in the latest updates we removed that "trapping" on_scheduler_leaving() callback from arena observers. Instead, we introduced a may_sleep() callback for global observers, which can prevent workers from falling asleep (more precisely, from returning to the thread provider) when there is no work. The targeted use case is preventing premature sleep (and the associated overhead and latency increase) when the application is about to submit a new portion of work very soon. It is also preview functionality, and there is no guarantee it will survive. In any case, I suspect this new callback is of no help for your use case.
We will discuss your use case internally, and get back to you soon.
Thank you so much for the excellent reply. Any advice on how we could approach this use case differently while still leveraging Intel TBB would be very much appreciated.