Intel® oneAPI Threading Building Blocks

arena_in_need() spinlocking and hogging 100% CPU

New Contributor I
I have a process running at 100% CPU on 24 cores. Dumping the process reveals that 23 TBB threads are spinning on a lock inside arena_in_need():

arena* arena_in_need () {
    spin_mutex::scoped_lock lock(my_arenas_list_mutex);
    return arena_in_need(my_arenas, my_next_arena);
}
Typical callstack:
1 tbb.dll!__TBB_machine_cmpswp1()
2 tbb.dll!tbb::internal::market::arena_in_need()
3 tbb.dll!tbb::internal::market::process(rml::job & j={...})
4 tbb.dll!tbb::internal::rml::private_worker::run()
5 tbb.dll!tbb::internal::rml::private_worker::thread_routine(void * arg=0x000000001f1be7d0)
6 tbb.dll!_callthreadstartex()
7 tbb.dll!_threadstartex(void * ptd=0x0000000000000000)
No other thread is running TBB code or tasks.
It may be relevant that we set task priorities through the optional task_group_context parameter.
Any ideas as to why it does this, and suggestions on how to fix it?
NOTE: we are using TBB 4.0 U2 (OSS278 of last December).
New Contributor I
Yes, it can be related to task priorities.
Could you provide us a reproducer or sketch the structure of your code here?
Do you use task::enqueue (with priorities)?
When you say 23 threads are busy in arena_in_need(), what is the 24th thread doing? Is it a master thread? How did it submit the work to TBB?
New Contributor I
I spent a few hours trying to create a test case that reproduces this problem, without success.
We are running a mix of parallel_for_each, parallel_for, parallel_sort and pipelines, either with high priority set on the task_group_context or in normal mode with the default context.
As for the 24th thread, I might be mistaken, but doesn't TBB create only P-1 worker threads, where P is the number of logical processors? As I mentioned in my original post, no code was running TBB algorithms at the time of the core dump.
Looking at the code, could it be that the list of arenas gets very long, so that iterating through it in arena_in_need() takes longer than expected while the spin mutex is held? I currently only have a minidump of the process that had the problem, so I can't check whether that is the case.
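To make the question above concrete, here is a minimal sketch of a round-robin search over an arena list. It is not TBB's actual arena_in_need() implementation: the Arena struct, the unmet_demand field, the pick_arena function and the visited counter are all hypothetical names chosen for illustration. The only point is that if such a search runs while a spin mutex is held, the hold time grows with the number of arenas examined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an arena; not the real tbb::internal::arena.
struct Arena {
    int unmet_demand;  // assumed: > 0 means the arena could use another worker
};

// Scans the list round-robin, starting at `next`, for an arena with unmet
// demand. Returns its index, or -1 if none is found. `visited` reports how
// many arenas were examined, which stands in for how long a lock guarding
// the list would have been held.
int pick_arena(const std::vector<Arena>& arenas, std::size_t& next,
               std::size_t& visited) {
    visited = 0;
    const std::size_t n = arenas.size();
    if (n == 0) return -1;
    assert(next < n);
    for (std::size_t step = 0; step < n; ++step) {
        const std::size_t i = (next + step) % n;
        ++visited;
        if (arenas[i].unmet_demand > 0) {
            next = (i + 1) % n;  // resume after the chosen arena next time
            return static_cast<int>(i);
        }
    }
    return -1;  // every arena was examined and none needed a worker
}
```

In this sketch the worst case is the idle one: when no arena has work, every call examines the whole list before giving up, so idle workers that keep retrying would each pay the full scan under the lock.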
New Contributor I
Is it possible that too many threads fighting for that mutex would make them spin uselessly?
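One way to get a feel for this is to count failed lock attempts in a small standalone experiment. The sketch below uses only the C++ standard library; SpinLock, ContentionResult and run_contention are hypothetical names, and the lock is a plain test-and-set loop, not tbb::spin_mutex. It also calls std::this_thread::yield() between attempts so that it stays well behaved on small machines, which a pure spin loop would not do.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical test-and-set spin lock, used only for this experiment.
class SpinLock {
public:
    // Retries until the flag is acquired; returns the number of failed attempts.
    std::uint64_t lock() {
        std::uint64_t failed = 0;
        while (flag_.test_and_set(std::memory_order_acquire)) {
            ++failed;
            std::this_thread::yield();  // courtesy yield; a pure spin would burn CPU here
        }
        return failed;
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

struct ContentionResult {
    std::uint64_t acquisitions;     // successful lock/unlock cycles
    std::uint64_t failed_attempts;  // attempts that found the lock already taken
};

// Starts `threads` threads that each take the lock `iterations` times.
ContentionResult run_contention(unsigned threads, unsigned iterations) {
    assert(threads > 0);
    SpinLock lock;
    std::uint64_t acquisitions = 0;  // protected by `lock`
    std::atomic<std::uint64_t> failed{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            std::uint64_t local_failed = 0;
            for (unsigned i = 0; i < iterations; ++i) {
                local_failed += lock.lock();
                ++acquisitions;
                lock.unlock();
            }
            failed.fetch_add(local_failed, std::memory_order_relaxed);
        });
    }
    for (auto& th : pool) th.join();
    return {acquisitions, failed.load()};
}
```

Calling run_contention(23, 100000) and comparing failed_attempts against acquisitions gives a rough measure of how much work is wasted on retries. The numbers depend heavily on the machine and scheduler, so they only suggest a trend and say nothing definite about TBB's own mutex.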
New Contributor I
I now have a full user dump of a process showing the issue.
I could certainly run some commands in WinDbg and send you the results if that would help diagnose the issue.
Many thanks