Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

arena_in_need() spinlocking and hogging 100% CPU

MLema2
New Contributor I
I have a process running at 100% CPU on 24 cores. Dumping the process reveals that 23 TBB threads are spinning inside arena_in_need():

arena* arena_in_need () {
    spin_mutex::scoped_lock lock(my_arenas_list_mutex);
    return arena_in_need(my_arenas, my_next_arena);
}
Typical callstack:
1 tbb.dll!__TBB_machine_cmpswp1()
2 tbb.dll!tbb::internal::market::arena_in_need()
3 tbb.dll!tbb::internal::market::process(rml::job & j={...})
4 tbb.dll!tbb::internal::rml::private_worker::run()
5 tbb.dll!tbb::internal::rml::private_worker::thread_routine(void * arg=0x000000001f1be7d0)
6 tbb.dll!_callthreadstartex()
7 tbb.dll!_threadstartex(void * ptd=0x0000000000000000)
No other thread is running TBB code or tasks.
It may be relevant that we use task priorities, set through the optional context parameter.
Any ideas as to why it does this, and suggestions on how to fix it?
NOTE: we are using TBB 4.0 U2 (OSS278 of last December).
Anton_M_Intel
Employee
Yes, it can be related to task priorities.
Could you provide us with a reproducer, or sketch the structure of your code here?
Do you use task::enqueue (with priorities)?
When you say 23 threads are busy in arena_in_need(), what is the 24th thread doing? Is it a master thread? How did it submit the work to TBB?
MLema2
New Contributor I
I spent a few hours trying to create a test case that reproduces this problem, without success.
We are running a mix of parallel_for_each, parallel_for, parallel_sort and pipelines, either with high priority set on the task_group_context or in normal mode with the default context.
As for the 24th thread: I might be mistaken, but doesn't TBB create only P-1 threads, where P is the number of logical processors? As I mentioned in my original post, no code was running TBB algorithms at the time of the core dump.
Looking at the code, could it be that the list of arenas gets very long, so that iterating through it in arena_in_need takes longer than expected while holding the spin mutex? I currently only have a minidump of the process that had the problem, so I can't check whether that is the case.
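To make that hypothesis concrete, here is a minimal sketch in standard C++ of a round-robin search over a list of arenas performed while a spin lock is held. It is only a model of the concern, not the TBB implementation: the Arena, SpinLock and Market types, their fields, and the visited out-parameter are all hypothetical names introduced for illustration. The point it shows is that when no arena wants workers, every call scans the whole list before releasing the lock, so the hold time grows with the list length.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins. These are NOT the TBB internals.
struct Arena {
    int allotted;  // workers this arena is allowed to use
    int active;    // workers currently attached to it
};

// Plain test-and-set spin lock: waiters burn CPU until the holder releases.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock()   { while (flag_.test_and_set(std::memory_order_acquire)) { /* busy-wait */ } }
    void unlock() { flag_.clear(std::memory_order_release); }
};

class Market {
    SpinLock mutex_;
    std::vector<Arena> arenas_;
    std::size_t next_ = 0;  // where the next round-robin scan starts
public:
    explicit Market(std::vector<Arena> arenas) : arenas_(std::move(arenas)) {}

    // Returns an arena that still wants workers, or nullptr if there is none.
    // *visited reports how many entries were scanned while the lock was held.
    Arena* arena_in_need(std::size_t* visited) {
        assert(visited != nullptr);
        std::lock_guard<SpinLock> guard(mutex_);
        const std::size_t n = arenas_.size();
        *visited = 0;
        for (std::size_t i = 0; i < n; ++i) {
            Arena& a = arenas_[(next_ + i) % n];
            ++*visited;
            if (a.active < a.allotted) {
                ++a.active;
                next_ = (next_ + i + 1) % n;  // resume after this arena next time
                return &a;
            }
        }
        return nullptr;  // full scan, nothing found: worst case for lock hold time
    }
};
```

In this model, a worker that finds nothing and immediately retries would take the lock again and repeat the full scan, which is the pattern the call stacks above would be consistent with. Whether the real arena list was long in the failing process is something only the full dump can confirm.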
MLema2
New Contributor I
Is it possible that too many threads fighting for that mutex would make them spin uselessly?
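For reference, this is the kind of contention I have in mind, as a generic sketch in standard C++ rather than anything taken from TBB. BackoffSpinLock and contended_increments are hypothetical names, and the threshold of 16 spins is an arbitrary assumption. Every waiting thread polls the flag in a loop, so with many waiters and one holder most cores do no useful work; reading before the atomic exchange and yielding after a few failed attempts are two common ways to reduce that cost.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock with a simple back-off. Not the TBB spin_mutex.
class BackoffSpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        int spins = 0;
        for (;;) {
            // Read first, then try the exchange only if the lock looks free,
            // so waiters do not keep writing to the contended cache line.
            if (!locked_.load(std::memory_order_relaxed) &&
                !locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
            // After a few failed attempts, give the core to another thread.
            if (++spins > 16) std::this_thread::yield();
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Starts `threads` threads that each increment a shared counter `iters`
// times under the lock, and returns the final counter value.
long contended_increments(int threads, int iters) {
    assert(threads >= 0 && iters >= 0);
    BackoffSpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling contended_increments with a thread count close to the number of cores and watching CPU usage gives a rough feel for how much time goes into waiting rather than into the increments themselves.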
MLema2
New Contributor I
I have a full user dump of a process showing the issue.
I could certainly execute some commands in WinDbg and send you the results if that would help diagnose the issue.
Many thanks