There may be an official way to do this with TBB (something like ippSetAffinity or KMP_AFFINITY=).
If not, then you can (by other means) set the process affinity mask so that it contains one logical processor per core. Do this prior to starting the TBB thread pool, so that the worker threads inherit the mask. Then start the TBB thread pool with half the default number of threads. Look at the TBB init for controls affecting affinity and/or related settings.
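As a rough illustration of the "by other means" route, here is a minimal sketch for Linux only. It assumes the sysfs topology files (physical_package_id and core_id) are readable, and the helper names one_cpu_per_core and pin_process_to are made up for this example; they are not part of TBB or any other library.

```cpp
#include <sched.h>    // sched_getaffinity, sched_setaffinity, cpu_set_t (Linux-specific)
#include <fstream>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Read a single integer from a sysfs file; returns -1 if it cannot be read.
static int read_int(const std::string& path) {
    std::ifstream in(path);
    int value = -1;
    if (!(in >> value)) return -1;
    return value;
}

// Hypothetical helper: from the CPUs currently allowed for this thread,
// keep the first logical CPU seen for each (package, core) pair.
std::vector<int> one_cpu_per_core() {
    cpu_set_t current;
    CPU_ZERO(&current);
    if (sched_getaffinity(0, sizeof(current), &current) != 0) return {};

    std::vector<int> chosen;
    std::set<std::pair<int, int>> seen;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (!CPU_ISSET(cpu, &current)) continue;
        const std::string base =
            "/sys/devices/system/cpu/cpu" + std::to_string(cpu) + "/topology/";
        const int pkg  = read_int(base + "physical_package_id");
        const int core = read_int(base + "core_id");
        if (core < 0) {
            // Topology unknown: assume this logical CPU is its own core.
            chosen.push_back(cpu);
        } else if (seen.insert({pkg, core}).second) {
            chosen.push_back(cpu);
        }
    }
    return chosen;
}

// Hypothetical helper: restrict the calling thread to the given CPUs.
// Threads created afterwards inherit this mask, which is why it must run
// before the thread pool is started.
bool pin_process_to(const std::vector<int>& cpus) {
    if (cpus.empty()) return false;
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (int cpu : cpus) CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

The intended order would be: call one_cpu_per_core(), pass the result to pin_process_to() from the main thread, and only then initialize TBB with cpus.size() threads. Whether the workers stay spread one per core after that is up to the OS scheduler, so treat this as a starting point for experiments rather than a guarantee.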
In QuickThread you can use
parallel_for(OnEach_L1$, fn, from, to, arg, arg, ...);
This establishes a thread team using one thread per core, even with HT enabled.
"I thought about hard-wiring the number of threads. In my case the problem's size is not known at compile time. I will probably shift the burden to the users and let them set the number of threads via a config file or environment variable."
That's not what I meant. If it's a real issue, you should be able to demonstrate it with hard-coded values; otherwise it's imaginary and probably not worth any further discussion or exploration. So which is it?