Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Announcing Community Preview (CP) Features

David_Sekowski__Inte
1,144 Views

The Intel TBB team is happy to introduce the use of Community Preview (CP) features into Intel TBB today. CP features are a great way for Intel to show new and interesting capabilities to our community and customers before they have been finalized. As part of our commitment to openness with all of the Intel Parallel Building Blocks technologies, we want our users to know what we are working on to make Intel TBB better. We also want to gain your feedback on upcoming features so that we can make sure we continue to meet your needs today and in the future.

These features are fully tested but are not officially supported or necessarily fully documented. Given the early nature of these features, we dont guarantee that they wont be removed or modified in ways that break compatibility with pre-production versions. In addition, they are turned off by default so they wont impact your application unless you want to give them a try. We look forward to hearing your response in the forum to our first CP feature, the Concurrent Priority Queue included in Intel TBB 3.0 update 4, and to the use of CP features in general.

0 Kudos
9 Replies
nagy
New Contributor I
1,144 Views
Any plans to add a blocking pop() method to the priority_queue?
0 Kudos
Dmitry_Vyukov
Valued Contributor I
1,144 Views
Quoting nagy
Any plans to add a blocking pop() method to the priority_queue?

You can easily wait for items become available in a priority queue with an eventcount:

http://software.intel.com/en-us/forums/showthread.php?t=62364&o=a&s=lr

Currently it sits in tbb/src/tbb/concurrent_monitor.h

0 Kudos
Alexey-Kukanov
Employee
1,144 Views

We think we should not add a blocking pop. Here is why (courtesy Arch Robison):

A busy-waiting pop causes unacceptable performance issues. A properly blocking pop adds much complexity, and adds overhead for users who do not need it. Replacing busy waiting with proper blocking is generally hard to do, unless you put mutexes/condition-variable operations on the hot paths. To keep mutexes/condition-variables off the hot path, you end up needing something like concurrent_monitor.

Our position (significantly influenced by Microsoft's PPL) now is that synchronization for waiting should be done in the application, not in the container. Thats why tbb::concurrent_unordered_map has no blocking operations and tbb::concurrent_queue was revised to remove those.

0 Kudos
Alexey-Kukanov
Employee
1,144 Views
0 Kudos
Dmitry_Vyukov
Valued Contributor I
1,144 Views

We look forward to hearing your response in the forum to our first CP feature, the Concurrent Priority Queue included in Intel TBB 3.0 update 4

Hi!

I've shared my thought on concurrent priority queues here.

0 Kudos
ARCH_R_Intel
Employee
1,144 Views
I was unclear on how to post my comments to the blog, so I'll post them here.

Back in summer of 2005, before TBB was released, I had an intern work on implementing concurrent priority queues in TBB using skip lists. The results were dismal, as the overhead of atomic operations clobbered any speedup from scaling, so we shipped TBB without concurrent priority queues. Since then, other developers have tried other concurrent priority queue designs from the literature, and they similarly have been dismally swamped by the overhead of atomic operations.

Dimitriy's blog succinctly identifies the key problem: global consensus is required, so true scalability is inherently impossible. Given that inherent limitation, Terry Wilmarth's design for concurrent_priority_queue is a clever angle we had not considered before: if we can't beat the bottleneck, we might as well make it efficient. Itseems to do well in practice if one really needs a concurrent priority queue. Though I agree with Dimitry that it is a last resort - the alternatives in his blog should be considered first before falling back on concurrent_priority_queue.

The discussion of building a distributed queue based on stealing raises an interesting design question. Is there a sensible way to define a container whose purpose is to support stealing? It seems to be a recurring design pattern that currently has no direct support in TBB other than the task scheduler itself.
0 Kudos
Dmitry_Vyukov
Valued Contributor I
1,144 Views
> I was unclear on how to post my comments to the blog, so I'll post them here.

Yeah, I am still fighting Google Sites. On some fronts he is winning...

> The results were dismal, as the overhead of atomic operations clobbered any speedup from scaling

Devil in details. Atomic operations significantly fell in cost since then. Current tbb CPQ design have AFAIR 2 contended atomic RMW per operation. I think that skip-list also need 2 contended RMW / op (1 + 0.5 + 0.25 + ...) + [optional] 1 local RMW for node lifetime management. So I do not see significant difference.

> the alternatives in his blog should be considered first before falling back on concurrent_priority_queue.

I suspect that TBB users will get it other way around - there is cool *concurrent* queue in TBB, so what else I need?

> Is there a sensible way to define a container whose purpose is to support stealing?

Take a look at .NET ConcurrentBag - unordered collection of objects which supports put and get.

0 Kudos
Andrey_Marochko
New Contributor III
1,144 Views
Update 6 to TBB 3.0 adds support for task and task group priorities. This is a major extension to TBB task scheduler functionality available as a Community Preview feature. Everyone interested are welcome to read the corresponding blog that describes intended usage models and explains why particular design decisions were made.
0 Kudos
Reply