I need to use a concurrent priority queue for extracting min-valued elements.
I know that in my domain, during a pushback of an element it is guaranteed that the element's value is greater than the values of all elements in the queue prior to the pushback operation.
Can I assume from this that the pushback operation will have O(1) complexity? (Similar to the classic array-based heap implementation, where a new element is inserted at the next free position in the array, and then a "heapify" operation cascades it up the heap if necessary.)
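For concreteness, here's a minimal sketch of the classic mechanism I have in mind (my own illustration, not TBB's internals): in a min-heap, a new maximum lands at a leaf and the very first parent comparison ends the cascade, so the push is O(1) apart from a possible reallocation.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Minimal array-based min-heap sketch (not the actual TBB implementation).
// push() appends at the next free slot and cascades up only while the new
// key is smaller than its parent -- so a key larger than everything already
// stored never moves.
struct MinHeap {
    std::vector<int> a;

    void push(int key) {
        a.push_back(key);
        std::size_t i = a.size() - 1;
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (a[parent] <= a[i]) break;   // new maximum: first test fails
            std::swap(a[parent], a[i]);
            i = parent;
        }
    }

    int pop() {                             // extract-min, O(log n)
        int min = a[0];
        a[0] = a.back();
        a.pop_back();
        std::size_t i = 0;
        for (;;) {
            std::size_t l = 2 * i + 1, r = l + 1, s = i;
            if (l < a.size() && a[l] < a[s]) s = l;
            if (r < a.size() && a[r] < a[s]) s = r;
            if (s == i) break;
            std::swap(a[i], a[s]);
            i = s;
        }
        return min;
    }
};
```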
The current implementation of tbb::concurrent_priority_queue internally uses a heap, so that is indeed what you should be seeing right now when inserting elements sequentially in FIFO order. But how would that help your presumably concurrent application? It's not a documented contract, the premise wouldn't hold in a concurrent application, and the total complexity per element is dominated by the logarithmic time needed for extraction.
Hi, thank you for your reply.
Since most of my real-time contention comes from many push operations, and I know I always insert max-valued elements, it is important to me that these cost as little as possible.
I'm willing to pay the price of a bulk of pop-operations at some later time (which happens only once in a while), when the contention is much lower.
For real time, you should probably keep in mind that O(1) is amortized complexity, because the internal buffer might be reallocated if it was not made large enough from the start, and there's no separate reserve() operation that you could invoke at a "safe" time.
But if you really only have a single producer that always inserts elements in FIFO order, why not use a concurrent_queue instead?
I need the elements ordered by some key, because some of the elements I pop are immediately re-inserted according to that key.
Some of them will have an "actual" key value set aside as a member of the element, alongside the current key.
Upon pop, if the current key and actual key differ, I sync them (by setting the current key to the actual key's value), and re-insert the element.
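In rough code, the consumer logic I mean looks like this (just a sketch, with std::priority_queue standing in for the concurrent container; the Item field names are my own):

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical element: a "current" key used for ordering, plus an
// "actual" key that may have been updated since insertion.
struct Item {
    int current_key;
    int actual_key;
    bool operator>(const Item& o) const { return current_key > o.current_key; }
};

using MinPQ = std::priority_queue<Item, std::vector<Item>, std::greater<Item>>;

// Pop one element; if its key is stale, sync the keys and re-insert it
// instead of consuming it. Returns true when an element was consumed.
bool consume_one(MinPQ& pq, Item& out) {
    if (pq.empty()) return false;
    Item e = pq.top();
    pq.pop();
    if (e.current_key != e.actual_key) {
        e.current_key = e.actual_key;  // sync the keys
        pq.push(e);                    // lands at the correct ordered position
        return false;
    }
    out = e;
    return true;
}
```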
Only a relatively small portion of the popped elements will be out of sync...
This is the caveat to the majority of real-time inserts, which are always at the tail (i.e., max-valued).
Generally, I could probably get better performance with some well-implemented concurrent AVL-tree, or some other concurrent balanced BST, but I could not find such implementations.
This picture is still not quite clear. Is there just a single FIFO producer, or just most of the time, with occasional concurrent reinsertions, like your latest post seems to indicate? Will the new insertions still be ordered behind elements re-inserted by the consumer? How many consumers are there (seemingly single, but perhaps not)? If you don't disclose all relevant information, asking a question might just be an exercise in futility for both of us.
Anyway, you're not obliged to do everything with one pre-existing container: if you have a single consumer, you could combine a tbb::concurrent_queue with a consumer-side std::priority_queue or so, which might even be better for real time because earlier I had overlooked the fact that in a tbb::concurrent_priority_queue a push() might still be blocked behind a logarithmic-time try_pop().
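Something along these lines, with a mutex-protected std::queue standing in for tbb::concurrent_queue so the sketch stays self-contained (the class names are illustrative):

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <vector>

// Stand-in for tbb::concurrent_queue: multiple producers push, the
// single consumer drains with try_pop.
struct MpscQueue {
    std::mutex m;
    std::queue<int> q;
    void push(int v) { std::lock_guard<std::mutex> g(m); q.push(v); }
    bool try_pop(int& v) {
        std::lock_guard<std::mutex> g(m);
        if (q.empty()) return false;
        v = q.front(); q.pop(); return true;
    }
};

// Consumer-side ordering "in series": drain the concurrent queue into a
// private min-heap, then extract in key order. Producers never pay the
// logarithmic cost, and never block behind a consumer's pop.
struct OrderedConsumer {
    MpscQueue& in;
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;

    explicit OrderedConsumer(MpscQueue& q) : in(q) {}

    bool next(int& v) {
        int x;
        while (in.try_pop(x)) heap.push(x);  // move pending items over
        if (heap.empty()) return false;
        v = heap.top();
        heap.pop();
        return true;
    }
};
```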
I'll try to be clearer.
The container needs to be ordered and concurrent.
There are multiple producers that do tail-inserts. By this I mean that each such inserted element is guaranteed to have a key with value greater than all other keys in the container at the time of insertion.
There is a single consumer. Each element it extracts has either an up-to-date key or an out-of-date key. In the former case, the element is consumed and does not re-enter the container. In the latter case, the element's key gets updated and the element is re-inserted. In this case, the key will not be greater than all others in the container (as with the producers' inserts), so it has to be inserted at the correct location of the ordered container.
One other improvement I'm thinking of is using another priority-queue to temporarily store the updated elements, instead of reinserting them into the main priority-queue (of size n). This way, for a bulk of k extractions by the consumer, in the worst case they are all out of date, and populating the temporary priority-queue will cost O(k log k) instead of O(k log n).
In this approach, the pop() operation needs to be wrapped with a comparison with the top element of the temporary queue, to know which one is the actual next element.
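The wrapped pop would look something like this (a sketch; the names are mine, and both queues are plain std::priority_queue here for brevity):

```cpp
#include <functional>
#include <queue>
#include <vector>

using MinPQ = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Wrapped pop over the main queue (size n) plus the small temporary queue
// of updated elements (size k): take whichever front is smaller.
bool pop_next(MinPQ& main_q, MinPQ& temp_q, int& out) {
    const bool m = !main_q.empty(), t = !temp_q.empty();
    if (!m && !t) return false;
    if (!t || (m && main_q.top() <= temp_q.top())) {
        out = main_q.top();
        main_q.pop();
    } else {
        out = temp_q.top();
        temp_q.pop();
    }
    return true;
}
```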
I'm not sure if this will truly be better because of overheads due to code complication and maintenance, bugs, and additional costs of the extra container.
The most important thing to me is for the frequent producers' inserts to be fast, hence my original question about the underlying implementation.
Thanks for the clarification. If you can guarantee that each producer always inserts a greater value than the ones already there, then you must have some kind of external synchronisation between them, and then they collectively count as one producer... except of course that it won't be the only producer if the concurrent consumer also wants to re-insert some elements.
Whether or not you have the above guarantee, you should use a concurrent_queue between the single producer and consumer, because it has better concurrency characteristics. You can then either put the consumer-side priority_queue "in series" with the concurrent_queue, or, if the values passing through the priority_queue are indeed ordered because of the above guarantee, only use the priority_queue for the updated elements. Don't worry about complexity, which should be minimal, and the outcome should easily be worth it (unless there wasn't much need for optimal performance in the first place). You could always try several approaches, compare them, and let us know!