Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.
This community is designed for sharing of public information. Please do not share Intel or third-party confidential information here.
2421 Discussions

Sort two arrays simultaneously using TBB sort



The problem I am having is as follows,

1) I have arrays A and B of size N.

2) I have to sort A and also make B in the same sorted permutation as that of A.

This is how currently I have implemented in my code using TBB sort. But I find that the in-place re-ordering algorithm is a serial implementation and is a major bottleneck in my code. How can I do this more efficiently using only TBB sort and possibly avoid the use of permutation array ?

std::vector<unsigned> order(N);

std::vector<double> A(N), B(N);


for( unsigned i = 0; i < N; ++i ) order = i;


CompFunctor compare(A);

tbb::parallel_sort( order.begin(), order.end(), compare ); /// Runs in parallel

InPlacePermute( &A[0], &B[0], &order[0], N ); /// Runs in serial


/// Comparison functor definition

struct CompFunctor {

  CompFunctor( std::vector<double> &a )

  : _data(a) {  }


  inline bool operator()( uintT i, uintT j ) const {

    return _data < _data;




  std::vector<double> &_data;



/// Inplace permute

  template < typename T1, typename T2 >

  void InplacePermutation( T1 *data1, T2 *data2, unsigned *perm, size_t size ) {

    T1 temp1;

    T2 temp2;

    unsigned j, k;

    for( unsigned i = 0; i < size; ++i ) {

      if( i != perm ) {

        temp1 = data1;

        temp2 = data2;

        j = i;

        while( i != perm ) {

          k = perm;

          data1 = data1;

          data2 = data2;

          perm = j;

          j = k;


        data1 = temp1;

        data2 = temp2;

        perm = j;




0 Kudos
6 Replies
Black Belt

Why don't you write a parallel algorithm to do the permutations to different vectors and then swap the results?

You might also be able to do some black magic with a special iterator type, so that both vectors are sorted at the end of calling tbb::sort, but that seems a bit far-fetched.

Creating new vectors consume a lot of space and I don't want that as the vectors are huge. The black magic will be to use boost zip_iterator but will that work with TBB sort ?

There is no ready-to-use solution for this problem for tbb::parallel_sort. Actually, as far as I can tell after a bit of digging in the Web, there is no ready-to-use solution for this problem even for std::sort. And tbb::parallel_sort uses std::sort inside for sequential code, so you would need a special iterator that works with both.

Let me give you a couple of links that I have found with some ideas or hand-made prototypes that might be helpful:

My recommendation is to first develop (or borrow) something that works with std::sort, and then try to use it with tbb::parallel_sort.

Black Belt

I latched on to "But I find that the in-place re-ordering algorithm is a serial implementation and is a major bottleneck in my code.", so I didn't know you would favour the other approach because of space restrictions. Luckily it has already been implemented, as Alexey has found.

If these arrays are so huge, I wonder if it wouldn't be cheaper to really zip them first, sort the zipped array, and then unzip it again. That does add O(n) storage and time, but maybe you'll more than gain that back in improved sorting performance because of the cache, i.e., because a fraction of O(n*log(n)) might still be larger than the O(n) above. Does that sound plausible at all?


Maybe you can initiate a struct array that contains an element from each array, then sort that?

struct packed {
    int array_0;
    int array_1;

packed packed_data = malloc(N);
// initialize packed_data with data

sort(packed_data, [](&a, &b) {
   return  a.array_0 > b.array_0;

// now for each packed_data element, array_0 is in order, and array_1 is in order



Just did that with std::sort (but it should work with tbb::parallel_sort). It involved a bit of work, but you can inspire yourself from this :

The idea is to make a kind of "zip iterator" that would be compatible with the tbb::parallel_sort interface.

Here is an example of usage (adapted from line 182) :

tbb::parallel_sort(make_sort_permute_iter(container1.begin(), container2.begin()),
          make_sort_permute_iter(container1.end(), container2.end()),
          sort_permute_iter_compare<container1_type::iterator, container2_type::iterator>())

Hope that will help !