[cpp]
void calculate_upper_triangular_matrix_elements(Matrix& m)
{
    for (unsigned i = 0; i < m.size1(); ++i)
        for (unsigned j = i; j < m.size2(); ++j)
            m(i, j) = calculate(i, j);
}
[/cpp]

I would like to use tbb::parallel_for, but I don't know how to use tbb::blocked_range properly for this problem. The simplest solution is to use tbb::blocked_range2d and do nothing when i > j, but that does not seem right.

Any ideas how to do this better?
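
For reference, the blocked_range2d idea from the question need not test every cell with i > j. A minimal sketch, assuming each task receives a rectangular block [r0, r1) x [c0, c1) (with TBB these bounds would come from the rows() and cols() of a tbb::blocked_range2d): start each row at the diagonal, so blocks lying entirely below it cost only an empty loop. The name visit_upper_in_block is hypothetical, and the TBB wiring is left out so the block compiles with the standard library alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Visit the upper-triangular cells (i <= j) that fall inside the rectangular
// block [r0, r1) x [c0, c1). Returns the number of cells visited.
// Hypothetical helper, not part of TBB.
template <class F>
std::size_t visit_upper_in_block(std::size_t r0, std::size_t r1,
                                 std::size_t c0, std::size_t c1, F&& f)
{
    assert(r0 <= r1 && c0 <= c1);
    std::size_t visited = 0;
    for (std::size_t i = r0; i < r1; ++i) {
        // Start at the diagonal or at the block's left edge,
        // whichever lies further to the right.
        for (std::size_t j = std::max(i, c0); j < c1; ++j) {
            f(i, j);
            ++visited;
        }
    }
    return visited;
}
```

Blocks below the diagonal still get scheduled, so the load is not perfectly even; the point is only that the wasted work per block shrinks to one loop test per row.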

4 Replies

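Serial code:

[cpp]
// Assumes a square matrix: n = m.size1() == m.size2().
// The upper triangle, diagonal included, holds n * (n + 1) / 2 elements.
unsigned n = m.size1();
unsigned xEnd = n * (n + 1) / 2;
for (unsigned x = 0; x < xEnd; ++x)
{
    // Walk the rows until the remainder y falls inside row i,
    // which holds n - i elements.
    unsigned iEnd = n;
    unsigned y = x;
    unsigned i = 0;
    for (; y >= iEnd; ++i, y -= iEnd, iEnd -= 1)
        continue;
    unsigned j = i + y;
    m(i, j) = calculate(i, j);
}
[/cpp]

Then parallelize the for (x ...) loop.

The slices of the x loop will each make roughly the same number of calls to calculate.

Finding the i and j indexes adds overhead, but this technique balances the load among the threads in your team.

Jim Dempsey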
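
The linearized-index idea above can be sketched so that the row search runs once per chunk instead of once per element. This is a minimal sketch assuming a square n x n matrix; the names upper_triangle_count, linear_to_upper, and for_each_upper are hypothetical. With TBB, a chunk [begin, end) would come from a tbb::blocked_range<std::size_t> over [0, n(n+1)/2) passed to tbb::parallel_for; that wiring is left out so the block compiles with the standard library alone.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Number of elements on or above the diagonal of an n x n matrix.
inline std::size_t upper_triangle_count(std::size_t n)
{
    return n * (n + 1) / 2;
}

// Map a linear index x in [0, n(n+1)/2) to the (i, j) pair it denotes when
// the upper triangle is enumerated row by row. Row i holds n - i elements.
inline std::pair<std::size_t, std::size_t>
linear_to_upper(std::size_t x, std::size_t n)
{
    assert(x < upper_triangle_count(n));
    std::size_t i = 0;
    std::size_t row_len = n;
    while (x >= row_len) {
        x -= row_len;
        --row_len;
        ++i;
    }
    return std::make_pair(i, i + x);
}

// Process the chunk [begin, end) of the linear index space. The row search
// is done once for the first element; later elements advance incrementally.
template <class F>
void for_each_upper(std::size_t begin, std::size_t end, std::size_t n, F&& f)
{
    if (begin >= end)
        return;
    std::pair<std::size_t, std::size_t> start = linear_to_upper(begin, n);
    std::size_t i = start.first;
    std::size_t j = start.second;
    for (std::size_t x = begin; x < end; ++x) {
        f(i, j);
        if (++j == n) {  // end of row i: move to the diagonal of the next row
            ++i;
            j = i;
        }
    }
}
```

Each chunk of equal length then performs the same number of calls to f, which is the load-balancing property described above, while the index overhead drops to one search per chunk.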

An alternative is to establish a thread team with numbered members, then have each member take only the (i, j) pairs whose running count, modulo the number of members, equals its member number.

[cpp]
// OpenMP way
#pragma omp parallel
{
    int nThreads = omp_get_num_threads();
    int iThread = omp_get_thread_num();
    int pick = 0;
    for (int i = 0; i < n; ++i)
        for (int j = i; j < n; ++j)
        {
            if (pick++ % nThreads == iThread)
                doWork(i, j);
        }
}
[/cpp]

I will let you rework that into TBB-speak using parallel_for_each or parallel_do.

In QuickThread the above becomes:

[cpp]
parallel_distribute(
    [&](int iThread, int nThreads)
    {
        int pick = 0;
        for (int i = 0; i < n; ++i)
            for (int j = i; j < n; ++j)
            {
                if (pick++ % nThreads == iThread)
                    doWork(i, j);
            }
    });
[/cpp]

Jim Dempsey
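
The round-robin pick above can be isolated into a plain function that each team member would call with its own number. This is a minimal sketch under that assumption; round_robin_upper is a hypothetical name, and the threading layer (OpenMP, TBB, or QuickThread) is deliberately left out, so the members can be simulated one after another to check that every pair is handled exactly once.

```cpp
#include <cassert>
#include <cstddef>

// Member `member` of a team of `members` walks the whole upper triangle of
// an n x n matrix but calls f only on every members-th pair, offset by its
// own member number. Returns how many pairs this member handled.
// Hypothetical helper, not part of any threading library.
template <class F>
std::size_t round_robin_upper(std::size_t member, std::size_t members,
                              std::size_t n, F&& f)
{
    assert(members > 0 && member < members);
    std::size_t pick = 0;
    std::size_t done = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i; j < n; ++j)
            if (pick++ % members == member) {
                f(i, j);
                ++done;
            }
    return done;
}
```

Note that every member still iterates over all n(n+1)/2 pairs, so the scheme pays off only when the work per pair clearly outweighs the cost of the loop counter and the modulo test.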
