
## Help in 2D Array

Beginner

Hi all,

I have just started working with TBB. I have two questions and would appreciate your help with them.

*1st: How do I parallelize a loop over a 2D array? The serial code that I want to parallelize is:

for (i = 0; i < 8; i++)
{
    for (j = 0; j < 6; j++)
    {
        a[i][j] = xor1[k++];
    }
}

*2nd: If the code above is located in a class, how should I construct the parallel class? Will a "struct" work inside a class?

Waiting to hear from you

Thanks & Regards

Munzer

6 Replies
Black Belt
Unfortunately, parallelisation doesn't come free, and you should always reckon with some overhead. The relevant concept is granularity: work should be distributed in parcels that are small enough to make a difference, yet large enough to compensate for the overhead. In this case, with only 8 × 6 = 48 trivial assignments, there's nothing to be gained from high-level parallelisation with TBB.
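
To make the granularity point concrete, here is a minimal sketch in plain C++ (no TBB needed). It imitates the "keep halving while a piece is larger than the grain size" rule of a divisible range and counts the resulting parcels. The function name count_chunks and the grain sizes used below are assumptions for illustration only; TBB's partitioners decide the real chunking at run time.

```cpp
#include <cstddef>

// Count the parcels obtained by recursively halving [begin, end)
// for as long as a piece is still larger than `grainsize`.
// Illustration only: this is not TBB's actual scheduler.
std::size_t count_chunks(std::size_t begin, std::size_t end, std::size_t grainsize) {
    const std::size_t size = end - begin;
    // A piece no larger than the grain size (or a single element)
    // is handled as one indivisible parcel.
    if (size <= grainsize || size <= 1) {
        return 1;
    }
    const std::size_t middle = begin + size / 2;
    return count_chunks(begin, middle, grainsize) +
           count_chunks(middle, end, grainsize);
}
```

Under these assumptions, count_chunks(0, 48, 1000) gives 1, so the 8 × 6 case leaves nothing to distribute, while count_chunks(0, 1000000, 10000) gives 128 parcels.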

If the arrays were significantly larger, parallel_for and blocked_range2d would be your tools.

I suggest reading through the tutorial before going any further, and perhaps picking up a book about it.

Beginner

I agree with the point you made, but these for loops will be executed more than 100,000 times. Could you write the TBB code for the example above, assuming that the array dimensions and loop ranges are 10000 instead of the previous 6 and 8?

Employee

blocked_range2d and parallel_for should be used for this example. Detailed information and examples of how to use them can be found in the TBB documentation (Reference.pdf, etc.).

To parallelize the code above, let's create a new class and name it ParallelClass. Its operator() looks like the following:

void operator()(const blocked_range2d<size_t>& r) const {
    int (*a)[N] = my_a;
    int (*xor1) = my_xor1;
    for (size_t i = r.rows().begin(); i != r.rows().end(); i++)
        for (size_t j = r.cols().begin(); j != r.cols().end(); j++) {
            a[i][j] = xor1[i*N + j + _s];
        }
}

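Note that the index i*N + j + _s is used in place of the running counter k from the original loop, so each iteration computes its position independently of the others. The _s variable covers the case where k starts from a non-zero value.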
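
The index substitution described above can be checked with a small sketch in plain C++ (no TBB needed). The function names fill_with_counter and fill_with_index are hypothetical, and a flat std::vector stands in for the 2D array, which is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Serial form: a running counter k walks xor1 in row-major order,
// starting at offset s. Each iteration depends on the previous one.
std::vector<int> fill_with_counter(const std::vector<int>& xor1,
                                   std::size_t M, std::size_t N, std::size_t s) {
    std::vector<int> a(M * N);
    std::size_t k = s;
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            a[i * N + j] = xor1[k++];
    return a;
}

// Parallel-friendly form: the source index is computed from (i, j) alone,
// so the iterations could run in any order.
std::vector<int> fill_with_index(const std::vector<int>& xor1,
                                 std::size_t M, std::size_t N, std::size_t s) {
    std::vector<int> a(M * N);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            a[i * N + j] = xor1[i * N + j + s];
    return a;
}
```

Both functions should produce the same result for any xor1 holding at least M*N + s elements, which is what makes the second form safe to hand to parallel_for.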

Then use parallel_for():

parallel_for(blocked_range2d<size_t>(0, M, 0, N), ParallelClass(a, xor1), auto_partitioner());

Black Belt
I would like to amend my suggestion about using blocked_range2d: you probably don't need it unless, e.g., you want to show off (or demo TBB), or your biggest dimension is still too small for good load distribution (the original values might very well merit TBB if the work on each element were significant).
Beginner

Thanks. This is what I was looking for. I'm doing this because I'm still learning TBB and how to use it. I may have a few questions for you in the future.

Again, thank you.

Beginner

Hi,

Thanks for your reply. It helped me understand a lot. If you don't mind, I have another question: how can I return the 2D array to the main code after parallel_for?

I have tried to do that, but unfortunately I couldn't manage to return the array, and I end up accessing a wrong memory location. The source is below:

const size_t M = 8;
const size_t N = 6;

class pSubstitutionClass {
public:
    int (*my_a)[N];
    int (*my_xor1);

    void operator()(const blocked_range2d<size_t>& r) const {
        int (*a)[N] = my_a;
        int (*xor1) = my_xor1;
        for (size_t i = r.rows().begin(); i != r.rows().end(); i++)
            for (size_t j = r.cols().begin(); j != r.cols().end(); j++) {
                a[i][j] = xor1[i*N + j];
            }
    }

    pSubstitutionClass(int (*a)[N], int xor1[M*N]) :
        my_a(a), my_xor1(xor1)
    {}
};

int (*(ParallelSubstitution)(int (*a)[N], int xor1[M*N]))[N] {
    pSubstitutionClass pSub(a, xor1);
    parallel_for(blocked_range2d<size_t>(0, M, 0, N), pSub, auto_partitioner());
    return pSub.my_a;
}
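
One way to read the question above: the body object only stores pointers, so whatever runs it writes straight into the caller's arrays, and nothing needs to be returned. Below is a minimal sketch in plain C++ under that reading. SubstitutionBody and substitute are hypothetical names, and a single serial call stands in for parallel_for (an assumption made so the sketch builds without TBB).

```cpp
#include <cstddef>

const std::size_t M = 8;  // rows, as in the post above
const std::size_t N = 6;  // columns

// Hypothetical body modelled on pSubstitutionClass: it holds pointers only,
// so every copy of it refers to the caller's arrays.
struct SubstitutionBody {
    int (*my_a)[N];      // destination: pointer to rows of N ints
    const int* my_xor1;  // source: flat array of M*N ints

    // Process one rectangular block of rows and columns.
    void operator()(std::size_t row_begin, std::size_t row_end,
                    std::size_t col_begin, std::size_t col_end) const {
        for (std::size_t i = row_begin; i != row_end; ++i)
            for (std::size_t j = col_begin; j != col_end; ++j)
                my_a[i][j] = my_xor1[i * N + j];
    }
};

// Serial stand-in for parallel_for: runs the body once over the full range.
// The caller's array `a` is filled in place; there is no return value.
void substitute(int (*a)[N], const int* xor1) {
    SubstitutionBody body{a, xor1};
    body(0, M, 0, N);
}
```

A caller would declare int a[M][N] and int xor1[M*N], call substitute(a, xor1), and then read a directly. Both arrays must outlive the call, since the body never owns the memory it points to.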