Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Help in 2D Array

m_enayah1
Beginner
802 Views

Hi all,

I have just started learning TBB. I have two questions and would appreciate your help with them.

*1st: How do I parallelize a 2D array? The serial code that I want to parallelize is:

for (i = 0; i < 8; i++)
{
    for (j = 0; j < 6; j++)
    {
        a[i][j] = xor1[k++];
    }
}

*2nd: If the code above is located in a class, how should I construct the parallel class? Will a "struct" work inside a class?

Waiting to hear from you

Thanks & Regards

Munzer

6 Replies
RafSchietekat
Valued Contributor III
Unfortunately parallelisation doesn't come free, and you should always reckon with some overhead. The relevant concept is granularity: work should be distributed in parcels that are small enough to make a difference, yet large enough to compensate for the overhead. In this case, with only 8×6 = 48 trivial assignments, there's nothing to be gained from high-level parallelisation with TBB.

If the arrays were significantly larger, parallel_for and blocked_range2d would be your tools.

I suggest reading through the tutorial before going any further, and perhaps picking up a book about it.

m_enayah1
Beginner

I agree with the point you made, but these for loops will be executed more than 100,000 times. Could you write the TBB code for the example above, assuming the array dimensions and loop ranges are 10000 instead of the previous 8 and 6?

Elena_G_Intel
Employee

blocked_range2d and parallel_for should be used for this example. Detailed information and examples of how to use them can be found in the TBB documentation (Reference.pdf, etc.).

To parallelize the code above, let's create a new class named ParallelClass. Its operator() looks like this:

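void operator()(const blocked_range2d<size_t>& r) const {
    int (*a)[N] = my_a;        // destination: rows of N ints
    int (*xor1) = my_xor1;     // source: flat 1D array
    // Process only the sub-block of rows and columns handed to this call.
    for (size_t i = r.rows().begin(); i != r.rows().end(); i++)
        for (size_t j = r.cols().begin(); j != r.cols().end(); j++) {
            a[i][j] = xor1[i*N + j + _s];
        }
}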

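Note that the index i*N + j + _s is used instead of k from the serial code: each (i, j) pair computes its own position, so the iterations no longer depend on a shared counter and can run in any order. The _s variable covers the case where k starts from a non-zero value.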

Then use parallel_for():

parallel_for(blocked_range2d<size_t>(0, M, 0, N), ParallelClass(a, xor1), auto_partitioner());

RafSchietekat
Valued Contributor III
I would like to amend my suggestion about using blocked_range2d: you probably don't need it unless, e.g., you want to show off (or demo TBB), or your biggest dimension is still too small for good load distribution (the original values might very well merit TBB if the work on each element were significant).
m_enayah1
Beginner

Thanks. This is what I was looking for. I'm doing this because I'm still learning about TBB and how to use it. I may have a few questions for you in the future.

Again, thank you.

m_enayah1
Beginner

Hi,

Thanks for your reply. It helped me understand a lot. If you don't mind, I have one more question: how can I return the 2D array to the main code after parallel_for?

I have tried to do that, but unfortunately I couldn't manage to return the array, and I end up accessing a wrong memory location. The source is below:

const size_t M = 8;
const size_t N = 6;

class pSubstitutionClass {
public:
    int (*my_a)[N];
    int (*my_xor1);

    void operator()(const blocked_range2d<size_t>& r) const {
        int (*a)[N] = my_a;
        int (*xor1) = my_xor1;
        for (size_t i = r.rows().begin(); i != r.rows().end(); i++)
            for (size_t j = r.cols().begin(); j != r.cols().end(); j++) {
                a[i][j] = xor1[i*N + j];
            }
    }

    pSubstitutionClass(int (*a)[N], int xor1[M*N]) :
        my_a(a), my_xor1(xor1)
    {}
};

int (*(ParallelSubstitution)(int (*a)[N], int xor1[M*N]))[N] {
    pSubstitutionClass pSub(a, xor1);
    parallel_for(blocked_range2d<size_t>(0, M, 0, N), pSub, auto_partitioner());
    return pSub.my_a;
}
