Intel® oneAPI Threading Building Blocks

Help for int_sin.c with TBB

Akio_Yasu__Intel_
Hello,

I am trying to convert the sample code "int_sin.c" to use TBB, but it is not going well.
The program prints the wrong answer: the output should be 4, but it is not.
I have attached the source code for this issue.
I would really appreciate it if you could check it and correct it.

Thanks,
Akio
Alexey_K_Intel3
Employee
You did not initialize IntSin::step in the splitting constructor. I guess this is the cause of the problem.

Akio_Yasu__Intel_
Hello Alexey,

Thanks for your advice.
However, I am a complete beginner in C++ and am learning TBB from the O'Reilly Intel TBB book.
It would take me a long time to work out how to initialize "IntSin::step" on my own.
I would really appreciate it if you could simply give me the answer by correcting the source code I attached.
I am sorry to bother you with this.

Best,
Akio
robert-reed
Valued Contributor II
Take a look at the TBB Tutorial, the advanced example of parallel reduce. Note the splitting constructor:

[cpp]MinIndexFoo( MinIndexFoo& x, split ) :
   my_a(x.my_a),
   value_of_min(FLT_MAX), // FLT_MAX from <cfloat>
   index_of_min(-1)
{}[/cpp]

This is another case where the splitting constructor needs to initialize more than one member. See how my_a is copied from the functor object labeled x into the new object. You could do the same thing with step.
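
To illustrate the advice above, here is a minimal sketch of a reduction body whose splitting constructor copies step and resets sum. It is not the attached int_sin.c: the `split` tag and `IndexRange` type are hypothetical stand-ins for tbb::split and tbb::blocked_range<size_t> so that the sketch compiles without TBB, the integrand is assumed to be sin(x), and `integrate_with_one_split` is an invented helper that performs a single split and join by hand.

```cpp
#include <cmath>
#include <cstddef>

// Stand-in for tbb::split so the sketch compiles without TBB (assumption).
struct split {};

// Hypothetical half-open index range standing in for tbb::blocked_range<size_t>.
struct IndexRange {
    std::size_t first, last;
    std::size_t begin() const { return first; }
    std::size_t end() const { return last; }
};

class IntSin {
    const double step;   // width of one integration slice (read-only input)
public:
    double sum;          // partial result accumulated by this body

    explicit IntSin(double step_) : step(step_), sum(0.0) {}

    // Splitting constructor: copy the read-only input (step) from the
    // parent body, but start the partial result (sum) from zero.
    IntSin(IntSin& x, split) : step(x.step), sum(0.0) {}

    void operator()(const IndexRange& r) {
        double local_sum = 0.0;
        for (std::size_t i = r.begin(); i != r.end(); ++i)
            local_sum += std::sin(i * step) * step;  // integrand assumed to be sin(x)
        sum += local_sum;
    }

    // Fold the partial result of a split-off body back into this one.
    void join(const IntSin& y) { sum += y.sum; }
};

// Mimics by hand one split of a reduction: two bodies each process
// half of the index range and are then joined.
double integrate_with_one_split(std::size_t n, double step) {
    IntSin left(step);
    IntSin right(left, split());   // right.step is copied, right.sum starts at 0
    left(IndexRange{0, n / 2});
    right(IndexRange{n / 2, n});
    left.join(right);
    return left.sum;
}
```

If the splitting constructor left step uninitialized, the right-hand body would integrate with an arbitrary slice width, which would be consistent with the wrong result reported at the start of the thread.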

Akio_Yasu__Intel_
Hi Robert and Alexey

I think I have fixed the problem by correcting the initialization that both of you pointed out.
Thank you so much; I have made some progress.

Now I have another problem, this time with performance.
I cannot get better performance than the same code written with OpenMP.

My command lines are as follows:
> icl int_sin_tbb.c tbb.lib /MD
> int_sin_tbb.exe
Application Clocks = 2.875000e+003

> icl int_sin_omp.c /Qopenmp
> int_sin_omp.exe
Application Clocks = 9.210000e+002

The OpenMP version is around three times faster than the TBB version.
I am using Intel C++ Compiler 11.0.072, and the compiler reports vectorization and parallelization messages for the OpenMP build.

I have attached both samples. Please take a look at them and let me know any possible reason for this performance difference.

Thank you,
Akio
Alexey_K_Intel3
Employee
Take a look at a blog entry I wrote earlier about a similar problem:
http://software.intel.com/en-us/blogs/2008/03/04/why-a-simple-test-can-get-parallel-slowdown/

I guess you might find some answers there.
Akio_Yasu__Intel_
Hello Alexey,

I have read your article and changed the code to use a local variable in operator(), as shown below, but it did not help.

class IntSin {
    const double step;
public:
    double sum;
    void operator()( const blocked_range<size_t>& r ) {
        double x_i;
        double local_sum = 0;
        double step = IntSin::step;  // local copy of the member
        for( size_t i=r.begin(); i!=r.end(); ++i ) {
            x_i = i * step;
            local_sum += INTEG_FUNC(x_i) * step;
        }
        sum += local_sum;
    }
    // IntSin (IntSin& x, split) : x_i(0), step(x.step), sum(0) {}
    IntSin (IntSin& x, split) : step(x.step), sum(0) {}
    void join( const IntSin& y ) { sum += y.sum; }
    IntSin (double _step) : step(_step), sum(0) {}
};

Do you have any idea why the code does not run faster?
I would appreciate your help.

Regards,
Akio
Alexey_K_Intel3
Employee
I have experimented with your code and found that the biggest performance impact comes from the /MD option, which TBB requires. For some unknown reason (which I would call a bug in the Intel Compiler's math library), just switching from /MT to /MD made your test three times slower, no matter whether TBB, OpenMP, or no threading at all was used.
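
One way to check this kind of effect independently of any threading library is to time a purely serial version of the loop and build it once with /MT and once with /MD. The following is only a sketch under assumptions: the integrand is taken to be sin(x), and the function names `integrate_serial` and `time_serial_ms` are invented for illustration rather than taken from the attached sources.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>

// Serial left Riemann sum of sin(x) over n slices of width step.
// No TBB and no OpenMP, so only the compiler and runtime library matter.
double integrate_serial(std::size_t n, double step) {
    double sum = 0.0;
    for (std::size_t i = 0; i != n; ++i)
        sum += std::sin(i * step) * step;
    return sum;
}

// Runs the serial loop once and returns the elapsed wall-clock time in
// milliseconds. The integral is written through `result` so that the
// compiler cannot discard the loop as unused work.
double time_serial_ms(std::size_t n, double step, double* result) {
    auto t0 = std::chrono::steady_clock::now();
    *result = integrate_serial(n, step);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling time_serial_ms from a small main() and comparing the two builds would show whether the runtime library choice alone changes the timing, before any TBB or OpenMP code is involved.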