7 Replies
You did not initialize IntSin::step in the splitting constructor. I guess this is the cause of the problem.
Quoting - Alexey Kukanov (Intel)
You did not initialize IntSin::step in the splitting constructor. I guess this is the cause of the problem.
Hello Alexey,
Thanks for your advice.
But I am a complete beginner in C++ and am learning TBB from the O'Reilly Intel TBB book.
It would take me quite a while to work out how to initialize "IntSin::step" on my own.
I would really appreciate it if you could simply give me the answer by correcting the source code I attached.
I am sorry to bother you with this.
Best,
Akio
Quoting - Akio Yasu (Intel)
Thanks for your advice.
But I am a complete beginner in C++ and am learning TBB from the O'Reilly Intel TBB book.
It would take me quite a while to work out how to initialize "IntSin::step" on my own.
Take a look at the TBB Tutorial, the advanced example of parallel reduce. Note the splitting constructor:
[cpp]
MinIndexFoo( MinIndexFoo& x, split ) :
    my_a(x.my_a),
    value_of_min(FLT_MAX),    // FLT_MAX from <cfloat>
    index_of_min(-1)
{}
[/cpp]
This is another case where the splitting constructor needs to initialize more than one thing. See how my_a is being mapped from the functor object labeled x to the new object. You could do the same thing with step.
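Applied to the functor in this thread, the same pattern might look like the sketch below. It is not the attached source: INTEG_FUNC is assumed here to be sin integrated over [0, pi], and a stand-in `split` tag plus a manual split/join replace `tbb::split` and `tbb::parallel_reduce` so the example builds without TBB. The driver name `integrate_with_one_split` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Stand-in for tbb::split so this sketch builds without TBB (assumption).
struct split {};

class IntSin {
    const double step;   // read-only input: must be copied when splitting
public:
    double sum;          // accumulator: must be reset when splitting

    explicit IntSin(double step_) : step(step_), sum(0) {}

    // Splitting constructor: copy step from x, start sum at 0.
    IntSin(IntSin& x, split) : step(x.step), sum(0) {}

    // Accumulate the sub-range [begin, end); TBB would pass a blocked_range.
    void operator()(std::size_t begin, std::size_t end) {
        double local_sum = 0;
        for (std::size_t i = begin; i != end; ++i) {
            local_sum += std::sin(i * step) * step;   // assumed INTEG_FUNC
        }
        sum += local_sum;
    }

    void join(const IntSin& y) { sum += y.sum; }
};

// Hypothetical driver that mimics one split and one join by hand.
double integrate_with_one_split(std::size_t n) {
    assert(n > 0);
    const double pi = 3.14159265358979323846;
    IntSin left(pi / n);
    IntSin right(left, split());   // step is copied, sum starts at 0
    left(0, n / 2);
    right(n / 2, n);
    left.join(right);
    return left.sum;
}
```

If `step` were left out of the splitting constructor's initializer list, the compiler would reject it for a const member, and for a non-const member the second object would compute with an indeterminate step, which is the kind of problem described above.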
Hi Robert and Alexey
I think I have fixed the problem by looking at the initialization that both of you pointed out.
Thank you so much; I have made one step of progress.
Now I have another problem, this time about performance.
I cannot get performance as good as the OpenMP version of the same code.
My command lines and their output are below:
> icl int_sin_tbb.c tbb.lib /MD
> int_sin_tbb.exe
Application Clocks = 2.875000e+003
> icl int_sin_omp.c /Qopenmp
> int_sin_omp.exe
Application Clocks = 9.210000e+002
The OpenMP version is about three times faster than the TBB version.
I am using Intel C++ Compiler 11.0.072, and for the OpenMP build the compiler reports that the loop was vectorized and parallelized.
I have attached both samples; please take a look at them and let me know any possible reason for this performance difference.
Thank you,
Akio
Take a look at a blog entry I wrote earlier about a similar problem:
http://software.intel.com/en-us/blogs/2008/03/04/why-a-simple-test-can-get-parallel-slowdown/
I guess you might find some answers there.
Quoting - Alexey Kukanov (Intel)
Take a look at a blog entry I wrote earlier about a similar problem:
http://software.intel.com/en-us/blogs/2008/03/04/why-a-simple-test-can-get-parallel-slowdown/
I guess you might find some answers there.
Hello Alexey,
I have read your article and changed the code to accumulate into a local variable in operator(), as shown below, but it did not help.
class IntSin {
    const double step;
public:
    double sum;
    void operator()( const blocked_range<size_t>& r ) {
        double x_i;
        double local_sum=0;
        double step = IntSin::step;
        for( size_t i=r.begin(); i!=r.end(); ++i ) {
            x_i = i * step;
            local_sum += INTEG_FUNC(x_i) * step;
        }
        sum += local_sum;
    }
    // IntSin (IntSin& x, split) : x_i(0), step(x.step), sum(0) {}
    IntSin (IntSin& x, split) : step(x.step), sum(0) {}
    void join( const IntSin& y) {sum+=y.sum;}
    IntSin (double _step) : step(_step), sum(0) {}
};
Do you have any idea why the code does not run faster?
I would appreciate your help.
Regards,
Akio
I have experimented with your code, and found that the biggest performance impact is due to the use of /MD option required by TBB. For some unknown reason (which I would call a bug in Intel Compiler's math library), just switching from /MT to /MD slowed down your test three times, no matter whether TBB was used, or OpenMP, or no threading at all.
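One way to check this independently of any threading library is to time a purely serial version of the loop and build it twice, once with /MT and once with /MD. The following is a minimal sketch under stated assumptions, not the attached test: INTEG_FUNC is assumed to be sin over [0, pi], and the names `integrate_serial` and `time_serial_clocks` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <ctime>

// Serial version of the integration loop; sin stands in for INTEG_FUNC (assumption).
double integrate_serial(std::size_t n) {
    assert(n > 0);
    const double pi = 3.14159265358979323846;
    const double step = pi / n;
    double sum = 0;
    for (std::size_t i = 0; i != n; ++i) {
        sum += std::sin(i * step) * step;
    }
    return sum;
}

// Hypothetical helper: runs the loop once, stores the integral in result,
// and returns the elapsed std::clock() ticks.
double time_serial_clocks(std::size_t n, double& result) {
    const std::clock_t start = std::clock();
    result = integrate_serial(n);
    const std::clock_t stop = std::clock();
    return static_cast<double>(stop - start);
}
```

Calling `time_serial_clocks` from a small `main` and printing the returned value for each build would show whether the runtime-library switch alone accounts for the difference, with no TBB or OpenMP involved.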