Intel® oneAPI Threading Building Blocks

Why doesn't my TBB program speed up?


I have this part of code in my program:

tbb::parallel_for(tbb::blocked_range<int>(0, NumberY, NumberY/6), [&](const tbb::blocked_range<int> &r) -> void {
    for (int iy = r.begin(); iy < r.end(); iy++) {
        int x_loc = x_left;
        for (int ix = 0; ix < NumberX; ix++) {
            MyFunction(x_loc, intensity_value);  // interpolation function
            pfDensity[iy*NumberX + ix] += intensity_value * mvp_idx;
            x_loc += delta_x;
        }
    }
}); // parallel_for

NumberY and NumberX are both around 8000. The single-threaded version runs about 2 times faster than the multithreaded TBB version. I have tried adjusting the grain size and initializing TBB first; neither helps.
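To make the grain size setting concrete, here is a minimal sketch of how I understand tbb::blocked_range<int> gets split: a range counts as divisible while its size is larger than the grain size, and a split cuts it in half. The sketch uses only the standard library and does not call TBB. It assumes splitting continues until no piece is divisible (as with simple_partitioner; the default auto_partitioner may stop earlier and hand out larger pieces). Chunk, split_range and leaf_chunks are hypothetical helper names, not TBB API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type (not part of TBB): a half-open row range [begin, end).
struct Chunk {
    int begin;
    int end;
};

// Recursively halve [begin, end) while it is larger than the grain size,
// mirroring the "size > grainsize" divisibility rule of blocked_range.
static void split_range(int begin, int end, int grain, std::vector<Chunk>& out) {
    assert(grain > 0);
    if (end - begin > grain) {
        int middle = begin + (end - begin) / 2;
        split_range(begin, middle, grain, out);
        split_range(middle, end, grain, out);
    } else {
        // No longer divisible: this piece would go to the loop body as-is.
        out.push_back({begin, end});
    }
}

// Collect the leaf chunks, in row order, for a given range and grain size.
std::vector<Chunk> leaf_chunks(int begin, int end, int grain) {
    std::vector<Chunk> out;
    split_range(begin, end, grain, out);
    return out;
}
```

With NumberY = 8000 and a grain size of 8000/6 = 1333, leaf_chunks(0, 8000, 8000/6) yields 8 chunks of 1000 rows each, so under these assumptions the loop body is invoked at most 8 times.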

This code applies a function to a matrix; MyFunction is an interpolation function. I suspect there is no speedup because the workload is too small. However, this is as far as I can divide the work, and I would still like it to run faster than the single-threaded code.

How can I improve the performance here?

