Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Situation Analysis: Should I use TBB?

jobin007007
Beginner
336 Views
Imagine a class as described below.

class Data {
public:
    // return types are not given here; void is assumed
    void Function1();
    void Function2();
};

Data S[12];

int main()
{
    for (int i = 0; i < 12; i++)
    {
        S[i].Function1();
        S[i].Function2();
    }
}

I am currently implementing main() as shown below:

int main()
{
    for (int i = 0; i < 12; i++)
    {
        // Thread(...) is shorthand for "run this call on its own new thread"
        Thread(S[i].Function1());
        Thread(S[i].Function2());
    }
}
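
For readers who want something compilable, the thread-per-call approach above could be sketched with std::thread roughly as follows. This is only an illustration under assumptions: the Data members, the counter, and the name run_thread_per_call are hypothetical stand-ins, since the real Function1 and Function2 bodies are not shown in this thread.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Data class from the post; the counter only
// exists so the sketch has an observable result.
struct Data {
    std::atomic<int> calls{0};
    void Function1() { ++calls; }
    void Function2() { ++calls; }
};

Data S[12];

// Launches one thread per member-function call (24 threads in total),
// waits for all of them, and returns how many calls completed.
int run_thread_per_call() {
    std::vector<std::thread> threads;
    threads.reserve(24);
    for (int i = 0; i < 12; ++i) {
        threads.emplace_back(&Data::Function1, &S[i]);
        threads.emplace_back(&Data::Function2, &S[i]);
    }
    for (std::thread& t : threads) t.join();

    int total = 0;
    for (const Data& d : S) total += d.calls.load();
    return total;
}
```

Calling run_thread_per_call() once from main() starts all 24 threads before joining any of them, which matches the "all 24 run at the same time" behaviour described in the post.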


Imagine that Function1 and Function2 are really resource intensive, involve a lot of computation, and are completely independent of each other. I have already divided my program into 12 or 24 components by running each object's function call on a new thread. I am using an 8-core processor. Since the work is already split across 12 or 24 threads, would there be any advantage in using TBB?

Please advise. I am not an expert on threads and would like to hear your views and comments.

5 Replies
robert_jay_gould
Beginner
Do Function1 and Function2 perform any I/O (disk reads, network, database, etc.)?
If they do, TBB probably won't give you much benefit for the cost of refactoring your code base and adding a dependency. If your functions just do lots of calculations, it might be worth trying TBB.

One simple point to notice is that you're creating and destroying 2 threads on each iteration of your loop, so you're wasting a lot of resources. You should at least keep a thread pool and reuse its threads, even if you're doing I/O.
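
To make the "keep a pool and reuse the threads" idea concrete, here is a minimal sketch using only the C++ standard library. It is not how TBB implements its scheduler; the names run_on_pool and run_demo, and the 24 trivial tasks, are hypothetical and chosen only to mirror 12 objects times 2 functions.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs every task on a fixed set of worker threads that are created once
// and reused, instead of creating one thread per task.
void run_on_pool(const std::vector<std::function<void()>>& tasks,
                 unsigned worker_count) {
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    auto worker = [&tasks, &next] {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= tasks.size()) return;  // nothing left to claim
            tasks[i]();
        }
    };
    worker_count = std::max(1u, worker_count);  // always use at least one worker
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < worker_count; ++w) workers.emplace_back(worker);
    for (std::thread& t : workers) t.join();
}

// Hypothetical workload: 24 independent tasks that each bump a counter.
int run_demo(unsigned worker_count) {
    std::atomic<int> done{0};
    std::vector<std::function<void()>> tasks;
    for (int i = 0; i < 24; ++i) tasks.push_back([&done] { ++done; });
    run_on_pool(tasks, worker_count);
    return done.load();
}
```

With run_demo(8), eight threads work through all 24 tasks, so thread creation cost is paid 8 times rather than 24. A production pool would also keep its workers alive between batches, which this sketch does not attempt.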

jobin007007
Beginner
The threads are not destroyed after each iteration. More threads are created until the number of simultaneously running threads reaches 24, and all of them run in parallel. That's what I meant.
jobin007007
Beginner
One function does perform I/O operations. There is also a lot of computation involved. But my theory is: aren't there 12 threads already, while the number of processors is 8? So theoretically, shouldn't each thread run at maximum capacity?

For example:
Maximum number of processors = 8
Capacity of one processor = N

If I use one thread, total work = N
If I use two threads, total work = 2N
and so on...
If I use 8 threads, total work = 8N

Does it work like that to some extent?



robert_jay_gould
Beginner
Yes, and not at all :)

Simply put, the maximum speedup you can obtain on an 8-core machine (without smart tricks) is 8x that of a 1-core machine. You probably know that already, and that's what TBB (and OpenMP) can squeeze out if you've done your part AND you AREN'T doing I/O (although in practice it's not trivial to reach that 8x anyway).

Once I/O comes into play, the story is totally different and much more complicated.

A quick solution to your problem would be to create a thread for each I/O operation, but use a thread pool for the non-I/O operations.

Something like this (totally uncompilable pseudocode):

// I/O function: one thread per call
for (i = 0; i < 12; ++i)
{
    thread(S[i].Function1());
}

// Non-I/O function
tbb_for(work); // placeholder for a TBB parallel loop; uses a thread pool behind the scenes
// or
#pragma omp parallel for // uses a thread pool behind the scenes
for (i = 0; i < 12; ++i)
{
    S[i].Function2();
}
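
A compilable take on the same split could look roughly like the sketch below, using only the C++ standard library in place of TBB or OpenMP. Everything here is assumed for illustration: which function does the I/O, the sleep that imitates waiting on a device, the small loop that imitates computation, and the name run_hybrid.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the poster's class.
struct Data {
    int io_done = 0;
    int compute_done = 0;
    double result = 0.0;

    // Imitates the I/O-bound call: the thread mostly waits.
    void Function1() {
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
        io_done = 1;
    }
    // Imitates the compute-bound call: the thread keeps a core busy.
    void Function2() {
        double acc = 0.0;
        for (int k = 1; k <= 100000; ++k) acc += 1.0 / k;
        result = acc;
        compute_done = 1;
    }
};

// One dedicated thread per I/O call, plus a bounded set of workers that
// share the compute calls. Returns the number of calls that completed.
int run_hybrid(std::vector<Data>& items, unsigned compute_workers) {
    std::vector<std::thread> io_threads;
    for (Data& d : items) io_threads.emplace_back(&Data::Function1, &d);

    if (compute_workers == 0) compute_workers = 1;
    std::atomic<std::size_t> next{0};  // next item whose Function2 is unclaimed
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < compute_workers; ++w) {
        workers.emplace_back([&items, &next] {
            for (std::size_t i = next.fetch_add(1); i < items.size();
                 i = next.fetch_add(1)) {
                items[i].Function2();
            }
        });
    }

    for (std::thread& t : io_threads) t.join();
    for (std::thread& t : workers) t.join();

    int total = 0;
    for (const Data& d : items) total += d.io_done + d.compute_done;
    return total;
}
```

The sketch relies on the two functions being independent, as stated in the original question: Function1 and Function2 run concurrently on the same object but touch different members.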


However, an important lesson in threading is that once you have more threads than processors, performance begins to degrade. So using 24 threads is normally worse than using 8: in theory the threads spend relatively more time context switching than doing work, which is a fairly heavy overhead, and in practice it is much worse because of thrashing.
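
One way to act on that rule of thumb is to cap the number of compute threads at what the machine reports. The helper names below are hypothetical; std::thread::hardware_concurrency() is standard, but it is only a hint and may return 0 when the value cannot be determined.

```cpp
#include <algorithm>
#include <thread>

// Caps the worker count for CPU-bound work at the number of hardware
// threads. A hardware_threads value of 0 means "unknown" and is treated as 1.
unsigned pick_worker_count(unsigned task_count, unsigned hardware_threads) {
    if (task_count == 0) return 0;
    const unsigned cores = std::max(1u, hardware_threads);
    return std::min(task_count, cores);
}

// Convenience wrapper that asks the standard library for the hint.
unsigned pick_worker_count_for_this_machine(unsigned task_count) {
    return pick_worker_count(task_count, std::thread::hardware_concurrency());
}
```

On the 8-core machine from the question, pick_worker_count(24, 8) yields 8 workers for the 24 calls. Threads that mostly wait on I/O are a different case, which is why the suggestion above treats them separately.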


jimdempseyatthecove
Honored Contributor III

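Try OpenMP in this manner:

[cpp]    int i;
    #pragma omp parallel
    {
        // schedule(static, 1) deals iterations out round-robin, one at a time;
        // nowait lets a thread start the second loop without waiting at the end of the first.
        #pragma omp for schedule(static, 1) nowait
        for (i = 0; i < 12; i++)
            S[i].Function1();

        #pragma omp for schedule(static, 1) nowait
        for (i = 0; i < 12; i++)
            S[i].Function2();
    }
[/cpp]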

Also consider enabling OpenMP nested parallelism to permit your functions to contain parallel regions.

Jim Dempsey