
IPP Threading: Possible to use different num threads for calls in same process?

Is it possible to change the number of threads that IPP uses dynamically (on a call by call basis)?

Background:
In the same process, we make many calls to IPP arithmetic functions (add, subtract), as well as a call to ippiConvValid_32f_C1R.

Each iteration, the add/subtract functions are called a LOT on relatively small data sizes (say 8,000 values or fewer), and the convolution routine is called once on a very large data size (say 8,000,000 values).

These operations run separately, so there should be no thread contention between them.

With ippSetNumThreads set to 1, the adds and subtracts run very fast, but the convolution is slow.
With ippSetNumThreads set to the number of cores, the adds and subtracts run very slowly, but the convolution is fast.

Our guess at this point is that when there are existing threads, IPP is threading out these small adds and subtracts, which is not effective for their sizes. Additionally, we would prefer to do the threading logic ourselves for these calls. However, we do not want to do the threading logic for the convolution call.
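
One possible reading of "do the threading logic ourselves" is to keep IPP single-threaded for the small calls and spread the many independent add/subtract calls across application threads. The sketch below only shows the partitioning step, in plain C. The names `job_range_t` and `jobs_for_worker` are hypothetical helpers, not part of IPP, and the thread creation itself is left out.

```c
#include <stddef.h>

/* Half-open range [begin, end) of job indices assigned to one worker.
   Hypothetical helper type, not part of IPP. */
typedef struct {
    size_t begin;
    size_t end;
} job_range_t;

/* Split njobs independent jobs across nworkers as evenly as possible.
   The first (njobs % nworkers) workers receive one extra job.
   Assumes nworkers > 0 and worker < nworkers. */
static job_range_t jobs_for_worker(size_t njobs, size_t nworkers, size_t worker)
{
    size_t base = njobs / nworkers;   /* jobs every worker gets */
    size_t extra = njobs % nworkers;  /* leftover jobs to hand out */
    job_range_t r;

    r.begin = worker * base + (worker < extra ? worker : extra);
    r.end = r.begin + base + (worker < extra ? 1 : 0);
    return r;
}
```

Each worker would then loop over its own range and call the small IPP add/subtract functions with IPP's internal threading set to 1, so the only parallelism for those calls is the one the application controls.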


Hello,

IPP does not prevent you from calling ippSetNumThreads several times with different values, so it should be possible to change the number of threads on a call-by-call basis.

Regards,
Vladimir

Quoting - gbraytx
Is it possible to change the number of threads that IPP uses dynamically (on a call by call basis)?

// To get the most out of the CPU, try something like the following:
ippSetNumThreads(1);            // small arrays: keep IPP single-threaded
ippiAdd_8u_C1RSfs(..);

int numCore = ippGetNumCoresOnDie();
ippSetNumThreads(numCore);      // large convolution: let IPP use all cores
ippiConvValid_32f_C1R(..);

// and back again
ippSetNumThreads(1);            // restore single-threaded mode for the small calls
ippiAdd_8u_C1RSfs(..);
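
The set/call/restore pattern above can be wrapped in a small helper so the thread count is always put back after the large call. This is a minimal sketch, not IPP code: `set_num_threads` and `get_num_threads` are hypothetical stand-ins so the example compiles without IPP, and `run_with_threads` and `record_threads` are illustrative names. In a real build the stand-ins would presumably be replaced by ippSetNumThreads and the matching getter in your IPP version (ippGetNumThreads, if available).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the library's thread-count setter and getter.
   They only store a value so that this sketch is self-contained. */
static int g_num_threads = 1;

static void set_num_threads(int n) { g_num_threads = n; }
static int get_num_threads(void) { return g_num_threads; }

/* Signature of a unit of work, e.g. a function that runs the convolution. */
typedef void (*work_fn)(void *ctx);

/* Run fn(ctx) with the thread count set to n, then restore the
   thread count that was in effect before the call. */
static void run_with_threads(int n, work_fn fn, void *ctx)
{
    int saved = get_num_threads();

    set_num_threads(n);
    fn(ctx);
    set_num_threads(saved);
}

/* Demo work function: records the thread count it observed into *ctx. */
static void record_threads(void *ctx)
{
    *(int *)ctx = get_num_threads();
}
```

With this shape, the process stays at one thread by default for the small add/subtract calls, and only the convolution is wrapped, for example `run_with_threads(numCore, do_convolution, &args)`, where `do_convolution` and `args` are placeholders for your own code.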