Is it possible to change the number of threads that IPP uses dynamically (on a call by call basis)?
Background:
In the same process, we make many calls to IPP arithmetic functions (add, subtract), as well as one call to ippiConvValid_32f_C1R.
In each iteration, the add/subtract functions are called a lot, on relatively small data sizes (say 8,000 values or fewer), and the convolution routine is called once on a very large data size (say 8,000,000 values).
These operations occur separately, so there should be no thread contention between them.
With ippSetNumThreads set to 1, the adds and subtracts run very fast, but the convolution is slow.
With ippSetNumThreads set to the number of cores, the adds and subtracts run very slowly, but the convolution is fast.
Our guess at this point is that when extra threads are available, IPP splits these small adds and subtracts across them, which is not effective at these sizes. Additionally, we would prefer to do the threading logic ourselves for these calls. However, we do not want to do the threading logic for the convolution call.
2 Replies
Hello,
IPP does not prevent you from calling ippSetNumThreads several times with different values, so it should be possible to change the number of threads on a call-by-call basis.
Regards,
Vladimir
Quoting - gbraytx
Is it possible to change the number of threads that IPP uses dynamically (on a call by call basis)?
Background:
In the same process, we are making lots of calls to IPP arithmetic calls (add, subtract), as well as a call to ippiConvValid_32f_C1R.
Each iteration, the add/subtract calls are called a LOT and used on relatively small data sizes (say 8,000 or less values), and the convolution routine is called once on a very large data size (say 8,000,000 values).
These operations occur separately and there should be no thread contention in between them.
With ippSetNumThreads set to 1 the adds and subtracts perform very fast, but the convolution is slow.
With ippSetNumThreads set to the core amount the adds and subtracts perform very slow but the convolution is fast.
Our guess at this point is that when there are existing threads, IPP is threading out these small adds and subtracts, which is not effective for their sizes. Additionally, we would prefer to do the threading logic ourselves for these calls. However, we do not want to do the threading logic for the convolution call.
// To get the most out of the CPU, try something like the following:
ippSetNumThreads(1);                  // small arithmetic calls: keep IPP single-threaded
ippiAdd_8u_C1RSfs(..);
int numCore = ippGetNumCoresOnDie();  // number of cores reported by IPP
ippSetNumThreads(numCore);            // large convolution: let IPP use all cores
ippiConvValid_32f_C1R(..);
// and back again
ippSetNumThreads(1);
ippiAdd_8u_C1RSfs(..);
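
If the switch-and-restore pattern above is needed in several places, it could be wrapped in a small helper. The following is only a sketch, not something from the IPP library: run_with_ipp_threads, work_fn and record_threads are hypothetical names, and the stand-in block exists only so the code compiles without IPP installed (define USE_IPP to build against the real ippcore.h). It assumes ippGetNumThreads and ippSetNumThreads behave as in the threaded IPP libraries discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_IPP
#include <ippcore.h>   /* ippSetNumThreads, ippGetNumThreads */
#else
/* Stand-ins (NOT part of IPP) so the sketch compiles without the library.
   They mimic the shape of the ippcore calls and only store the value. */
typedef int IppStatus;
enum { ippStsNoErr = 0 };
static int g_fake_threads = 1;
static IppStatus ippSetNumThreads(int n) { g_fake_threads = n; return ippStsNoErr; }
static IppStatus ippGetNumThreads(int *p) { *p = g_fake_threads; return ippStsNoErr; }
#endif

/* Hypothetical callback type: the work to run under a given thread count. */
typedef void (*work_fn)(void *ctx);

/* Hypothetical helper: run work(ctx) with IPP set to num_threads threads,
   then restore the thread count that was in effect before the call. */
IppStatus run_with_ipp_threads(int num_threads, work_fn work, void *ctx)
{
    int previous = 1;
    IppStatus st;

    assert(work != NULL && num_threads >= 1);

    /* IPP reports errors as negative status codes; positive values are warnings. */
    st = ippGetNumThreads(&previous);
    if (st < 0) return st;

    st = ippSetNumThreads(num_threads);
    if (st < 0) return st;

    work(ctx);

    return ippSetNumThreads(previous);
}

/* Example workload for illustration: stores the thread count seen while it
   runs. A real workload would call ippiConvValid_32f_C1R here instead. */
static void record_threads(void *ctx)
{
    ippGetNumThreads((int *)ctx);
}
```

With this in place, the small adds and subtracts can stay at ippSetNumThreads(1), and only the convolution is passed through run_with_ipp_threads with the core count. Whether the extra get/set calls per iteration are cheap enough is something to measure on the actual workload.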
