Attached is a simple console project showing that, when scaling a matrix, ippiScaleC_32s32f_C1R is slower than an equivalent plain C++ loop. In this example, one column of the source matrix is scaled into an output vector. On my PC with an i7-7700K, the C++ loop is about 20% faster.
Is there any way to improve the ippiScaleC performance?
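For readers without the attachment, the kind of loop being compared might look like the sketch below. It assumes a row-major 32-bit integer matrix and the ippiScaleC convention dst = src * factor + shift. The function name ScaleColumnLoop and its parameters are hypothetical and are not taken from the attached project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the attached project): scales one column of a
// row-major int32 matrix into a float vector,
// dest[r] = source[r][column] * factor + shift.
std::vector<float> ScaleColumnLoop(const std::vector<std::int32_t>& source,
                                   std::size_t nRows, std::size_t nColumns,
                                   std::size_t column,
                                   double factor, double shift)
{
    std::vector<float> dest(nRows);
    for (std::size_t r = 0; r < nRows; ++r)
    {
        // Consecutive elements of one column are nColumns elements apart.
        const std::int32_t value = source[r * nColumns + column];
        dest[r] = static_cast<float>(value * factor + shift);
    }
    return dest;
}
```

Each iteration reads one element and then skips nColumns elements, so the source access is strided while the destination is contiguous.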
Andrey, how could we check this case? There is no reproducer attached to this thread.
Sorry, I thought I had attached the zip. Here it is.
Yes, it seems there is a problem on the IPP side, and this function needs further optimization. We will escalate the case.
Hi Andreypir!
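IPP works best with rectangular ROIs whose rows are wide enough to fill whole SIMD registers on each load. A one-column ROI, as your test presumably uses, leaves only one element per row.
But could you please replace this in your code
    for (int n = 0; n < NTESTS_SCALE; n++)
    {
        ScaleWithIPP(Source, nColumns, Dest, nRows, Factor, Shift);
    }
with the following code? I see some speedup on my 64-bit Skylake system.
    for (int n = 0; n < NTESTS_SCALE; n++)
    {
        int dLen = 0;
        int phase = 0;
        // Gather one column: keep every nColumns-th 32-bit element.
        // The cast lets the float routine move the integers without converting them.
        ippsSampleDown_32f((Ipp32f*)Source, nColumns*nRows, Dest, &dLen, nColumns, &phase);
        // Scale the gathered column in place as a single contiguous row of nRows elements.
        IppiSize roiSize = { nRows, 1 };
        ippiScaleC_32s32f_C1R((Ipp32s*)Dest, nRows * sizeof(__int32), Factor, Shift, Dest, sizeof(float), roiSize, ippAlgHintFast);
    }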
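The idea behind this replacement can be sketched in plain C++ without IPP: first pack the strided column into a contiguous buffer, then scale that buffer in a second pass. This is only an illustration under assumptions. The names GatherColumn and ScaleContiguous are hypothetical, and unlike the IPP snippet above, which reuses Dest in place, the sketch uses a separate temporary buffer to avoid type punning.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pass 1 (hypothetical helper): copy every nColumns-th element, starting at
// `column`, into a contiguous buffer. This plays the role that
// ippsSampleDown_32f plays in the snippet above.
std::vector<std::int32_t> GatherColumn(const std::vector<std::int32_t>& source,
                                       std::size_t nRows, std::size_t nColumns,
                                       std::size_t column)
{
    std::vector<std::int32_t> packed(nRows);
    for (std::size_t r = 0; r < nRows; ++r)
    {
        packed[r] = source[r * nColumns + column];
    }
    return packed;
}

// Pass 2 (hypothetical helper): scale a contiguous int32 buffer into floats,
// dest[i] = packed[i] * factor + shift. Both arrays are walked with unit
// stride, which is the access pattern a vectorized routine handles best.
std::vector<float> ScaleContiguous(const std::vector<std::int32_t>& packed,
                                   double factor, double shift)
{
    std::vector<float> dest(packed.size());
    for (std::size_t i = 0; i < packed.size(); ++i)
    {
        dest[i] = static_cast<float>(packed[i] * factor + shift);
    }
    return dest;
}
```

Calling ScaleContiguous(GatherColumn(source, nRows, nColumns, 0), factor, shift) gives the same values as scaling the column directly. Whether the extra pass pays off depends on the matrix size and the hardware, so it is worth measuring both variants.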
Thanks.
Hi Andrey,
Thank you. Yes, downsampling and then scaling is substantially faster than scaling alone, and somewhat faster than the loop. On my computer:
ScaleWithLoop: 828
ScaleWithIPP: 625
I had hoped for a bigger gain, but this will work.
