Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippirotate performance

Lamp
Beginner
hi,

For some reason, I have to use ippiRotate_8u_C3R on a 4008 x 2672 RGB image, but this function takes more than 1 second to process it... it's crazy.

The CPU we are using is an i7-2600. Since ippiRotate is not threaded, I guess it runs on only one core.

Is there any way to make it faster, or to replace it with other functions?
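One option that follows from the single-core observation is to split the destination image into horizontal bands and rotate each band on its own thread. The sketch below only shows the band arithmetic in plain C++. `RowBand` and `splitRows` are hypothetical names, and passing each band as the destination ROI of a separate ippiRotate call is an untested assumption, not something confirmed in this thread.

```cpp
#include <algorithm>
#include <vector>

// One horizontal band of the destination image (hypothetical helper type).
struct RowBand {
    int y;       // first row of the band
    int height;  // number of rows in the band
};

// Split imageHeight rows into at most bandCount contiguous bands whose
// heights differ by no more than one row.
std::vector<RowBand> splitRows(int imageHeight, int bandCount) {
    std::vector<RowBand> bands;
    if (imageHeight <= 0 || bandCount <= 0) return bands;
    bandCount = std::min(bandCount, imageHeight);
    const int base = imageHeight / bandCount;
    const int extra = imageHeight % bandCount;  // first `extra` bands get one more row
    int y = 0;
    for (int i = 0; i < bandCount; ++i) {
        const int h = base + (i < extra ? 1 : 0);
        bands.push_back({ y, h });
        y += h;
    }
    return bands;
}
```

For the 2672-row image and the four physical cores of an i7-2600, `splitRows(2672, 4)` gives four bands of 668 rows each.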

1 Solution
SergeyKostrov
Valued Contributor II
It would be nice to see a test case. Could you provide it?

I think that the performance impact is a combination of the following:

- 3 planes of 4008x2672 each;

- some impact from the performance of the 'sin' and 'cos' functions used internally by IPP, when angles are not
90, 180 or 270 degrees. You know how a new pixel's position is calculated: wx = x*cosA - y*sinA and
wy = x*sinA + y*cosA. So, you have 4 multiplications and 2 adds/subs. In total, there are more than 192
million FP operations to rotate an image by 5 degrees, for example;

- the performance of the interpolation algorithm.
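To make the arithmetic above concrete, here is a minimal C++ sketch of the per-pixel coordinate transform and of the operation count. `rotatePoint` and `estimateFpOps` are made-up names. Counting the transform once per plane is an assumption chosen because it reproduces the 192 million figure; how IPP actually computes coordinates is not stated in this thread.

```cpp
#include <cmath>
#include <cstdint>

struct Point { double x, y; };

// Rotate (x, y) about the origin with the formulas quoted above:
// 4 multiplications and 2 additions/subtractions per call.
Point rotatePoint(double x, double y, double angleDeg) {
    const double kPi = 3.14159265358979323846;
    const double a = angleDeg * kPi / 180.0;
    const double c = std::cos(a);
    const double s = std::sin(a);
    return { x * c - y * s, x * s + y * c };
}

// Rough estimate: 6 FP operations per pixel per plane (an assumption).
std::int64_t estimateFpOps(std::int64_t width, std::int64_t height, int planes) {
    const std::int64_t opsPerPixel = 4 + 2;
    return width * height * planes * opsPerPixel;
}
```

`estimateFpOps(4008, 2672, 3)` evaluates to 192,768,768, which matches "more than 192 million". The estimate ignores the interpolation work, which the timings later in the thread suggest is the larger cost.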

I see more and more cases like yours since image sizes have dramatically increased. Nikon
recently announced a 36MP DSLR camera. Nokia announced a 41MP smartphone. I don't know if Intel
Software Engineers ever tested IPP with images like the ones I mentioned above. I'm personally interested in
processing 32Kx32K images and larger.

By the way, I saw significantly worse performance from Microsoft's Imaging API. It is simply impossible
to use!

Best regards,
Sergey


11 Replies
Lamp
Beginner

It looks like the interpolation mode affects the time a lot.

Test on my laptop:

IPPI_INTER_CUBIC   2.4 s
IPPI_INTER_LINEAR  0.7 s
IPPI_INTER_NN      0.2 s

Is there a big difference in the quality of the returned image?
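As a rough illustration of where the time and quality differences come from, the sketch below samples a tiny single-channel image at a fractional position with nearest-neighbour and with bilinear interpolation. It is not IPP's implementation. `Gray`, `sampleNearest` and `sampleLinear` are hypothetical names, and the caller is assumed to keep coordinates inside the image.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Tiny single-channel image, row-major, used only for illustration.
struct Gray {
    int width, height;
    std::vector<double> px;
    double at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return px[y * width + x];
    }
};

// Nearest neighbour: one read, takes the closest source pixel as-is.
double sampleNearest(const Gray& img, double x, double y) {
    const int xi = static_cast<int>(std::lround(x));
    const int yi = static_cast<int>(std::lround(y));
    return img.at(xi, yi);
}

// Bilinear: four reads, blended by the fractional part of the position.
double sampleLinear(const Gray& img, double x, double y) {
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const double fx = x - x0;
    const double fy = y - y0;
    const double top = img.at(x0, y0) * (1 - fx) + img.at(x0 + 1, y0) * fx;
    const double bot = img.at(x0, y0 + 1) * (1 - fx) + img.at(x0 + 1, y0 + 1) * fx;
    return top * (1 - fy) + bot * fy;
}
```

On a 2x2 image with rows {0, 100} and {0, 100}, sampling at x = 0.4 returns 0 with nearest-neighbour and about 40 with bilinear. Nearest-neighbour snaps to whole pixels, which tends to show up as jagged edges after a rotation, while bilinear blends neighbours at the cost of more reads and arithmetic. Bicubic interpolation typically reads a 4x4 neighbourhood, which is consistent with it being the slowest mode in the timings above.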

Thomas_Jensen1
Beginner
Are you doing 90 degree angles, or any angles?
Lamp
Beginner
Any angle; between -1 and 1 degrees in most cases.
Lamp
Beginner
The type "Matrix" contains the size of the image and a byte array.

[cpp]
Matrix^ IppBasicOperator::Rotate(Matrix^ source, double angle)
{
    Matrix^ ans = gcnew Matrix(source->Size);

    // source->Size.Width is divided by the pixel size to get the width in pixels
    IppiSize size = { source->Size.Width / source->PixelSize, source->Size.Height };
    IppiRect roi = { 0, 0, source->Size.Width, size.height };

    // Pin the matrix data
    // (the pin_ptr element type was lost in the original post; Byte is assumed)
    pin_ptr<Byte> pSource = &(source->GetBuffer()[0]);
    pin_ptr<Byte> pDest = &(ans->GetBuffer()[0]);

    double xCenter = size.width / 2, yCenter = size.height / 2, xShift = 0, yShift = 0;
    ippiGetRotateShift(xCenter, yCenter, angle, &xShift, &yShift);

    int intep;
    if (angle == int::MaxValue) {
        // for debug test purpose
        intep = IPPI_INTER_NN;
    } else if (angle == int::MinValue) {
        // for debug test purpose
        intep = IPPI_INTER_CUBIC;
    } else {
        intep = IPPI_INTER_LINEAR;
    }

    if (source->PixelFormat == PixelFormat::Format24bppRgb)
        ippiRotate_8u_C3R(pSource, size, roi.width, roi, pDest, roi.width, roi, angle, xShift, yShift, intep);
    else
        ippiRotate_8u_C1R(pSource, size, size.width, roi, pDest, size.width, roi, angle, xShift, yShift, intep);

    return ans;
}
[/cpp]
Thomas_Jensen1
Beginner
It appears that you do not use IPP memory allocation functions.
If I'm correct, there is a speed penalty when the image buffer does not start on a 16-byte boundary, and when its scanline width in bytes is not a multiple of 16 (padding at the right).

This is because IPP uses SSE for most operations.

Use ippiMalloc_8u_C1(w, h, &step) to allocate your source and destination buffers (or ippiMalloc_8u_C3 for the RGB case). step is an output value: the scanline width in bytes, including padding.
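The two conditions described above (buffer start on a 16-byte boundary, scanline width a multiple of 16 bytes) can be checked with a few lines of plain C++. This is only a sketch: `paddedStep` and `isAligned` are hypothetical helpers, and the 16-byte figure is taken from this post rather than from the IPP documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round a scanline length in bytes up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
std::size_t paddedStep(std::size_t rowBytes, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (rowBytes + alignment - 1) & ~(alignment - 1);
}

// True if the address p is a multiple of `alignment`.
bool isAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For the image in this thread, one RGB scanline is 4008 * 3 = 12024 bytes, which is not a multiple of 16; `paddedStep(12024, 16)` returns 12032, so a padded layout would carry 8 unused bytes at the end of each row.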
Thomas_Jensen1
Beginner
It is not really required to use FP to rotate by an arbitrary angle. One can use a combination of skews (shears).

I do not know how IPP implements rotation though.
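One well-known way to build a rotation from skews is the three-shear decomposition: a horizontal shear by -tan(A/2), a vertical shear by sin(A), and a second horizontal shear by -tan(A/2). Whether this is the variant meant above is an assumption. The sketch below only checks, on a single point, that the three shears reproduce the usual rotation formula; `Vec2`, `shearX`, `shearY`, `rotateByShears` and `rotateDirect` are illustrative names.

```cpp
#include <cmath>

struct Vec2 { double x, y; };

// Horizontal shear: x moves by an amount proportional to y.
Vec2 shearX(Vec2 p, double k) { return { p.x + k * p.y, p.y }; }

// Vertical shear: y moves by an amount proportional to x.
Vec2 shearY(Vec2 p, double k) { return { p.x, p.y + k * p.x }; }

// Rotation by angleRad expressed as shearX, then shearY, then shearX.
Vec2 rotateByShears(Vec2 p, double angleRad) {
    const double t = -std::tan(angleRad / 2.0);
    const double s = std::sin(angleRad);
    return shearX(shearY(shearX(p, t), s), t);
}

// Reference: wx = x*cosA - y*sinA, wy = x*sinA + y*cosA.
Vec2 rotateDirect(Vec2 p, double angleRad) {
    const double c = std::cos(angleRad);
    const double s = std::sin(angleRad);
    return { p.x * c - p.y * s, p.x * s + p.y * c };
}
```

On an image, each shear becomes a pass that shifts whole rows or whole columns, so the per-pixel 2D transform is replaced by three 1D passes. For angles close to 180 degrees the tan(A/2) term grows without bound, so this form suits small angles such as the -1 to 1 degree range mentioned earlier in the thread.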
SergeyKostrov
Valued Contributor II
Quoting Lamp
...
// Pin the matrix data
pin_ptr<Byte> pSource = &(source->GetBuffer()[0]);
pin_ptr<Byte> pDest = &(ans->GetBuffer()[0]);
...

I would look at these two assignments because in both cases smart-pointers are used.

Could you try something like:

__declspec( align( 16 ) ) Byte *pSource = ...
__declspec( align( 16 ) ) Byte *pDest = ...

Best regards,
Sergey
SergeyKostrov
Valued Contributor II
Quoting Thomas_Jensen1
It appears that you do not use IPP memory allocation functions...

They could be called in other methods of the IppBasicOperator class.

I also agree that the alignment of pointers to image data has to be taken into consideration.

Best regards,
Sergey
Lamp
Beginner
I'll try to align the data, but I don't know if it's possible. Since it's a .NET application, I have to pin the buffer so the GC will not move it in memory.

regards
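One general technique for getting an aligned start inside a buffer whose base address you do not control is to allocate a few extra bytes and begin at the first aligned address inside it. The sketch below shows this in standard C++ with `std::align`. `alignedStart` is a hypothetical helper, and carrying the idea over to a pinned managed array (over-allocate by 15 bytes, pin, then offset the pointer) is an untested assumption. As far as I know, `__declspec(align(16))` on a pointer declaration aligns the pointer variable itself, not the data it points to, so it would not change where the image data starts.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Returns the first address inside `storage` that is a multiple of
// `alignment` and still has `needed` bytes available after it,
// or nullptr if the buffer is too small.
unsigned char* alignedStart(std::vector<unsigned char>& storage,
                            std::size_t needed, std::size_t alignment) {
    void* p = storage.data();
    std::size_t space = storage.size();
    return static_cast<unsigned char*>(std::align(alignment, needed, p, space));
}
```

A buffer of `needed + alignment - 1` bytes is always large enough for the aligned region to fit, whatever the base address turns out to be.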

SergeyKostrov
Valued Contributor II
Quoting Lamp
...for some reason, I have to use ippiRotate_8u_C3R on a 4008 x 2672 RGB image, but this function takes more than 1 second to process it...

What software did you use to measure the performance of the ippiRotate_8u_C3R function?
Did you do this in a Debug or Release configuration?

Best regards,
Sergey
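For reference, a measurement that is less sensitive to one-off costs can be sketched in standard C++ as below. `bestTimeMs` is a hypothetical helper: it runs the work once as a warm-up, then times several more runs and keeps the fastest. The call being timed would be wrapped in the callable, and the build should be an optimized (Release) one.

```cpp
#include <chrono>
#include <functional>

// Runs `work` once as a warm-up, then `runs` more times, and returns the
// fastest timed run in milliseconds. Returns -1.0 if runs < 1.
double bestTimeMs(const std::function<void()>& work, int runs) {
    using clock = std::chrono::steady_clock;
    work();  // warm-up: first-call initialisation and cache effects are not timed
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        const double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}
```

Taking the minimum rather than the average reduces the influence of other processes that happen to run during the measurement.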