Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Field copy with Alpha

BatterseaSteve
Beginner
373 Views

Hi 

I have a question

I have a source buffer of interlaced, interleaved YUV with alpha

i.e. UYAVYA - 1920x1080

I need to strip the alpha, extract one field, resize it to half size and convert it to 4:2:0 planar

as a single buffer (i.e. not 3 separate planes but a single buffer of concatenated Y, U, V).

Can anyone recommend the fastest set of methods to do this? 

Cheers

Steve
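For readers unsure what "a single buffer of concatenated Y, U, V" means in practice, here is a minimal sketch of the plane offsets inside one contiguous 4:2:0 planar buffer. The struct and function names are hypothetical, and it assumes 8-bit samples, even dimensions, no row padding, and Y first, then U, then V.

```c
#include <stddef.h>

/* Hypothetical helper type: byte offsets of the three planes inside one
   contiguous 4:2:0 planar buffer laid out as Y, then U, then V. */
typedef struct {
    size_t y_offset;
    size_t u_offset;
    size_t v_offset;
    size_t total_bytes;
} I420Layout;

/* Assumes 8 bits per sample, even width and height, and no row padding. */
static I420Layout i420_layout(size_t width, size_t height)
{
    I420Layout l;
    size_t y_bytes = width * height;             /* full-resolution luma */
    size_t c_bytes = (width / 2) * (height / 2); /* each chroma plane is quarter size */

    l.y_offset = 0;
    l.u_offset = y_bytes;
    l.v_offset = y_bytes + c_bytes;
    l.total_bytes = y_bytes + 2 * c_bytes;
    return l;
}
```

The post does not state the final output size. If it were 960x540 (an assumption), U would start at byte 518400, V at byte 648000, and the buffer would be 777600 bytes in total.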

13 Replies
BatterseaSteve
Beginner

The main problem I have with this is actually the alpha strip. i.e. getting the UYAVYA into UYVY.

I tried the following:

IppiSize uyvyRoi = {1,1920*1080};

this->uyvyBuffer = ippsMalloc_8u(1920*1080*sizeof(Ipp8u)*2);

ippiCopy_16u_C1R((Ipp8u *)uyavyaBuffer, 3, this->uyvyBuffer, 2, uyvyRoi);

but this is incredibly slow (15msec!)

I am struggling to find anything else that will strip a single channel from a 3 channel image, or copy a 3 channel image to a 2 channel image.

Any help appreciated.

Steve

Sergey_K_Intel
Employee

Hi Steve,

For stripping alpha channel values you can probably use ippiSwapChannels_8u_C4C3R function. There's dstOrder[3] array, specifying the order of output channels.

Regards,
Sergey 

Igor_A_Intel
Employee

Hi Steve,

as finally you need planar format - you can try to use ippiCopy_8u_C4C1R 3 times - for Y, U and V

regards, Igor

BatterseaSteve
Beginner

Hi Sergey

Thanks for responding. As far as I can tell, the ippiSwapChannels_8u_C4C3R will only swap 4 component to 3 component - I need 3 component to 2 component.

Cheers

Steve

BatterseaSteve
Beginner

OK

I have most of this worked out - except the alpha strip

As I said above, I can use ippiCopy_16u_C1R - but this is taking a long time - 15msecs

Can anyone tell me why?

Cheers

SergeyKostrov
Valued Contributor II

>>As I said above, i can use ippiCopy_16u_C1R - but this is taking a long time - 15msecs
>>
>>Can anyone tell me why?

You didn't tell us anything about the hardware, that is, the CPU. My question is: are you using the right CPU dispatching DLL for the image processing domain?

BatterseaSteve
Beginner

Hi Sergey

Thanks for replying 

CPU is E2660, machine is z820 dual CPU 32 cores. So no slouch! Running Windows 7 64-bit.

Compiled on vs2010 64 bit

As far as I know I am using auto dispatching. 

I will check though. 

SergeyKostrov
Valued Contributor II

Take a look at the Intel® Integrated Performance Primitives for Windows* OS User's Guide:
...
Page 19, Intel® Integrated Performance Primitives Theory of Operation, Dispatching
...
There is a table, "Identification of Codes Associated with Processor-Specific Libraries".

BatterseaSteve
Beginner

Hi Sergey

I run ippInit since I am using static Libs

I check the cpuType from ippGetCpuType and it returns ippCpuAVX which is correct for my CPU I think.

Cheers

Steve

BatterseaSteve
Beginner

Hi

Digging a bit deeper reveals the problem to be due to the stepSize I am using.

If I run:

IppiSize uyvyRoi = {1,1920*1080};

uyavyaBuffer = ippsMalloc_8u(1920*1080*sizeof(Ipp8u)*3);

uyvyBuffer = ippsMalloc_8u(1920*1080*sizeof(Ipp8u)*2);

ippiCopy_16u_C1R((Ipp8u *)uyavyaBuffer, 3, uyvyBuffer, 2, uyvyRoi);

I get anything from 7-15msecs.

If I run:

IppiSize uyvyRoi = {1920,1080};

uyavyaBuffer = ippsMalloc_8u(1920*1080*sizeof(Ipp8u)*3);

uyvyBuffer = ippsMalloc_8u(1920*1080*sizeof(Ipp8u)*2);

ippiCopy_16u_C1R((Ipp8u *)uyavyaBuffer, 1920*2, uyvyBuffer, 1920*2, uyvyRoi);

I get 0.2msecs......

So the point is that trying to strip the alpha by treating the source buffer as a 3x(1920x1080) buffer and the dest as a 2x(1920x1080) buffer is bad. Which puts me back to square one... Anyone got any other ideas - 'cause I can write a C loop that's quicker.

Cheers Steve
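The "C loop" mentioned above could look roughly like the sketch below. It is not the poster's code: the function name and parameters are hypothetical, and it assumes tightly packed rows with 3 bytes per pixel in the source (U Y A, V Y A, ...) and 2 bytes per pixel in the destination (U Y, V Y, ...). It also takes only every other row, so the alpha strip and the field extraction happen in one pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy one field of a packed UYAVYA frame into a packed
   UYVY buffer, dropping the alpha byte of every pixel.
   field = 0 takes rows 0, 2, 4, ...; field = 1 takes rows 1, 3, 5, ...
   Assumes no row padding in either buffer. dst must hold
   width * 2 bytes for each row of the selected field. */
static void strip_alpha_one_field(const uint8_t *src, size_t width, size_t height,
                                  int field, uint8_t *dst)
{
    for (size_t y = (size_t)field; y < height; y += 2) {
        const uint8_t *s = src + y * width * 3; /* start of this source row */
        for (size_t x = 0; x < width; ++x) {
            *dst++ = s[0]; /* U or V */
            *dst++ = s[1]; /* Y */
            s += 3;        /* skip the alpha byte */
        }
    }
}
```

One possible reading of the timings above is that an ROI of {1, 1920*1080} makes the copy run as about two million one-pixel rows, so per-row overhead dominates. That is an interpretation of the numbers, not something confirmed in the thread.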

Igor_A_Intel
Employee

Hi Steve,

as I've already answered, you should use ippiCopy_8u_C3C1R 2 times (as finally you need planar format). This function extracts any channel you need from a C3 image (C4C1 does the same for a C4 image).

regards, Igor
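In plain C terms, a 3-channel-to-1-channel copy of the kind described here picks every third byte, starting at the chosen channel offset. The sketch below illustrates that behaviour only. It is not IPP's implementation, the function name is hypothetical, and it assumes a tightly packed buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration of a "copy one channel out of an interleaved
   image" operation on a tightly packed buffer.
   channels: number of interleaved channels per pixel (3 for UYA VYA data)
   channel:  zero-based index of the channel to extract */
static void extract_channel(const uint8_t *src, size_t channels, size_t channel,
                            size_t pixel_count, uint8_t *dst)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[i] = src[i * channels + channel];
    }
}
```

If the UYAVYA data is treated as 3-channel pixels (U Y A), (V Y A), ..., channel 1 yields the Y samples and channel 0 yields alternating U and V samples.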

BatterseaSteve
Beginner

Hi Igor

Thanks for replying and sorry - I did see your comment before and thought I had replied.

The problem with ippiCopy_8u_C3C1R is that it extracts the UYA VYA image into Y's and UV's.

I tried this as you suggested but I need the Y's the U's and the V's in separate buffers.

So the question remains as to how I extract the UV's into separate U and V buffers?

I have looked at the format conversions to see if there was one that would take NV12 422 and convert to YUV420p but I could not see one.

I've been on this a week now and am now delving into trying to use SSE!

Any thoughts gratefully received?

Cheers

Steve

BatterseaSteve
Beginner

Hi

I solved this using a bit of bespoke SSE to split the UV buffer into separate U and V buffers.

Total time seems acceptable.

Cheers

Steve
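The SSE code itself was not posted. As a rough idea of what such a split might look like, here is a sketch using SSE2 intrinsics with a scalar fallback. It is not the poster's routine: the function name is hypothetical, and it assumes the input is a buffer of interleaved U, V byte pairs.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2_SPLIT 1
#endif

/* Hypothetical helper: de-interleave `pairs` (U, V) byte pairs from `uv`
   into separate `u` and `v` buffers of `pairs` bytes each. */
static void split_uv(const uint8_t *uv, uint8_t *u, uint8_t *v, size_t pairs)
{
    size_t i = 0;
#ifdef HAVE_SSE2_SPLIT
    const __m128i low_byte = _mm_set1_epi16(0x00FF);
    /* Process 16 pairs (32 input bytes) per iteration. */
    for (; i + 16 <= pairs; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(uv + 2 * i));
        __m128i b = _mm_loadu_si128((const __m128i *)(uv + 2 * i + 16));
        /* Each 16-bit lane holds U in its low byte and V in its high byte. */
        __m128i u16 = _mm_packus_epi16(_mm_and_si128(a, low_byte),
                                       _mm_and_si128(b, low_byte));
        __m128i v16 = _mm_packus_epi16(_mm_srli_epi16(a, 8),
                                       _mm_srli_epi16(b, 8));
        _mm_storeu_si128((__m128i *)(u + i), u16);
        _mm_storeu_si128((__m128i *)(v + i), v16);
    }
#endif
    /* Scalar tail, and the full path when SSE2 is not available. */
    for (; i < pairs; ++i) {
        u[i] = uv[2 * i];
        v[i] = uv[2 * i + 1];
    }
}
```

Unaligned loads and stores are used so the sketch works on any buffer. With 16-byte aligned buffers, such as those returned by ippsMalloc_8u, the aligned variants could be substituted.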
