simple image rectangle fillings

gol · 01-25-2009

I'm trying to find the best functions to achieve these pretty basic operations (in order to replace my own old functions - I'd like to get a little speedup):

-filling a rectangle with constant color, with blending
-additive blending with constant color offset, but in both directions (like, -5 on blue, +3 on green, -100 on red)

..but I'm not finding anything useful in IPP.

The rectangle filling would require some kind of block Mul+Add, and surely there's a Mul & Add, but wouldn't it be worth having both at once? Especially since a mul followed by an add raises the problem of precision - I'd prefer the add before the scaling.
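
To make the precision point concrete, here is a minimal plain C sketch of a fused fill with blending. It is not an IPP function; the name `fill_rect_blend_rgb8`, the 0..255 `alpha` convention and the rounding constant are my own assumptions. Both products are summed at 16-bit precision and divided only once, instead of being truncated to 8 bits between a separate multiply and add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of IPP.
   Blends a constant color into a rectangle of an 8-bit, 3-channel,
   pixel-interleaved image:
       dst = (dst*(255-alpha) + color*alpha + 127) / 255
   The sum is formed before the single rounding division, so nothing is
   truncated to 8 bits in between. */
static void fill_rect_blend_rgb8(uint8_t *dst, ptrdiff_t step_bytes,
                                 int width, int height,
                                 const uint8_t color[3], unsigned alpha)
{
    assert(dst != NULL && alpha <= 255u);
    unsigned inv = 255u - alpha;
    /* The color term is constant, so compute it once (rounding included). */
    unsigned add[3] = { color[0] * alpha + 127u,
                        color[1] * alpha + 127u,
                        color[2] * alpha + 127u };
    for (int y = 0; y < height; ++y) {
        uint8_t *row = dst + (ptrdiff_t)y * step_bytes;
        for (int x = 0; x < width; ++x) {
            for (int c = 0; c < 3; ++c) {
                uint8_t *p = &row[3 * x + c];
                /* Maximum is 255*255 + 127, which still divides to 255. */
                *p = (uint8_t)((*p * inv + add[c]) / 255u);
            }
        }
    }
}
```

With `alpha` = 255 the rectangle takes the constant color exactly, and with `alpha` = 0 it is left unchanged.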

As for the additive blending.. there's an Add, a Sub, but no bipolar version of those, for 8u that is. Here too I'd have to use 2 functions.
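
For reference, the "bipolar" operation I mean looks like this in plain C. This is only a scalar sketch with hypothetical names (`clamp_u8`, `add_offset_rgb8`), assuming pixel-interleaved RGB and offsets given in R, G, B order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed intermediate value to the 8-bit range. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper, not part of IPP.
   Adds a signed per-channel offset to every pixel of an 8-bit RGB
   rectangle, saturating at 0 and 255. */
static void add_offset_rgb8(uint8_t *dst, ptrdiff_t step_bytes,
                            int width, int height, const int offset[3])
{
    assert(dst != NULL);
    for (int y = 0; y < height; ++y) {
        uint8_t *row = dst + (ptrdiff_t)y * step_bytes;
        for (int x = 0; x < width; ++x)
            for (int c = 0; c < 3; ++c)
                row[3 * x + c] = clamp_u8(row[3 * x + c] + offset[c]);
    }
}
```

Since each channel's offset is either positive or negative, a saturating add of the positive parts followed by a saturating subtract of the negative magnitudes gives the same result, which is why two separate calls can stand in for it.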

I also looked for lookup table functions, but I'm only finding complex ones, nothing that would simply map like with a palette (there's a palette function, but only for monochrome sources). And anyway, I don't think it'd be worth using a lookup table to do simple blending.

Last option would be to fill a line, then use alpha compositing with it as a source. But I'm wondering here: would the IPP functions like a srcStep parameter set to zero (meaning, repeating the same line over & over)?
Edit: they actually work with srcStep set to zero, however AlphaComp isn't very fast.
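
A plain C sketch of why a zero source step works when rows are addressed by pointer stepping. This illustrates the addressing only, not IPP's internals, and it uses a saturating add instead of alpha compositing to stay short; `add_rows_sat_u8` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of IPP.
   Saturating add of a source image onto a destination image, walking
   both with byte steps. A src_step of 0 makes every destination row
   read the same source line, so one prepared line covers the whole
   rectangle. */
static void add_rows_sat_u8(const uint8_t *src, ptrdiff_t src_step,
                            uint8_t *dst, ptrdiff_t dst_step,
                            int row_bytes, int height)
{
    assert(src != NULL && dst != NULL);
    for (int y = 0; y < height; ++y) {
        const uint8_t *s = src + (ptrdiff_t)y * src_step; /* stays put if step is 0 */
        uint8_t *d = dst + (ptrdiff_t)y * dst_step;
        for (int i = 0; i < row_bytes; ++i) {
            unsigned v = (unsigned)d[i] + s[i];
            d[i] = (uint8_t)(v > 255u ? 255u : v);
        }
    }
}
```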

I have my own MMX functions for the above already, but only for 8bit RGBA, as it's much simpler. I need them for 8bit RGB, and it's pretty complex to do optimized ones. I guess it'd have to process in short lines of 48 components (= 3*16 = 8*6 if MMX = 16*3 if SSE2) to be the most efficient, but aligning all this properly.. that sounds painful & boring.
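
A rough sketch of the 48-component idea for the offset case, using SSE2 intrinsics when the compiler defines `__SSE2__` and a scalar loop otherwise. The function name `offset_row_rgb8` and the split into `pos`/`neg` arrays are my own assumptions. It uses unaligned loads and stores to sidestep the alignment work, which is simpler but not necessarily the fastest option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not part of IPP.
   Adds pos[] then subtracts neg[] (per channel, saturating) on one row
   of 8-bit RGB pixels. 16 pixels = 48 bytes = three 16-byte vectors,
   after which the R,G,B phase of the constants repeats. */
static void offset_row_rgb8(uint8_t *row, int width,
                            const uint8_t pos[3], const uint8_t neg[3])
{
    assert(row != NULL && width >= 0);
    uint8_t pat_pos[48], pat_neg[48];
    for (int k = 0; k < 48; ++k) {
        pat_pos[k] = pos[k % 3];
        pat_neg[k] = neg[k % 3];
    }
    size_t n = (size_t)width * 3, i = 0;
#if defined(__SSE2__)
    __m128i vp[3], vn[3];
    for (int k = 0; k < 3; ++k) {
        vp[k] = _mm_loadu_si128((const __m128i *)(pat_pos + 16 * k));
        vn[k] = _mm_loadu_si128((const __m128i *)(pat_neg + 16 * k));
    }
    for (; i + 48 <= n; i += 48) {
        for (int k = 0; k < 3; ++k) {
            __m128i *p = (__m128i *)(row + i + 16 * k);
            __m128i v = _mm_loadu_si128(p);           /* unaligned load */
            v = _mm_subs_epu8(_mm_adds_epu8(v, vp[k]), vn[k]);
            _mm_storeu_si128(p, v);                   /* unaligned store */
        }
    }
#endif
    /* Scalar tail (and full fallback without SSE2). */
    for (; i < n; ++i) {
        int v = row[i] + pat_pos[i % 48];
        if (v > 255) v = 255;
        v -= pat_neg[i % 48];
        row[i] = (uint8_t)(v < 0 ? 0 : v);
    }
}
```

For an offset like -100 on red, +3 on green, -5 on blue, that would be `pos = {0, 3, 0}` and `neg = {100, 0, 5}`.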

Also, are the functions ippiAddC_8u_C4IRSfs & ippiMulC_8u_C3IRSfs using SSE2's 16-byte packed operations when available? The ScaleFactor in most of those functions would hint that 8bit versions may be done in 16bit accuracy?

Edit: to answer part of this, it looks like those functions are faster when ScaleFactor is zero. But I traced the code and couldn't find out if they were using MMX or SSE (saw nothing of those, but maybe I traced wrong).
But they look faster than my functions, so I guess I can call ippiAddC & then ippiSubC.

Vladimir_Dudnik · 02-09-2009

Hello,

Did you notice the alpha composition functions in IPP?

/* /////////////////////////////////////////////////////////////////////////////
// Alpha Compositing Operations
///////////////////////////////////////////////////////////////////////////// */
/*
// Contents:
// ippiAlphaPremul_8u_AC4R, ippiAlphaPremul_16u_AC4R
// ippiAlphaPremul_8u_AC4IR, ippiAlphaPremul_16u_AC4IR
// ippiAlphaPremul_8u_AP4R, ippiAlphaPremul_16u_AP4R
// ippiAlphaPremul_8u_AP4IR, ippiAlphaPremul_16u_AP4IR
// Pre-multiplies pixel values of an image by its alpha values.

// ippiAlphaPremulC_8u_AC4R, ippiAlphaPremulC_16u_AC4R
// ippiAlphaPremulC_8u_AC4IR, ippiAlphaPremulC_16u_AC4IR
// ippiAlphaPremulC_8u_AP4R, ippiAlphaPremulC_16u_AP4R
// ippiAlphaPremulC_8u_AP4IR, ippiAlphaPremulC_16u_AP4IR
// ippiAlphaPremulC_8u_C4R, ippiAlphaPremulC_16u_C4R
// ippiAlphaPremulC_8u_C4IR, ippiAlphaPremulC_16u_C4IR
// ippiAlphaPremulC_8u_C3R, ippiAlphaPremulC_16u_C3R
// ippiAlphaPremulC_8u_C3IR, ippiAlphaPremulC_16u_C3IR
// ippiAlphaPremulC_8u_C1R, ippiAlphaPremulC_16u_C1R
// ippiAlphaPremulC_8u_C1IR, ippiAlphaPremulC_16u_C1IR
// Pre-multiplies pixel values of an image by constant alpha values.
//
// ippiAlphaComp_8u_AC4R, ippiAlphaComp_16u_AC4R
// ippiAlphaComp_8u_AC1R, ippiAlphaComp_16u_AC1R
// Combines two images using alpha values of both images
//
// ippiAlphaCompC_8u_AC4R, ippiAlphaCompC_16u_AC4R
// ippiAlphaCompC_8u_AP4R, ippiAlphaCompC_16u_AP4R
// ippiAlphaCompC_8u_C4R, ippiAlphaCompC_16u_C4R
// ippiAlphaCompC_8u_C3R, ippiAlphaCompC_16u_C3R
// ippiAlphaCompC_8u_C1R, ippiAlphaCompC_16u_C1R
// Combines two images using constant alpha values
//
// Types of compositing operation (alphaType)
//   OVER   ippAlphaOver   ippAlphaOverPremul
//   IN     ippAlphaIn     ippAlphaInPremul
//   OUT    ippAlphaOut    ippAlphaOutPremul
//   ATOP   ippAlphaATop   ippAlphaATopPremul
//   XOR    ippAlphaXor    ippAlphaXorPremul
//   PLUS   ippAlphaPlus   ippAlphaPlusPremul
//
//   Type   result pixel               result pixel (Premul)   result alpha
//   OVER   aA*A+(1-aA)*aB*B           A+(1-aA)*B              aA+(1-aA)*aB
//   IN     aA*A*aB                    A*aB                    aA*aB
//   OUT    aA*A*(1-aB)                A*(1-aB)                aA*(1-aB)
//   ATOP   aA*A*aB+(1-aA)*aB*B        A*aB+(1-aA)*B           aA*aB+(1-aA)*aB
//   XOR    aA*A*(1-aB)+(1-aA)*aB*B    A*(1-aB)+(1-aA)*B       aA*(1-aB)+(1-aA)*aB
//   PLUS   aA*A+aB*B                  A+B                     aA+aB
// Here 1 corresponds significance VAL_MAX, multiplication is performed
// with scaling
// X * Y => (X * Y) / VAL_MAX
// and VAL_MAX is the maximum presentable pixel value:
// VAL_MAX == IPP_MAX_8U for 8u
// VAL_MAX == IPP_MAX_16U for 16u
*/
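
As an illustration of the OVER row in the table above, a minimal plain C sketch for one non-premultiplied 8u pixel, with VAL_MAX = 255 and every product scaled as (X * Y) / VAL_MAX. The helper names are hypothetical, and the truncating division is an assumption: the header does not say how IPP rounds.

```c
#include <assert.h>
#include <stdint.h>

#define VAL_MAX 255u

/* Scaled multiply from the header: X * Y => (X * Y) / VAL_MAX.
   Truncating division is an assumption of this sketch. */
static unsigned mul_scaled(unsigned x, unsigned y)
{
    return (x * y) / VAL_MAX;
}

/* Hypothetical helper, not part of IPP.
   OVER for one RGBA pixel (alpha in channel 4), non-premultiplied:
       pixel = aA*A + (1-aA)*aB*B
       alpha = aA + (1-aA)*aB                                     */
static void alpha_over_px(const uint8_t a[4], const uint8_t b[4],
                          uint8_t out[4])
{
    assert(a != 0 && b != 0 && out != 0);
    unsigned aA = a[3], aB = b[3];
    unsigned w = mul_scaled(VAL_MAX - aA, aB);   /* (1-aA)*aB */
    for (int c = 0; c < 3; ++c)
        out[c] = (uint8_t)(mul_scaled(aA, a[c]) + mul_scaled(w, b[c]));
    out[3] = (uint8_t)(aA + w);
}
```

The sum never exceeds 255, because the two weights `aA` and `w` add up to at most VAL_MAX.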

/* /////////////////////////////////////////////////////////////////////////////
// Name:     ippiAlphaPremul_8u_AC4R,  ippiAlphaPremul_16u_AC4R
//           ippiAlphaPremul_8u_AC4IR, ippiAlphaPremul_16u_AC4IR
//           ippiAlphaPremul_8u_AP4R,  ippiAlphaPremul_16u_AP4R
//           ippiAlphaPremul_8u_AP4IR, ippiAlphaPremul_16u_AP4IR
//
// Purpose:  Pre-multiplies pixel values of an image by its alpha values
//           for 4-channel images
//           For channels 1-3
//             dst_pixel = (src_pixel * src_alpha) / VAL_MAX
//           For alpha-channel (channel 4)
//             dst_alpha = src_alpha
// Parameters:
//   pSrc        Pointer to the source image for pixel-order data,
//               array of pointers to separate source color planes for planar data
//   srcStep     Step through the source image
//   pDst        Pointer to the destination image for pixel-order data,
//               array of pointers to separate destination color planes for planar data
//   dstStep     Step through the destination image
//   pSrcDst     Pointer to the source/destination image, or array of pointers
//               to separate source/destination color planes for in-place functions
//   srcDstStep  Step through the source/destination image for in-place functions
//   roiSize     Size of the source and destination ROI
// Returns:
//   ippStsNoErr       No errors
//   ippStsNullPtrErr  pSrc == NULL, or pDst == NULL, or pSrcDst == NULL
//   ippStsSizeErr     The roiSize has a field with negative or zero value
*/
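
The premultiply formula from this comment block, written out as a small plain C sketch for one row of pixel-order 8u data with alpha in channel 4. `premul_row_rgba8` is a hypothetical name, and truncating division is assumed since the comment does not specify rounding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of IPP.
   In-place premultiply of one row of RGBA pixels:
       channels 1-3: dst_pixel = (src_pixel * src_alpha) / VAL_MAX
       channel 4:    dst_alpha = src_alpha                          */
static void premul_row_rgba8(uint8_t *row, int width)
{
    assert(row != 0 && width >= 0);
    for (int x = 0; x < width; ++x) {
        uint8_t *px = row + 4 * x;
        unsigned alpha = px[3];
        for (int c = 0; c < 3; ++c)
            px[c] = (uint8_t)((px[c] * alpha) / 255u);
        /* px[3] is left untouched. */
    }
}
```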

Regards,
Vladimir

gol · 02-10-2009

Yes, but as I wrote, those are slower than the other functions if you only want to fill with a color with blending.

Vladimir_Dudnik · 02-11-2009

Then you are correct, we do not have that specific functionality. As usual, please feel free to submit your feature request to Intel Premier Support, where it will be reviewed at the planning stage for the next IPP versions.

Regards,
Vladimir