Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Adventures with ippiFilterWiener (1)

Adriaan_van_Os
New Contributor I
472 Views
Having implemented ippiFilterWiener on 8u and 32f (see the next post), I turned to 16u. There is an ippiFilterWiener_16s but not an ippiFilterWiener_16u. Well, I thought, then we simply convert from 16u to 16s, and back again after the Wiener filter.

I found ippiConvert_16s16u_C1Rs and ippiConvert_16u16s_C1RSfs. These are rather mysterious functions. The latter has a scaleFactor, but no combination of these functions (I think I tried them all) converts between 16u and 16s without (severe) data loss. So I looked at the Vector Initialization Functions and Essential Functions chapters for vector primitives that convert between 16u and 16s. No luck. Then I looked in the Vector Mathematical Functions chapter of the Intel® Math Kernel Library. No luck either.

The saving idea then was to call ippiLUTPalette_16u with a LUT that consists of

1. 32768 16-bit words equal to their index OR-ed with 0x8000, followed by

2. 32768 16-bit words equal to their index with 0x8000 cleared.

That finally worked, except that ippiLUTPalette_16u doesn't have an inline variant. Note that the 16u16s and 16s16u conversions have identical LUTs.

The question remains what ippiConvert_16s16u and ippiConvert_16u16s really do, and why.
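The LUT described above amounts to flipping bit 15 of every sample, which is the same as shifting the value range by 32768. A minimal sketch in plain C (no IPP calls; the function names are made up for illustration) of both views of that mapping:

```c
#include <stddef.h>
#include <stdint.h>

/* Map unsigned 16-bit to signed 16-bit by shifting the range down by 32768:
   0 -> -32768, 32768 -> 0, 65535 -> 32767. */
int16_t u16_to_s16(uint16_t v)
{
    return (int16_t)((int32_t)v - 32768);
}

/* Inverse mapping: -32768 -> 0, 0 -> 32768, 32767 -> 65535. */
uint16_t s16_to_u16(int16_t v)
{
    return (uint16_t)((int32_t)v + 32768);
}

/* Build the 65536-entry table from the post: every entry is its own index
   with bit 15 flipped (set in the lower half, cleared in the upper half). */
void build_flip_lut(uint16_t *lut)
{
    for (uint32_t i = 0; i < 65536u; ++i)
        lut[i] = (uint16_t)(i ^ 0x8000u);
}
```

Because flipping a bit twice restores it, the same table serves both directions, which matches the observation that the 16u16s and 16s16u LUTs are identical.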
0 Kudos
7 Replies
Igor_A_Intel
Employee
472 Views

Hi Adriaan van Os,

Both conversions 16u->16s and 16s->16u work according to the basic IPP concepts that are described in the signal processing manual (volume #1, chapter #2 Intel® Integrated Performance Primitives Concepts). I understand what you want - you want to perform the full map of 16u data type to 16s and then (after Wiener) back, so that 65535->32767, 32768->0, 0->-32768... There is no such operation in current IPP versions. The new functions that make such conversions possible will be available in IPP 2017 (available on the web in the autumn 2017):

// Purpose:       Converts data with scaling by formula: dst = src*mVal + aVal
IPPAPI(IppStatus, ippiScaleC_16u16s_C1R, ( const Ipp16u* pSrc, int srcStep, Ipp64f mVal, Ipp64f aVal, Ipp16s* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI(IppStatus, ippiScaleC_16s16u_C1R, ( const Ipp16s* pSrc, int srcStep, Ipp64f mVal, Ipp64f aVal, Ipp16u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
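
Going only by the formula in the comment above, the full-range 16u to 16s mapping would presumably correspond to mVal = 1 and aVal = -32768. The following plain C reference for one row is a sketch of that formula, not the IPP implementation; the function name is hypothetical, and the rounding (half away from zero) and saturation are assumptions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference for dst = src*mVal + aVal on one row, 16u -> 16s.
   Saturation and round-half-away-from-zero are assumptions, not documented
   behavior of the IPP functions declared above. */
void scale_c_16u16s_row(const uint16_t *src, double mVal, double aVal,
                        int16_t *dst, size_t width)
{
    for (size_t x = 0; x < width; ++x) {
        double v = (double)src[x] * mVal + aVal;
        if (v < -32768.0) v = -32768.0;   /* saturate to the 16s range */
        if (v > 32767.0)  v = 32767.0;
        dst[x] = (int16_t)(v >= 0.0 ? v + 0.5 : v - 0.5);
    }
}
```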

I think that using a LUT for your purposes is not a good idea, as it's too slow a function. I think it is significantly better/faster to convert to 32f, run Wiener in 32f, and then convert back to 16u. Of course it's better to do this by slices, with dst_slice_size + src_slice_size + tmp(for conversion)_slice_size less than the L0 or L2 cache size, than to drive the entire image through the memory bus several times. The internal calculations in FilterWiener are performed in 32f, so for 16s input there is an internal conversion to 32f anyway.
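
The slice-wise scheme suggested above can be sketched in plain C as follows. This is an illustration only: `slice_fn` and `halve_slice` are hypothetical stand-ins for the float processing step, and a real neighbourhood filter such as Wiener would additionally need overlapping border rows between slices, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slice processing step working in place on float data. */
typedef void (*slice_fn)(float *data, size_t count);

/* Placeholder for the real float filter: halves every sample. */
void halve_slice(float *data, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        data[i] *= 0.5f;
}

/* Process a 16u buffer slice by slice through a small float scratch buffer,
   so that the working set can stay cache-sized.
   tmp must hold at least slice_len floats. */
void process_16u_in_slices(const uint16_t *src, uint16_t *dst, size_t count,
                           float *tmp, size_t slice_len, slice_fn fn)
{
    assert(slice_len > 0);
    for (size_t start = 0; start < count; start += slice_len) {
        size_t n = count - start < slice_len ? count - start : slice_len;
        for (size_t i = 0; i < n; ++i)       /* 16u -> 32f */
            tmp[i] = (float)src[start + i];
        fn(tmp, n);                          /* processing in 32f */
        for (size_t i = 0; i < n; ++i) {     /* 32f -> 16u, rounded and saturated */
            float v = tmp[i] + 0.5f;
            if (v < 0.0f)     v = 0.0f;
            if (v > 65535.0f) v = 65535.0f;
            dst[start + i] = (uint16_t)v;
        }
    }
}
```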

Regards, Igor

0 Kudos
Adriaan_van_Os
New Contributor I
472 Views
Thanks for your reply. It is certainly useful to know whether a function executes floating-point internally. For other filters this may or may not be the case, see e.g. my comment on the IPP median filter.

With regard to converting to and from floating-point, availability of conversion functions is crucial. For example, I wrote a tiled and vectorized function that does the following in one step:

1. Get u16 or u8 (or f32) data for one channel out of a C4R or AC4R color image

2. Convert it to floating-point

3. Scale it down by a factor of 65535 for u16 or 255 for u8

With IPP, these are three separate function calls, requiring extra buffers, namely for the conversion from C4R or AC4R to C1R.

Regards, Adriaan van Os
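
The three steps listed above can be fused into a single pass over the image. The following is a scalar sketch for the 16u case only, not the tiled and vectorized function mentioned in the post; the function name and the byte-based row steps are assumptions chosen for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extract one channel from an interleaved 4-channel 16u image, convert it to
   float and scale it to [0, 1], all in one pass. Row steps are in bytes. */
void extract_channel_16u_unit(const uint16_t *src, size_t src_step,
                              float *dst, size_t dst_step,
                              size_t width, size_t height, unsigned channel)
{
    assert(channel < 4);
    for (size_t y = 0; y < height; ++y) {
        const uint16_t *s =
            (const uint16_t *)((const unsigned char *)src + y * src_step);
        float *d = (float *)((unsigned char *)dst + y * dst_step);
        for (size_t x = 0; x < width; ++x)
            d[x] = (float)s[4 * x + channel] / 65535.0f;
    }
}
```

No intermediate single-channel 16u buffer is needed, which is the saving the post refers to.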
0 Kudos
Igor_A_Intel
Employee
472 Views

hi,

there is a special set of functions in IPP for this purpose:

/* /////////////////////////////////////////////////////////////////////////////////
//  Name:       ippiScale
//
//  Purpose:   Scales pixel values of an image and converts them to another bit depth
//              dst = a + b * src;
//              a = type_min_dst - b * type_min_src;
//              b = (type_max_dst - type_min_dst) / (type_max_src - type_min_src).
//
//  Returns:
//    ippStsNullPtrErr      One of the pointers is NULL
//    ippStsSizeErr         roiSize has a field with zero or negative value
//    ippStsStepErr         One of the step values is less than or equal to zero
//    ippStsScaleRangeErr   Input data bounds are incorrect (vMax - vMin <= 0)
//    ippStsNoErr           OK
//
//  Parameters:
//    pSrc            Pointer  to the source image
//    srcStep         Step through the source image
//    pDst            Pointer to the  destination image
//    dstStep         Step through the destination image
//    roiSize         Size of the ROI
//    vMin, vMax      Minimum and maximum values of the input data (32f).
//    hint            Option to select the algorithmic implementation:
//                        1). hint == ippAlgHintAccurate
//                                  - accuracy e-8, but slowly;
//                        2). hint == ippAlgHintFast,
//                                 or ippAlgHintNone
//                                  - accuracy e-3, but quickly.
*/
IPPAPI ( IppStatus, ippiScale_8u16u_C1R, (const Ipp8u* pSrc, int srcStep, Ipp16u* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16u_C3R, (const Ipp8u* pSrc, int srcStep, Ipp16u* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16u_C4R, (const Ipp8u* pSrc, int srcStep, Ipp16u* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16u_AC4R,(const Ipp8u* pSrc, int srcStep, Ipp16u* pDst, int dstStep, IppiSize roiSize ))

IPPAPI ( IppStatus, ippiScale_8u16s_C1R, (const Ipp8u* pSrc, int srcStep, Ipp16s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16s_C3R, (const Ipp8u* pSrc, int srcStep, Ipp16s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16s_C4R, (const Ipp8u* pSrc, int srcStep, Ipp16s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u16s_AC4R,(const Ipp8u* pSrc, int srcStep, Ipp16s* pDst, int dstStep, IppiSize roiSize ))

IPPAPI ( IppStatus, ippiScale_8u32s_C1R, (const Ipp8u* pSrc, int srcStep, Ipp32s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u32s_C3R, (const Ipp8u* pSrc, int srcStep, Ipp32s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u32s_C4R, (const Ipp8u* pSrc, int srcStep, Ipp32s* pDst, int dstStep, IppiSize roiSize ))
IPPAPI ( IppStatus, ippiScale_8u32s_AC4R,(const Ipp8u* pSrc, int srcStep, Ipp32s* pDst, int dstStep, IppiSize roiSize ))

IPPAPI ( IppStatus, ippiScale_8u32f_C1R, (const Ipp8u* pSrc, int srcStep, Ipp32f* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_8u32f_C3R, (const Ipp8u* pSrc, int srcStep, Ipp32f* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_8u32f_C4R, (const Ipp8u* pSrc, int srcStep, Ipp32f* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_8u32f_AC4R,(const Ipp8u* pSrc, int srcStep, Ipp32f* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))

IPPAPI ( IppStatus, ippiScale_16u8u_C1R, (const Ipp16u* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16u8u_C3R, (const Ipp16u* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16u8u_C4R, (const Ipp16u* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16u8u_AC4R,(const Ipp16u* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))

IPPAPI ( IppStatus, ippiScale_16s8u_C1R, (const Ipp16s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16s8u_C3R, (const Ipp16s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16s8u_C4R, (const Ipp16s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_16s8u_AC4R,(const Ipp16s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))

IPPAPI ( IppStatus, ippiScale_32s8u_C1R, (const Ipp32s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_32s8u_C3R, (const Ipp32s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_32s8u_C4R, (const Ipp32s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))
IPPAPI ( IppStatus, ippiScale_32s8u_AC4R,(const Ipp32s* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, IppHintAlgorithm hint ))

IPPAPI ( IppStatus, ippiScale_32f8u_C1R, (const Ipp32f* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_32f8u_C3R, (const Ipp32f* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_32f8u_C4R, (const Ipp32f* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
IPPAPI ( IppStatus, ippiScale_32f8u_AC4R,(const Ipp32f* pSrc, int srcStep, Ipp8u* pDst, int dstStep, IppiSize roiSize, Ipp32f vMin, Ipp32f vMax ))
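
To make the formula in the header comment concrete, here is a plain C sketch of the 8u to 32f case for one row, where type_min_src = 0, type_max_src = 255 and the destination range is [vMin, vMax]. It illustrates the arithmetic only and is not the IPP implementation; the function name is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Scale one row of 8u samples into [vMin, vMax] using dst = a + b*src,
   with type_min_src = 0 and type_max_src = 255. */
void scale_row_8u_to_32f(const uint8_t *src, float *dst, size_t width,
                         float vMin, float vMax)
{
    const float b = (vMax - vMin) / 255.0f;  /* dst range / src range */
    const float a = vMin - b * 0.0f;         /* type_min_dst - b * type_min_src */
    for (size_t x = 0; x < width; ++x)
        dst[x] = a + b * (float)src[x];
}
```

With vMin = 0 and vMax = 1 this reduces to dividing by 255.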

And I'm curious what the purpose is of converting from AC4R or C4R to C1R and back: for an operation such as scaling or conversion you can always address an image with any number of channels as C1R, just by multiplying its width by the number of channels.
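
A small plain C sketch of that width trick, assuming an interleaved 4-channel layout and row steps given in bytes (the function names are made up for illustration, they are not IPP functions):

```c
#include <stddef.h>
#include <stdint.h>

/* Pointwise 8u -> 32f conversion of a single-channel image.
   Row steps are in bytes. */
void convert_8u32f_c1(const uint8_t *src, size_t src_step,
                      float *dst, size_t dst_step,
                      size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y) {
        const uint8_t *s = src + y * src_step;
        float *d = (float *)((unsigned char *)dst + y * dst_step);
        for (size_t x = 0; x < width; ++x)
            d[x] = (float)s[x];
    }
}

/* The same routine handles an interleaved 4-channel image when the width is
   multiplied by the channel count; the row steps stay unchanged. */
void convert_8u32f_c4(const uint8_t *src, size_t src_step,
                      float *dst, size_t dst_step,
                      size_t width, size_t height)
{
    convert_8u32f_c1(src, src_step, dst, dst_step, width * 4, height);
}
```

This works because each output sample depends only on the input sample at the same position.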

regards, Igor

0 Kudos
Adriaan_van_Os
New Contributor I
472 Views

Thanks, Igor, for bringing ippiScale to my attention. Yes, for 8u this is a possibility, but I don't see a 16u32f variant.

With regard to the conversion from AC4R or C4R to C1R, there are several reasons:

1. One doesn't want to process the alpha channel (it just costs extra time, for example with a filter like ippiFilterBilateral that is inherently slow and only exists as C1R and C3R)

2. The user can manipulate RGB channels separately in the software

3. RGB is first converted to luminance and chroma channels, and then those luminance and chroma channels are treated differently. Human perception of luminance and of chroma is quite different, so this is an essential aspect of image filtering. The same could be true for the conversion to HSV, HLS, Lab, etcetera

4. There are several steps involved in manipulating the image and for locality one wants to do that channel by channel.

Regards,

Adriaan van Os

0 Kudos
Adriaan_van_Os
New Contributor I
472 Views

5. One more reason (albeit a bit exotic). In astronomy, often a stack of images is aligned to improve the signal-to-noise ratio. And the RGB channels can be quite different (not counting the case where channels are actually different shots with different physical filters). This is so because of (1) the different wavelengths of Red, Green and Blue and consequently the different sizes of their Airy disks and (2) the different diffraction of Red, Green and Blue channels in the atmosphere, depending on the specifics of the atmospheric conditions under which the images were taken. In short, you want to choose one specific channel to do the alignment of the stack of images on. So, you need either a Stride parameter for your vector operations or a conversion from (A)C4R to C1R.

Regards,

Adriaan van Os

0 Kudos
Adriaan_van_Os
New Contributor I
472 Views

6. And for filters that operate on a neighbourhood (like ippiFilterBilateralBorder) one cannot use the "multiply the width by the number of channels" trick. The fact that the trick works for the conversion as such is irrelevant, since the result of the conversion must be C1R (or a usable Stride parameter).
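
A toy example in plain C of why the trick fails for neighbourhood filters. The 3-tap horizontal mean below is a stand-in chosen for brevity (it is not the bilateral filter), and the `stride` argument plays the role of the Stride parameter mentioned above:

```c
#include <stddef.h>
#include <stdint.h>

/* 3-tap horizontal mean over samples spaced `stride` apart.
   Samples without both neighbours are copied unchanged.
   stride = 1: the row is treated as single-channel (the width trick).
   stride = 4: neighbours are taken from the same channel of the
               adjacent pixels in an interleaved 4-channel row. */
void mean3_row(const uint8_t *src, uint8_t *dst, size_t count, size_t stride)
{
    for (size_t i = 0; i < count; ++i) {
        if (i < stride || i + stride >= count)
            dst[i] = src[i];
        else
            dst[i] = (uint8_t)((src[i - stride] + src[i] + src[i + stride]) / 3);
    }
}
```

For a row of three identical RGBA pixels (100, 0, 0, 255), stride = 4 leaves the red sample of the middle pixel at 100, while stride = 1 averages it with the neighbouring alpha and green samples and yields 118.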

Regards,

Adriaan van Os

0 Kudos
Igor_A_Intel
Employee
472 Views

Agreed, in your case it doesn't work, but for some operations treating a CXR image as C1R with width*X works well.

regards, Igor

0 Kudos