The simplest workaround is to use the ippSP (signal processing) functions, calling one of them for each image row:
IPPAPI(IppStatus, ippsAdd_32s_ISfs, (const Ipp32s* pSrc, Ipp32s* pSrcDst, int len, int scaleFactor))
IPPAPI(IppStatus, ippsAdd_32s_Sfs, (const Ipp32s* pSrc1, const Ipp32s* pSrc2, Ipp32s* pDst, int len, int scaleFactor))
IPPAPI(IppStatus, ippsAdd_32sc_ISfs, (const Ipp32sc* pSrc, Ipp32sc* pSrcDst, int len, int scaleFactor))
IPPAPI(IppStatus, ippsAdd_32sc_Sfs, (const Ipp32sc* pSrc1, const Ipp32sc* pSrc2, Ipp32sc* pDst, int len, int scaleFactor))
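As a rough illustration of the row-by-row idea, here is a minimal sketch in plain C. The names `add_row_32s` and `add_image_32s_c1` are hypothetical and not part of IPP. `add_row_32s` is only a portable stand-in for a call such as ippsAdd_32s_Sfs(row1, row2, rowDst, width, 0); it assumes that a scale factor of 0 amounts to a saturating add. The steps are assumed to be in bytes, as is usual for IPP image buffers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ippsAdd_32s_Sfs(src1, src2, dst, len, 0):
   element-wise add, saturated to the 32-bit signed range (assumption). */
static void add_row_32s(const int32_t *src1, const int32_t *src2,
                        int32_t *dst, int len)
{
    for (int i = 0; i < len; ++i) {
        int64_t sum = (int64_t)src1[i] + (int64_t)src2[i];
        if (sum > INT32_MAX) sum = INT32_MAX;
        if (sum < INT32_MIN) sum = INT32_MIN;
        dst[i] = (int32_t)sum;
    }
}

/* Add two single-channel 32s images one row at a time.
   Each step is the distance in bytes between consecutive rows. */
static void add_image_32s_c1(const int32_t *src1, int src1Step,
                             const int32_t *src2, int src2Step,
                             int32_t *dst, int dstStep,
                             int width, int height)
{
    for (int y = 0; y < height; ++y) {
        const int32_t *row1 =
            (const int32_t *)((const uint8_t *)src1 + (size_t)y * (size_t)src1Step);
        const int32_t *row2 =
            (const int32_t *)((const uint8_t *)src2 + (size_t)y * (size_t)src2Step);
        int32_t *rowDst =
            (int32_t *)((uint8_t *)dst + (size_t)y * (size_t)dstStep);

        /* With IPP available, this is where the ippsAdd call would go. */
        add_row_32s(row1, row2, rowDst, width);
    }
}
```

Only the first `width` elements of each row are touched, so any padding at the end of a row is left alone. If the images have no row padding at all (step equals width * sizeof(Ipp32s)), the whole image could instead be handled by a single call with len = width * height.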
Thanks for the information. Yes, I have tried ippsAdd, but it is much slower than using the SIMD instructions directly.
Is there a specific reason why Intel does not provide ippiAdd for 32s? It provides a function to allocate the memory, but there are very few basic operations that support Ipp32s images.
You may submit a feature request to the Intel Online Service Center:
OSC - https://supporttickets.intel.com/?lang=en-US
Here is how to get access: https://software.intel.com/en-us/articles/how-to-create-a-support-reques...