Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Bug report: ippiLUTPalette_8u_C1R

Alparslan_Y_
Beginner

Hi, I have noticed a bug in the ippiLUTPalette_8u_C1R function while testing images with odd dimensions. I am using IPP v2018.3.210 on 64-bit Windows.

I have (w, h) = (1575, 1049), srcStep = 1664, and dstStep = 1576; s_lut[256] is a uint8_t array defined at global scope.

Now calling

ippiLUTPalette_8u_C1R(srcData, srcStep, dstData, dstStep, IppiSize{w, h}, s_lut, 8);

results in wrong values everywhere except the first line of the image. It seems srcStep is larger than what this function can handle. On a side note, ippiCopy_8u_C1R works without any problem on the same data.

After I noticed that only the first row of the destination image is correct, I applied a simple workaround. Calling

for(int j = 0; j < h; j++)
    ippiLUTPalette_8u_C1R(srcData + j*srcStep, w, dstData + j*dstStep, w, IppiSize{w, 1}, s_lut, 8);

works without any problems, and the resulting values are as expected. Here I process the image row by row, passing the width as the step. I think the step value is the problem.

On another note, I checked the correctness of this function against a simple C implementation:

uint8_t *srcRow = srcData;
uint8_t *dstRow = dstData;

for(int j = 0; j < h; j++)
{
    for(int i = 0; i < w; i++)
    {
        dstRow[i] = s_lut[srcRow[i]];
    }
                        
    srcRow += srcStep;
    dstRow += dstStep;
}
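
For readers who want to try the reference loop in isolation, here is a minimal sketch that wraps it in a function and checks it on a tiny image whose source and destination steps differ. The names lutReference8u and lutReferenceSelfCheck, the 5x3 image size, and the inverting table are assumptions made for illustration; none of them come from IPP or from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an IPP function): applies a 256-entry lookup table
// to a w x h region of 8-bit pixels. Steps are byte distances between the
// starts of consecutive rows and may differ for source and destination.
void lutReference8u(const std::uint8_t* src, std::size_t srcStep,
                    std::uint8_t* dst, std::size_t dstStep,
                    std::size_t w, std::size_t h,
                    const std::uint8_t* lut)
{
    assert(srcStep >= w && dstStep >= w);
    for (std::size_t j = 0; j < h; ++j) {
        const std::uint8_t* srcRow = src + j * srcStep;
        std::uint8_t* dstRow = dst + j * dstStep;
        for (std::size_t i = 0; i < w; ++i)
            dstRow[i] = lut[srcRow[i]];
    }
}

// Hypothetical self-check on a small image with srcStep != dstStep.
bool lutReferenceSelfCheck()
{
    const std::size_t w = 5, h = 3, srcStep = 8, dstStep = 6;

    // Assumed table for the check: inverts every value.
    std::vector<std::uint8_t> lut(256);
    for (int v = 0; v < 256; ++v)
        lut[v] = static_cast<std::uint8_t>(255 - v);

    std::vector<std::uint8_t> src(srcStep * h, 0);
    std::vector<std::uint8_t> dst(dstStep * h, 0xAA);  // 0xAA marks untouched bytes
    for (std::size_t j = 0; j < h; ++j)
        for (std::size_t i = 0; i < w; ++i)
            src[j * srcStep + i] = static_cast<std::uint8_t>(j * w + i);

    lutReference8u(src.data(), srcStep, dst.data(), dstStep, w, h, lut.data());

    for (std::size_t j = 0; j < h; ++j) {
        // Pixels inside the region must hold the looked-up value.
        for (std::size_t i = 0; i < w; ++i)
            if (dst[j * dstStep + i] != lut[src[j * srcStep + i]])
                return false;
        // Row padding in the destination must not be written.
        for (std::size_t i = w; i < dstStep; ++i)
            if (dst[j * dstStep + i] != 0xAA)
                return false;
    }
    return true;
}
```

Calling lutReferenceSelfCheck() from a test or from main gives a baseline that any other implementation can be compared against with the same buffers.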

Runtimes on Intel i7-8700K:

ippiLUTPalette_8u_C1R -> 980 us

Simple C loop byte-by-byte lookup -> 1070 us

Is this function really not optimized? Or, less likely, are there still some invalid operations even when it is applied row by row that make it this slow? About 1 ms for a single pass over the image is really too much: ippiCopy_8u_C1R takes 88 us on exactly the same data (more than 11x faster).
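
The post does not say how these runtimes were measured. As one possible approach, the sketch below times a callable several times with std::chrono::steady_clock and reports the median in microseconds, which tends to be less sensitive to one-off stalls than a single run. The helper name medianMicroseconds and the repeat count are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: runs fn `repeats` times and returns the median
// wall-clock duration of one call, in microseconds.
template <typename Fn>
double medianMicroseconds(Fn&& fn, int repeats)
{
    assert(repeats > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(repeats));

    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    }

    // Partial sort so that the middle element is the median sample.
    const auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

A call might look like medianMicroseconds([&] { /* run the LUT once */ }, 101), with the lambda body replaced by whichever variant is being compared.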

5 Replies
Jonghak_K_Intel
Employee

Hi,

Have you checked the status returned by the function?

Could you try this and see what it reports?

 status = ippiLUTPalette_8u32u_C1R( pSrc, srcStep, pDst, dstStep, roiSize, pTable, nBitSize);

Please refer to this example page for usage:

https://software.intel.com/en-us/ipp-dev-reference-lutpalette-lutpaletteswap

Also, could you upload a reproducer so that we can check this locally?

Thank you

Pavel_B_Intel1
Employee

Hi

A quick analysis shows that the function should be optimized. We will investigate this more deeply and get back to you with an answer in about 2 weeks.

Pavel

Alparslan_Y_
Beginner

JON J K. (Intel) wrote:

Hi,

Have you checked the status returned by the function?

Could you try this and see what it reports?

 status = ippiLUTPalette_8u32u_C1R( pSrc, srcStep, pDst, dstStep, roiSize, pTable, nBitSize);

Please refer to this example page for usage:

https://software.intel.com/en-us/ipp-dev-reference-lutpalette-lutpalette...

Also, could you upload a reproducer so that we can check this locally?

Thank you


Yes, that was the first thing I checked, and the function returns ippStsNoErr.

Andrey_B_Intel
Employee

Hi Alparslan Y.

I confirm your issue. The function actually uses the same srcStep for both the source and destination images. We will fix this in the next release. Also, the function has an optimization for nBitSize <= 4. We will consider whether optimizations for other bit sizes can be added in future IPP releases.

Thanks for your feedback.
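
To see why this kind of defect leaves only the first row correct, the sketch below models it in plain C++: the destination pointer is advanced by srcStep instead of dstStep, and the result is then read back with the caller's dstStep. This is an illustrative model of the behaviour described above, not IPP source code; the names lutWithSharedStep and countCorrectRowsWithSharedStep, the 5x4 image, and the "+1" table are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of the described defect (NOT IPP code): the dstStep argument is
// ignored and destination rows are placed srcStep bytes apart.
void lutWithSharedStep(const std::uint8_t* src, std::size_t srcStep,
                       std::uint8_t* dst, std::size_t /*dstStep, ignored*/,
                       std::size_t w, std::size_t h,
                       const std::uint8_t* lut)
{
    for (std::size_t j = 0; j < h; ++j)
        for (std::size_t i = 0; i < w; ++i)
            dst[j * srcStep + i] = lut[src[j * srcStep + i]];
}

// Hypothetical check: counts destination rows that are correct when the
// caller reads them back using its own dstStep.
std::size_t countCorrectRowsWithSharedStep()
{
    const std::size_t w = 5, h = 4, srcStep = 8, dstStep = 6;

    // Assumed table for the check: adds 1 to every value (wrapping at 255).
    std::vector<std::uint8_t> lut(256);
    for (int v = 0; v < 256; ++v)
        lut[v] = static_cast<std::uint8_t>((v + 1) & 0xFF);

    std::vector<std::uint8_t> src(srcStep * h, 0);
    for (std::size_t j = 0; j < h; ++j)
        for (std::size_t i = 0; i < w; ++i)
            src[j * srcStep + i] = static_cast<std::uint8_t>(10 * j + i);

    // Oversized on purpose so the misplaced writes stay in bounds here.
    // A caller that allocated only dstStep * h bytes could be overrun.
    std::vector<std::uint8_t> dst(srcStep * h, 0);

    lutWithSharedStep(src.data(), srcStep, dst.data(), dstStep, w, h, lut.data());

    std::size_t correctRows = 0;
    for (std::size_t j = 0; j < h; ++j) {
        bool rowOk = true;
        for (std::size_t i = 0; i < w; ++i)
            if (dst[j * dstStep + i] != lut[src[j * srcStep + i]])
                rowOk = false;
        if (rowOk)
            ++correctRows;
    }
    return correctRows;  // 1 for this layout: only row 0 lands where expected
}
```

Row 0 starts at offset 0 under either step, so it is unaffected; every later row is written at a multiple of srcStep but read at a multiple of dstStep, which matches the symptom reported at the top of the thread.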

 

Alparslan_Y_
Beginner

Thank you very much, looking forward to the next release.
