Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ipp9 much slower than ipp8?

alon_b_
Beginner
614 Views

Hey, I've updated my code (and compiler) to the 2016 studio, and it seems to run a lot slower. Using VTune, I see that the IPP functions are much slower: for example, ippiMean used to take 0.020 s and now takes 0.132 s, which makes my program run a LOT slower. Am I doing something wrong? Is this a known issue?

5 Replies
alon_b_
Beginner

I have the following sample code:

#include <iostream>
#include "ippi.h"
#include "ippcore.h"
using namespace std;
IppStatus mean( void ) {
    Ipp64f mean;
    Ipp8u x[5*4];
    IppiSize roi = {5,4};
    ippiSet_8u_C1R( 3, x, 5, roi );
    return ippiMean_8u_C1R( x, 5, roi, &mean );
}
void func_normdiff_l1()
{
    Ipp8u pSrc1[8*4];
    Ipp8u pSrc2[8*4];  
    Ipp64f Value;
    int src1Step = 8;
    int src2Step = 8;
    IppiSize roi = {8,4};
    IppiSize roiSize = {5,4};

    ippiSet_8u_C1R(1, pSrc1, src1Step, roi);
    ippiSet_8u_C1R(2, pSrc2, src2Step, roi);

    ippiNormDiff_L1_8u_C1R( pSrc1, src1Step, pSrc2, src2Step, roiSize, &Value);
}

int main(int argc, char* argv[]) {
    ippInit();
    for (int i = 0 ; i < 1000000 ; ++i) {
        mean();
        func_normdiff_l1();
    }
}

Compiling and running with:

icpc -o main main.cpp -lippi && time ./main

with 2013:

real    0m0.281s
user    0m0.280s
sys    0m0.000s

with 2016:

real    0m0.574s
user    0m0.569s
sys    0m0.005s
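
For scale, the two "real" times above cover 1,000,000 loop iterations each, which works out to about 281 ns versus 574 ns per iteration (one mean() call plus one func_normdiff_l1() call), roughly a 2x slowdown. A minimal sketch of that arithmetic follows; the Comparison struct and compare() helper are hypothetical names used only for illustration and are not part of IPP.

```cpp
#include <cassert>

// Hypothetical helper type, not part of IPP: per-iteration cost and slowdown.
struct Comparison {
    double ns_per_iter_old;  // nanoseconds per loop iteration, older build
    double ns_per_iter_new;  // nanoseconds per loop iteration, newer build
    double slowdown;         // newer time divided by older time
};

// Converts two total wall-clock times (in seconds) for `iters` loop
// iterations into per-iteration nanoseconds and a slowdown factor.
Comparison compare(double old_seconds, double new_seconds, long iters) {
    assert(iters > 0 && old_seconds > 0.0);
    Comparison c;
    c.ns_per_iter_old = old_seconds * 1e9 / static_cast<double>(iters);
    c.ns_per_iter_new = new_seconds * 1e9 / static_cast<double>(iters);
    c.slowdown = new_seconds / old_seconds;
    return c;
}
```

Calling compare(0.281, 0.574, 1000000) reproduces the figures quoted above. At a few hundred nanoseconds per iteration on 5x4 and 8x4 images, fixed per-call cost is likely a large part of what is being measured.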

 

I would love to understand why 2016 is so much slower...

 

Thanks!

Gennady_F_Intel
Moderator

The problem size in your case is very small, and VTune is not the right tool for this kind of measurement. If you want a direct performance measurement, you can call ippGetCpuClocks before and after ippiMean_*. However, I would recommend using the IPP perfsys tool to compare performance on your system with the current version of IPP.
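
A minimal sketch of the direct-measurement idea, under stated assumptions: it uses std::chrono rather than ippGetCpuClocks so that it builds without IPP, and mean_8u is a plain C++ stand-in for ippiMean_8u_C1R, not the IPP implementation. The ns_per_call helper is a hypothetical name. In real code you would pass a lambda that calls the IPP function, and you could swap the clock reads for ippGetCpuClocks.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Stand-in for ippiMean_8u_C1R (NOT the IPP implementation): mean of an
// 8-bit single-channel image whose rows are `step` bytes apart.
double mean_8u(const std::uint8_t* src, int step, int width, int height) {
    assert(width > 0 && height > 0 && step >= width);
    std::uint64_t sum = 0;
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::ptrdiff_t>(y) * step;
        for (int x = 0; x < width; ++x) sum += row[x];
    }
    return static_cast<double>(sum) / (static_cast<double>(width) * height);
}

// Hypothetical helper: average wall-clock nanoseconds per call of f,
// measured over `reps` calls after one untimed warm-up call.
// f must return a value convertible to double; the result is written to a
// volatile sink so the compiler cannot discard the calls.
template <typename F>
double ns_per_call(F&& f, int reps) {
    assert(reps > 0);
    volatile double sink = f();  // warm-up call, not timed
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) sink = f();
    const auto t1 = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / reps;
}
```

For example, ns_per_call([&] { return mean_8u(img, step, width, height); }, 1000) gives the average cost of one call on a buffer img of your choosing. Timing only the function of interest, on an image size close to the real workload, separates the library's compute time from loop and setup overhead.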

 

 

alon_b_
Beginner

Hey Gennady, thanks for your response.

I run IPP on very large samples, and that is where I noticed the slowdown. I added this example as a simple, reproducible case in which IPP 9 runs about half as fast (500 ms is a lot; I doubt it is within measurement error).

It's also worth noting that we checked which architecture-specific code path IPP 8 and IPP 9 selected, and both correctly selected the same one.

Gennady_F_Intel
Moderator

OK, thanks for the update. I have two more questions: 1) What CPU type are you working on? You can add the code below to check which specific branch of IPP is being called.

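#include <stdio.h>
#include "ippi.h"

/* Prints the name and version of the ippi library branch that was loaded. */
void libinfo(void) {
    const IppLibraryVersion* lib = ippiGetLibVersion();
    printf("%s %s %d.%d.%d.%d\n", lib->Name, lib->Version,
           lib->major, lib->minor, lib->majorBuild, lib->build);
}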

2) What is the size of the "very big samples" you mentioned?

Thanks, Gennady

Gennady_F_Intel
Moderator

We reproduced the problem on our side. The issue is caused by the mul_32fc function. We are planning to fix it in the next update of IPP.
