Hey, I've updated my code (and compiler) to the 2016 studio, and it seems to run a lot slower. After profiling with VTune I see that the IPP functions are much slower. For example, where ippiMean took 0.020s it now takes 0.132s, which makes my program run a LOT slower. Am I doing something wrong? Is this a known issue?
I have the following sample code:
#include <iostream>
#include "ippi.h"
#include "ippcore.h"
using namespace std;

// Fill a 5x4 8-bit image with the value 3 and compute its mean.
IppStatus mean( void ) {
    Ipp64f mean;
    Ipp8u x[5*4];
    IppiSize roi = {5,4};
    ippiSet_8u_C1R( 3, x, 5, roi );
    return ippiMean_8u_C1R( x, 5, roi, &mean );
}

// Fill two 8x4 8-bit images and compute the L1 norm of their difference over a 5x4 ROI.
void func_normdiff_l1()
{
    Ipp8u pSrc1[8*4];
    Ipp8u pSrc2[8*4];
    Ipp64f Value;
    int src1Step = 8;
    int src2Step = 8;
    IppiSize roi = {8,4};
    IppiSize roiSize = {5,4};
    ippiSet_8u_C1R(1, pSrc1, src1Step, roi);
    ippiSet_8u_C1R(2, pSrc2, src2Step, roi);
    ippiNormDiff_L1_8u_C1R( pSrc1, src1Step, pSrc2, src2Step, roiSize, &Value);
}

int main(int argc, char* argv[]) {
    ippInit();  // select the CPU-specific IPP code path
    // Call both functions one million times so the total run time is measurable.
    for (int i = 0 ; i < 1000000 ; ++i) {
        mean();
        func_normdiff_l1();
    }
}
Compiling and running with:
icpc -o main main.cpp -lippi && time ./main
with 2013:
real 0m0.281s
user 0m0.280s
sys 0m0.000s
with 2016:
real 0m0.574s
user 0m0.569s
sys 0m0.005s
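For scale, here is a rough sketch of the arithmetic behind those two timings. The helper names (`ns_per_iteration`, `slowdown`, `print_summary`) are made up for illustration. The per-call figure assumes the five IPP calls made in each loop iteration (three `ippiSet_8u_C1R`, one `ippiMean_8u_C1R`, one `ippiNormDiff_L1_8u_C1R`) cost about the same, and it ignores process startup and `ippInit()`, so treat it as an upper bound rather than a measurement.

```cpp
#include <cassert>
#include <cstdio>

// Average wall-clock cost of one loop iteration, in nanoseconds.
double ns_per_iteration(double total_seconds, long iterations) {
    assert(iterations > 0);
    return total_seconds * 1e9 / static_cast<double>(iterations);
}

// How many times slower the new run is compared with the old one.
double slowdown(double old_seconds, double new_seconds) {
    assert(old_seconds > 0.0);
    return new_seconds / old_seconds;
}

// Prints per-iteration and per-call cost for the two "real" times above.
void print_summary() {
    const long iterations = 1000000;        // loop count in main()
    const int ipp_calls_per_iteration = 5;  // 3x Set + Mean + NormDiff
    const double t2013 = 0.281;             // seconds, 2013 build
    const double t2016 = 0.574;             // seconds, 2016 build

    const double ns2013 = ns_per_iteration(t2013, iterations);
    const double ns2016 = ns_per_iteration(t2016, iterations);
    std::printf("2013: %.0f ns per iteration, about %.0f ns per IPP call\n",
                ns2013, ns2013 / ipp_calls_per_iteration);
    std::printf("2016: %.0f ns per iteration, about %.0f ns per IPP call\n",
                ns2016, ns2016 / ipp_calls_per_iteration);
    std::printf("slowdown: %.2fx\n", slowdown(t2013, t2016));
}
```

That works out to roughly 281 ns versus 574 ns per iteration, about a 2x slowdown. At these sizes each call takes on the order of tens of nanoseconds, so fixed per-call overhead is likely a large share of what is being timed.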
I would love to understand why 2016 is so much slower...
Thanks!
The problem size is very small in your case, and VTune is not the right tool for this kind of measurement. If you want a direct performance measurement, you can try calling ippGetCpuClocks before and after ippiMean_*. However, I would recommend using the IPP perfsys tool to compare performance on your system with the current version of IPP.
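As a minimal sketch of the "measure directly around the call" idea, the snippet below times many back-to-back calls and keeps the best of several trials. It deliberately swaps in `std::chrono::steady_clock` for `ippGetCpuClocks` and a plain C++ stand-in (`mean_8u`, a hypothetical helper, not an IPP function) for `ippiMean_8u_C1R`, so that it builds without IPP. With IPP available, the lambda body would wrap the real call instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Stand-in workload (NOT an IPP call): mean of a small 8-bit image.
double mean_8u(const std::uint8_t* data, int step, int width, int height) {
    assert(width > 0 && height > 0 && step >= width);
    std::uint64_t sum = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            sum += data[y * step + x];
    return static_cast<double>(sum) / (static_cast<double>(width) * height);
}

// Runs f `reps` times back to back, repeats that `trials` times, and returns
// the lowest average cost per call in nanoseconds.
template <typename F>
double best_ns_per_call(F&& f, int reps, int trials) {
    assert(reps > 0 && trials > 0);
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int t = 0; t < trials; ++t) {
        const auto start = clock::now();
        for (int i = 0; i < reps; ++i) f();
        const auto stop = clock::now();
        const double ns =
            std::chrono::duration<double, std::nano>(stop - start).count() / reps;
        if (best < 0.0 || ns < best) best = ns;
    }
    return best;
}

// Times the stand-in on the same 5x4 image size used in the sample code.
double time_small_mean() {
    std::vector<std::uint8_t> img(5 * 4, 3);
    volatile double sink = 0.0;  // keeps the compiler from dropping the call
    return best_ns_per_call(
        [&] { sink = mean_8u(img.data(), 5, 5, 4); }, 100000, 5);
}
```

Taking the best of several trials filters out scheduling noise and warm-up effects, which matters when a single call takes only tens of nanoseconds.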
Hey Gennady, thanks for your response.
I have IPP running on very big samples, and that is where I noticed the slowdown. I added this example as simple, reproducible code in which IPP 9 runs half as fast (500 ms is a lot; I doubt it's within the margin of error).
It's worth noting that we also checked which architecture IPP 8 and IPP 9 selected, and both correctly chose the same one.
OK, thanks for the update. I have two more questions:
1) What CPU type are you working on? You can add the code below to check which specific branch of IPP has been called.
#include <stdio.h>
#include "ippi.h"
void libinfo(void) {
    const IppLibraryVersion* lib = ippiGetLibVersion();
    printf("%s %s %d.%d.%d.%d\n", lib->Name, lib->Version, lib->major, lib->minor, lib->majorBuild, lib->build);
}
2) How large are the "very big samples" you are running IPP on?
Thanks, Gennady
We reproduced the problem on our side; the issue is caused by the mul_32fc function. We are planning to fix it in the next update of IPP.
