Hi All,

I am comparing ippmSub_vav_64f with a regular loop implementation. The IPP version is significantly slower.

Compile line:

icpc -O3 -ipp=common test.cpp

results:

Ipp Time (uSec):1228

Regular Loop Time (uSec):881

Runs on:

Intel Xeon CPU X5550 @ 2.67GHz

Code:

#include <iostream>

#include <cmath>

#include <sys/time.h>

#include <ipp.h>

#define VEC_SIZE 20000

#define DIM 3

#define REPEAT_SIZE 10

int main(){

// Output Array

double *aIpp=new double[VEC_SIZE*DIM];

double *aLoop=new double[VEC_SIZE*DIM];

// Rand Arrays

double *temp_a=new double[VEC_SIZE*DIM];

double *temp_b=new double[DIM];

unsigned int seed=5;

int j,d,i;

int stride0=sizeof(double),stride2=VEC_SIZE*sizeof(double);

// Timing Vars

timeval startTime;

timeval endTime;

double tS,tE;

// Draw Arrays

ippsRandUniform_Direct_64f(temp_a, VEC_SIZE*DIM,0,1000,&seed);

ippsRandUniform_Direct_64f(temp_b, DIM,0,1000,&seed);

// IPP Sub

gettimeofday(&startTime, NULL);

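for (j=0; j<REPEAT_SIZE; j++){

ippmSub_vav_64f(...); // arguments not preserved in the post; result is written to aIpp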

}

gettimeofday(&endTime, NULL);

tS = startTime.tv_sec*1000000 + (startTime.tv_usec);

tE = endTime.tv_sec*1000000 + (endTime.tv_usec);

std::cout<< "Ipp Time (uSec):" << (tE-tS) << "\n";

// Regular Sub

gettimeofday(&startTime, NULL);

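for (j=0; j<REPEAT_SIZE; j++){

for (d=0; d<DIM; d++){

for (i=0; i<VEC_SIZE; i++){

aLoop[i+VEC_SIZE*d]=temp_a[i+VEC_SIZE*d]-temp_b[d];

}

}

}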

gettimeofday(&endTime, NULL);

tS = startTime.tv_sec*1000000 + (startTime.tv_usec);

tE = endTime.tv_sec*1000000 + (endTime.tv_usec);

std::cout<< "Regular Loop Time (uSec):" << (tE-tS) << "\n";

for (i=0; i<VEC_SIZE*DIM; i++){

if (std::fabs(aLoop[i]-aIpp[i])>0.0001) std::cout <<"Error";

}

}

Any thoughts?

Thanks !

Snir

Hello,

It looks like you are operating on vectors of size 20000. IPP MX functions are optimized for operations on small matrices and small vectors, particularly matrices of size 3x3, 4x4, 5x5, and 6x6, and vectors of length 3, 4, 5, and 6.

For simple C code like the loop you tested, the compiler can easily vectorize it and get good performance.

Thanks,

Chao

I think this setting fits the IPP MX target.

Thanks

Snir

Snir,

In your code, the following inner loop takes most of the time. It just subtracts a constant, temp_b[d], from every element:

for (i=0; i<VEC_SIZE; i++){

aLoop[i+VEC_SIZE*d]=temp_a[i+VEC_SIZE*d]-temp_b[d];

}

A good replacement for such code is the following IPP function call:

ippsSubC_64f(...).

Thanks,

Chao
