Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Sample-generating functions in ipps

j3r3mi
Beginner
670 Views
I tested and benchmarked the uniform and Gaussian random-number functions using OpenCV 2.0a and IPP 6.1.
I observed a large timing improvement for uniform random generation with IPP compared to OpenCV 2.0's cv::randu.
However, IPP's Gaussian generator seems to run much slower than OpenCV 2.0's cv::randn.
I tested with vectors of length 2000.
Has anyone else observed the same thing?
My test program is below.
[bash]// test.cpp
// OpenCV 2.0 libs
#ifdef _DEBUG
   #pragma comment(lib, "cv200d.lib")
   #pragma comment(lib, "cxcore200d.lib")
   #pragma comment(lib, "ml200d.lib")
#else
   #pragma comment(lib, "cv200.lib")
   #pragma comment(lib, "cxcore200.lib")
   #pragma comment(lib, "ml200.lib")
#endif

// IPP 6.1 libs : static-merged linking.
#pragma comment(lib, "ippcorel.lib")
#pragma comment(lib, "ippiemerged.lib")
#pragma comment(lib, "ippimerged.lib")
#pragma comment(lib, "ippcvemerged.lib")
#pragma comment(lib, "ippcvmerged.lib")
#pragma comment(lib, "ippsemerged.lib")
#pragma comment(lib, "ippsmerged.lib")
#pragma comment(lib, "ippvmemerged.lib")
#pragma comment(lib, "ippvmmerged.lib")
#pragma comment(lib, "ippmemerged.lib")
#pragma comment(lib, "ippmmerged.lib")

#include <stdio.h>   // printf
#include <time.h>    // time
#include "ipp.h"
#include "cv.h"
#define N    2000

int main(void)
{
    cv::Mat_<float> mtSamples(1, N);
    double          t         = 0;
    double          f         = cv::getTickFrequency();
    unsigned int    seed      = (unsigned int)time(NULL);
    float*          pSamples  = (float*)mtSamples.datastart;
    IppStatus       status    = ippInit();
    printf("Ipp Init() - %s\n", ippGetStatusString(status));
    
    // IPP uniform random.
    t = (double)cv::getTickCount();
    ippsRandUniform_Direct_32f(pSamples, N, 0, 1, &seed);
    t = (double)cv::getTickCount() - t;
    printf("IPP Uniform rand : %f ms\n", t * 1000 / f);

    // OpenCV uniform random.
    t = (double)cv::getTickCount();
    cv::randu(mtSamples, 0, 1);
    t = (double)cv::getTickCount() - t;
    printf("OpenCV Uniform rand : %f ms\n", t * 1000 / f);

    // IPP Gaussian random.
    t = (double)cv::getTickCount();
    ippsRandGauss_Direct_32f(pSamples, N, 0, 1, &seed);
    t = (double)cv::getTickCount() - t;
    printf("IPP Gaussian rand : %f ms\n", t * 1000 / f);

    // OpenCV Gaussian random.
    t = (double)cv::getTickCount();
    cv::randn(mtSamples, 0, 1);
    t = (double)cv::getTickCount() - t;
    printf("OpenCV Gaussian rand : %f ms\n", t * 1000 / f);
}[/bash]
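A note on the measurement itself: a single timed call on a short vector can be dominated by one-time costs such as cold caches, so repeating each call and keeping the fastest run usually gives a steadier number. The following is only a minimal sketch of that idea, assuming a C++11 compiler; `best_time_ms` is a hypothetical helper and is not part of IPP or OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical helper (not an IPP or OpenCV function): calls `fn` exactly
// `reps` times and returns the fastest wall-clock time in milliseconds.
template <typename Fn>
double best_time_ms(Fn fn, int reps)
{
    assert(reps > 0);
    double best = 1e300;  // larger than any realistic timing
    for (int i = 0; i < reps; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        fn();
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        best = std::min(best, ms);
    }
    return best;
}
```

In the test program, each generator call could be wrapped in a lambda, for example `best_time_ms([&]{ ippsRandGauss_Direct_32f(pSamples, N, 0, 1, &seed); }, 100)`, and the same wrapper used for the OpenCV call so both sides are measured the same way.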
Am I missing something?
Thanks in advance!
Br.
Jeremy
6 Replies
Chao_Y_Intel
Moderator

Jeremy,

For Gaussian random, what is the performance data for IPP and OpenCV in your test?

On my system (Intel Core 2 Duo processor), the IPP Gaussian random generator takes about 290 clock ticks per element. Is this consistent with your test?

Thanks,

Chao

apolo74
Beginner
Hi guys,
I tested Jeremy's code and it seems he is right: I get better performance from OpenCV than from IPP. I don't know why, since from what I understand OpenCV is supposed to be based on IPP. It probably has something to do with initialization. In any case, I'd like a solution as soon as possible, since I'm working with vectors of around 20000 elements, and for these IPP takes almost 3 times longer than OpenCV. I'd be grateful for any good advice on how to speed up IPP.
Boris
Vladimir_Dudnik
Employee
Hello,

Please make sure you call the ippInit function before any other IPP function when you link IPP statically. That initializes the IPP dispatcher and ensures that the most optimized code in the IPP library is selected to run.

Regards,
Vladimir
apolo74
Beginner

Hi Vladimir,

As you can see in the code, we are already calling ippInit(). I just hope this is the right way to do it; please correct us if it is wrong. Could you try this test yourself? I'm attaching the code that I'm running and the results (note that I'm using 20000 elements):
[bash]// testRand.cpp
#include <iostream>   // cout
#include <ctime>      // time
#include "ipp.h"
#include "cv.h"
#include "tbb/tick_count.h"

#define N 20000

using namespace std;

int main(void)
{
    cv::Mat mtSamples(1, N, CV_32FC1);
    double t = 0;
    double f = cv::getTickFrequency();
    unsigned int seed = (unsigned int)time(NULL);
    float* pSamples = (float*)mtSamples.datastart;

    IppStatus status = ippInit();
    cout << "Ipp Init() - " << ippGetStatusString(status) << endl;

    // IPP uniform random.
    tbb::tick_count tStart = tbb::tick_count::now();
    ippsRandUniform_Direct_32f(pSamples, N, 0, 1, &seed);
    tbb::tick_count tEnd = tbb::tick_count::now();
    cout << "IPP Uniform rand:\t" << 1000*(tEnd-tStart).seconds() << " ms" << endl;

    // OpenCV uniform random.
    tStart = tbb::tick_count::now();
    cv::randu(mtSamples, 0, 1);
    tEnd = tbb::tick_count::now();
    cout << "OpenCV Uniform rand:\t" << 1000*(tEnd-tStart).seconds() << " ms" << endl;

    // IPP Gaussian random.
    tStart = tbb::tick_count::now();
    ippsRandGauss_Direct_32f(pSamples, N, 0, 1, &seed);
    tEnd = tbb::tick_count::now();
    cout << "IPP Gaussian rand:\t" << 1000*(tEnd-tStart).seconds() << " ms" << endl;

    // OpenCV Gaussian random.
    tStart = tbb::tick_count::now();
    cv::randn(mtSamples, 0, 1);
    tEnd = tbb::tick_count::now();
    cout << "OpenCV Gaussian rand:\t" << 1000*(tEnd-tStart).seconds() << " ms" << endl;

    return 0;
}[/bash]
Results (Intel Core2 Duo CPU T9550 @ 2.66GHz, Ubuntu 10.04):
Ipp Init() - ippStsNoErr: No error, it's OK
IPP Uniform rand: 0.119987 ms
OpenCV Uniform rand: 0.142476 ms
IPP Gaussian rand: 0.633181 ms
OpenCV Gaussian rand: 0.240254 ms

Boris
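To compare these millisecond timings with a per-element clock count, they can be converted to cycles per element. This is a rough sketch only: it assumes the core ran at its nominal 2.66 GHz for the whole measurement, and `cycles_per_element` is a hypothetical helper, not an IPP or OpenCV function.

```cpp
#include <cassert>

// Hypothetical helper: converts an elapsed time in milliseconds to CPU cycles
// per element, assuming a constant clock of `clock_ghz` during the measurement.
double cycles_per_element(double elapsed_ms, double clock_ghz, int n)
{
    assert(n > 0);
    // ms * 1e-3 s/ms * (GHz * 1e9 cycles/s) = ms * GHz * 1e6 cycles in total
    return elapsed_ms * clock_ghz * 1e6 / n;
}
```

With the Gaussian results above, `cycles_per_element(0.633181, 2.66, 20000)` gives roughly 84 cycles per element for IPP and `cycles_per_element(0.240254, 2.66, 20000)` roughly 32 for OpenCV, a ratio of about 2.6. These come from a single timed call each, so they are indicative only.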

Chao_Y_Intel
Moderator

Hello Boris,

Thanks for the report. The difference is caused by the different algorithms used by Intel IPP and OpenCV.

By looking at the OpenCV source:

https://code.ros.org/trac/opencv/browser/trunk/opencv/src/cxcore/cxrand.cpp?rev=2059

It looks like it uses "The Ziggurat Method for Generating Random Variables". Intel IPP uses the Marsaglia polar method.
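For readers unfamiliar with the polar method, here is a minimal textbook-style sketch. It is not IPP's actual implementation: the function name `fill_gauss_polar` is hypothetical, and the use of `std::mt19937` as the uniform source is an assumption made only for illustration. It shows where the cost comes from: each pair of outputs needs a rejection loop (about 21% of candidate points fall outside the unit circle and are discarded) plus one logarithm and one square root, whereas a Ziggurat generator typically resolves most samples with a table lookup and a comparison.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical illustration of the Marsaglia polar method (not IPP code).
// Fills `out` with samples from a normal distribution N(mean, stdev^2).
void fill_gauss_polar(std::vector<float>& out, float mean, float stdev,
                      std::mt19937& gen)
{
    assert(stdev >= 0.0f);
    std::uniform_real_distribution<double> uni(-1.0, 1.0);
    std::size_t i = 0;
    while (i < out.size()) {
        double u, v, s;
        do {
            // Draw a point in the square [-1, 1) x [-1, 1) ...
            u = uni(gen);
            v = uni(gen);
            s = u * u + v * v;
        } while (s >= 1.0 || s == 0.0);  // ... and keep it only if inside the unit circle

        // One log and one sqrt yield two independent standard normal values.
        double k = std::sqrt(-2.0 * std::log(s) / s);
        out[i++] = static_cast<float>(mean + stdev * u * k);
        if (i < out.size())
            out[i++] = static_cast<float>(mean + stdev * v * k);
    }
}
```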

Actually, there are many different random number generation methods. The Intel MKL library provides many of these algorithms in its VSL functions:
http://software.intel.com/en-us/articles/intel-mkl-vmlvsl-training-material/
Different methods differ in performance, random number quality, threading capability, etc. If you need specific performance or quality, the VSL functions offer a richer set of choices.

Our engineers will also take a look at the Ziggurat algorithm to see whether it would be a better choice for the current implementation.

Thanks,
Chao

apolo74
Beginner
Hi Chao, and thanks for the explanation; it feels much better to know the cause of a specific problem. I hope this IPP function can be improved, since it doesn't make much sense to call OpenCV only to generate a random vector. Thanks again, and I'll be waiting for a new implementation of this function.
Boris