Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Core 2 Duo vs. Celeron -> calculation time!?!

gregsch
Beginner
486 Views

Hi!

I'm still working on an IPP performance test.

In this case I checked some division functions (see the code example below).

The build mode was "single-threaded". The executable was run on a Celeron and on a Core 2 Duo processor.

The Core 2 Duo takes 83 microseconds of calculation time; the Celeron takes 9 microseconds. (In multithreaded mode the Core 2 Duo takes 4 microseconds.)

Can you explain the cause of this effect (83 microseconds?!) to me?

Is there any way to reduce the calculation time on the dual-core machine?

Thanks, Gregor

// Assumed standard headers (names missing in the original post):
// needed for std::cout / std::cerr and std::string.
#include <iostream>
#include <string>
#include "Windows.h"
#include "conio.h"
#include "ippcore.h"
#include "ipp.h"
#include "ipps.h"
#include "ippi.h"
#include "ScopeTimer.h"

int main()
{
    try
    {
        int num = 1024;

        Ipp8u* ippsptrsrc1 = ippsMalloc_8u(num);
        Ipp8u* ippsptrsrc2 = ippsMalloc_8u(num);
        Ipp8u* ippsptrdst = ippsMalloc_8u(num);

        ippsSet_8u(100, ippsptrsrc1, num);
        ippsSet_8u(100, ippsptrsrc2, num);

        // First (untimed) call, also used to check the status.
        IppStatus tempIppStatus = ippsDiv_8u_Sfs(ippsptrsrc1, ippsptrsrc2, ippsptrdst, num, -1);
        if (tempIppStatus != ippStsNoErr)
        {
            throw std::string(ippGetStatusString(tempIppStatus));
        }

        double dt = 0;
        {
            // The timer writes the elapsed microseconds to dt when this scope ends.
            ScopeTimer timer(dt);
            ippsDiv_8u_Sfs(ippsptrsrc1, ippsptrsrc2, ippsptrdst, num, 1);
        }

        ippsFree(ippsptrsrc1);
        ippsFree(ippsptrsrc2);
        ippsFree(ippsptrdst);

        std::cout << "calctime " << dt << " mikroSec" << std::endl;
    } //try
    catch( std::string& e )
    {
        std::cerr << "String exception: " << e << std::endl;
    }
    catch( ... )
    {
        std::cerr << "unhandled exception" << std::endl;
    }

    getch();
    return 0;
}

// ScopeTimer class (presumably the contents of ScopeTimer.h)
class ScopeTimer
{
public:
    ScopeTimer(double& outmicrosecs) : microsecsElapsed_(outmicrosecs)
    {
        QueryPerformanceFrequency(&frequency_);
        QueryPerformanceCounter(&countStart_);
    }

    ~ScopeTimer()
    {
        QueryPerformanceCounter(&countStop_);
        const double countDif = static_cast< double >(countStop_.QuadPart-countStart_.QuadPart);
        // Convert counter ticks to microseconds, rounded to the nearest integer.
        microsecsElapsed_ = static_cast<int>((1000000*((double)(1.0/(double)frequency_.QuadPart))*countDif)+.5);
    }

private:
    double& microsecsElapsed_;
    LARGE_INTEGER countStart_;
    LARGE_INTEGER countStop_;
    LARGE_INTEGER frequency_;
};
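For readers who want to try the same scope-timer pattern outside Windows, here is a minimal sketch using std::chrono instead of QueryPerformanceCounter. The names ChronoScopeTimer, spinFor and timeSpinMicros are hypothetical and not part of the original post or of IPP, and the busy-wait only stands in for the IPP call being timed.

```cpp
#include <chrono>

// Hypothetical portable stand-in for the Windows-only ScopeTimer.
// Writes the elapsed time in microseconds to `out` when it goes out of scope.
class ChronoScopeTimer {
public:
    explicit ChronoScopeTimer(double& outMicrosecs)
        : out_(outMicrosecs), start_(std::chrono::steady_clock::now()) {}

    ~ChronoScopeTimer() {
        const auto stop = std::chrono::steady_clock::now();
        out_ = std::chrono::duration<double, std::micro>(stop - start_).count();
    }

    ChronoScopeTimer(const ChronoScopeTimer&) = delete;
    ChronoScopeTimer& operator=(const ChronoScopeTimer&) = delete;

private:
    double& out_;
    std::chrono::steady_clock::time_point start_;
};

// Busy-wait for at least `d`; a placeholder for the work being measured.
void spinFor(std::chrono::microseconds d) {
    const auto end = std::chrono::steady_clock::now() + d;
    while (std::chrono::steady_clock::now() < end) {
    }
}

// Time a busy-wait of `micros` microseconds and return the measured value.
double timeSpinMicros(long long micros) {
    double dt = 0.0;
    {
        ChronoScopeTimer timer(dt);  // starts timing here
        spinFor(std::chrono::microseconds(micros));
    }                                // destructor stores the elapsed time in dt
    return dt;
}
```

Unlike the original class, this sketch keeps the fractional part of the microsecond value rather than rounding it to an integer.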

5 Replies
Vladimir_Dudnik
Employee

Hello,

What exactly do you mean by single-thread mode and multithread mode?

In IPP 5.3.1 we provide DLLs that are threaded internally, static libraries that are not threaded, and static libraries that are threaded. Which libraries did you use? Which compiler did you use to build your test?

In any case, it is dangerous to link threaded modules into a single-threaded application; there are important differences in the C run-time that may cause issues.

Regards,
Vladimir

gregsch
Beginner

Hi! By single-thread mode or multithread mode I mean the run-time library selected in the project (Project Properties -> C/C++ -> Code Generation -> Runtime Library: Multi-threaded / Single-threaded). I'm using Visual C++ version 7.1.30.88 (2003).

I'm confused by the increase in calculation time. Multithreading is disabled. Why does the dual-core machine take so much longer than the Celeron machine (is the FPU or something similar being slowed down?)? The division operations in particular show this effect. Other threaded functions give the expected calculation time.

Regards, Greg

P.S.: I can send a calculation time analysis showing this effect.

Vladimir_Dudnik
Employee

As I said in my previous post, it is incorrect to link multi-threaded libraries (I assume you link with the IPP DLLs) and single-threaded run-time libraries into one executable.

You had better link with the non-threaded static IPP libraries if you really need a non-threaded application.

Vladimir

gregsch
Beginner

Hello!
Another question on this topic:
I want to use only one core on my dual-core machine. The second core should be unused.
The first lines of my main program (see above) look like this:

DWORD processorBitMask = 3;
BOOL success = ::SetProcessAffinityMask(::GetCurrentProcess(), processorBitMask );
SetProcessPriorityBoost( ::GetCurrentProcess(), TRUE );

I link with the multithreaded setting.
Now my system should use only one of the two cores for my division.
So the expected calculation time should be about the same as on a single-core system (for example, Celeron = 9 microseconds).
But the calculation still takes about 84 microseconds...
Why does the calculation take such a long time?!

Regards, Gregor
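A side note on the mask value used above: the affinity mask passed to SetProcessAffinityMask is a bit vector in which bit i stands for logical processor i, so a value of 3 (binary 11) allows both cores rather than one. The sketch below is plain bit arithmetic to illustrate this; maskForCpus and cpuCount are hypothetical helper names, not Windows or IPP functions, and the sketch does not address how IPP's own threads are created.

```cpp
#include <cstdint>
#include <initializer_list>

// Build an affinity-style bitmask: bit i set means logical CPU i is allowed.
// Hypothetical helper for illustration only.
std::uint64_t maskForCpus(std::initializer_list<unsigned> cpus) {
    std::uint64_t mask = 0;
    for (unsigned cpu : cpus) {
        if (cpu < 64) {
            mask |= (std::uint64_t{1} << cpu);
        }
    }
    return mask;
}

// Count how many CPUs a mask allows.
unsigned cpuCount(std::uint64_t mask) {
    unsigned n = 0;
    for (; mask != 0; mask &= mask - 1) {  // clears the lowest set bit each pass
        ++n;
    }
    return n;
}
```

With these helpers, maskForCpus({0}) yields 1 (first core only), maskForCpus({1}) yields 2 (second core only), and maskForCpus({0, 1}) yields 3, the value used in the post.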

Ivan_Z_Intel
Employee

IPP threading is based on OpenMP, which knows nothing about your thread-affinity settings. If you want IPP to run in a single thread, please use the function intended for that purpose: ippSetNumThreads(1).

// Assumed standard headers (names missing in the original post):
// needed for std::cout / std::cerr and std::string.
#include <iostream>
#include <string>
#include "Windows.h"
#include "conio.h"
#include "ippcore.h"
#include "ipp.h"
#include "ipps.h"
#include "ippi.h"
//#include "ScopeTimer.h"

class ScopeTimer
{
public:
    ScopeTimer(double& outmicrosecs) : microsecsElapsed_(outmicrosecs)
    {
        QueryPerformanceFrequency(&frequency_);
        QueryPerformanceCounter(&countStart_);
    }

    ~ScopeTimer()
    {
        QueryPerformanceCounter(&countStop_);
        const double countDif = static_cast< double >(countStop_.QuadPart-countStart_.QuadPart);
        microsecsElapsed_ = static_cast<int>((1000000*((double)(1.0/(double)frequency_.QuadPart))*countDif)+.5);
    }

private:
    double& microsecsElapsed_;
    LARGE_INTEGER countStart_;
    LARGE_INTEGER countStop_;
    LARGE_INTEGER frequency_;
};

int main()
{
    try
    {
        int num = 1024;

        Ipp8u* ippsptrsrc1 = ippsMalloc_8u(num);
        Ipp8u* ippsptrsrc2 = ippsMalloc_8u(num);
        Ipp8u* ippsptrdst = ippsMalloc_8u(num);

        ippsSet_8u(100, ippsptrsrc1, num);
        ippsSet_8u(100, ippsptrsrc2, num);

        // Restrict IPP's internal threading to a single thread.
        ippSetNumThreads( 1 );

        IppStatus tempIppStatus = ippsDiv_8u_Sfs(ippsptrsrc1, ippsptrsrc2, ippsptrdst, num, -1);
        if (tempIppStatus != ippStsNoErr)
        {
            throw std::string(ippGetStatusString(tempIppStatus));
        }

        double dt = 0;
        {
            ScopeTimer timer(dt);
            ippsDiv_8u_Sfs(ippsptrsrc1, ippsptrsrc2, ippsptrdst, num, 1);
        }

        ippsFree(ippsptrsrc1);
        ippsFree(ippsptrsrc2);
        ippsFree(ippsptrdst);

        std::cout << "calctime " << dt << " mikroSec" << std::endl;

        // Query the thread count IPP is actually using.
        int numThr;
        ippGetNumThreads(&numThr);
    } //try
    catch( std::string& e )
    {
        std::cerr << "String exception: " << e << std::endl;
    }
    catch( ... )
    {
        std::cerr << "unhandled exception" << std::endl;
    }

    getch();
    return 0;
}

Running this code on a Core 2 Duo machine gives 5 to 7 microseconds.

IZ
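Since a single measurement of a few microseconds varies from run to run (5 to 7 microseconds here), one option is to repeat the timed call many times and report the minimum and median. The sketch below is an assumption-laden illustration in portable C++: measure, TimingStats and divideBytes are hypothetical names, and divideBytes is a plain loop standing in for ippsDiv_8u_Sfs, not the IPP function itself.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

struct TimingStats {
    double minMicros;     // fastest observed run
    double medianMicros;  // middle sample (upper median for even counts)
};

// Run `work` once as a warm-up, then time it `runs` times.
template <typename Fn>
TimingStats measure(Fn&& work, std::size_t runs) {
    using clock = std::chrono::steady_clock;
    if (runs == 0) {
        runs = 1;  // always take at least one sample
    }
    work();  // warm-up call, not timed

    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return TimingStats{samples.front(), samples[samples.size() / 2]};
}

// Stand-in workload (assumption): divide 1024 bytes element by element.
unsigned divideBytes() {
    static const std::vector<unsigned char> a(1024, 100), b(1024, 100);
    static std::vector<unsigned char> dst(1024);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = static_cast<unsigned char>(a[i] / b[i]);
    }
    return static_cast<unsigned>(dst.front()) + dst.back();
}
```

A call such as measure([] { divideBytes(); }, 1000) returns both figures; the minimum is usually the least disturbed by other activity on the machine, while the median shows what a typical call costs.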
