Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

IPP function freezes system

Joachim_Grimminger
Hi,
I am using IPP to calculate an FFT.
The FFT process runs three times in parallel with priority 16 (REALTIME_PRIORITY_CLASS) and uses the IPP functions.
Part of the calculation is ippsMagnitude_32fc(). I replaced the magnitude calculation with a step-by-step calculation using the ippsSqr, ippsAdd and ippsSqrt functions.

When I call only ippsMagnitude_32fc(), my system freezes (even processes with priority 22).
CPU load rises to 100%.

In the step-by-step case, no process freezes and the FFT calculation needs only 10% (20 ms) of the time it needs with ippsMagnitude_32fc(), but my calculation process needs more time in other parts of the program; it seems the process has to be in a "waiting" mode. CPU load is at most 50%.
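
For readers following along, the two variants being compared can be sketched in plain C++ without IPP. This is only an illustration of the arithmetic, assuming an interleaved real/imaginary layout; `cplx32f`, `magnitude_fused` and `magnitude_stepwise` are hypothetical names, not IPP functions, and the sketch says nothing about how IPP implements or threads these operations.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for an interleaved complex float (not the IPP type).
struct cplx32f { float re; float im; };

// Fused form: one pass computing sqrt(re^2 + im^2) per element.
void magnitude_fused(const cplx32f *src, float *dst, std::size_t len)
{
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < len; ++i)
        dst[i] = std::sqrt(src[i].re * src[i].re + src[i].im * src[i].im);
}

// Step-by-step form: square, add and square root as separate passes,
// using one scratch buffer (tmp) of the same length.
void magnitude_stepwise(const cplx32f *src, float *dst, float *tmp, std::size_t len)
{
    assert(src != nullptr && dst != nullptr && tmp != nullptr);
    for (std::size_t i = 0; i < len; ++i) tmp[i] = src[i].re * src[i].re;  // square real parts
    for (std::size_t i = 0; i < len; ++i) dst[i] = src[i].im * src[i].im;  // square imaginary parts
    for (std::size_t i = 0; i < len; ++i) dst[i] += tmp[i];                // add the squares
    for (std::size_t i = 0; i < len; ++i) dst[i] = std::sqrt(dst[i]);      // square root
}
```

Both functions should produce the same magnitudes up to floating-point rounding; the difference is only in how many passes over memory are made.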

Can anyone explain the reason for this function timing/priority behavior, or how to optimize in this case?

Some data:
System: WinXP
ipp-includes: ipps, ippcore, ippvm

8 Replies
igorastakhov
New Contributor II
Hi,

Please provide additional data (or better, a reproducible example): ia32 or x64, library version (6.1, 7.0.x), static (threaded or not) or dynamic linking, library letter (w7, v8, etc.), FFT order, vector length for Magnitude.

Regards,
Igor
Joachim_Grimminger

Hi,

I tried to make a small example of the FFT magnitude calculation.
The sample program creates a complex sample vector and calculates its magnitude.
If you run this code with priority 16, it blocks other processes with higher priority, but it is not blocked itself; it keeps running with 10-12 ms. So is it possible that the IPP functions use their "own" priority?
The default thread priority should be "normal"!

#include <windows.h>   // Sleep
#include <ipp.h>       // ippMalloc, ippFree, ipps* functions

int main( void )
{
    // 'len' was not defined in the posted snippet; 2048 is assumed here
    // from the maximum vector length given under "additional data".
    const int len = 2048;
    Ipp32f  *pReal;
    Ipp32f  *pMagn;
    Ipp32fc *pCplx;
    int ippstate = 0;

    // Allocate memory for the arrays...
    pReal = (Ipp32f *)ippMalloc(sizeof(Ipp32f) * len);
    pMagn = (Ipp32f *)ippMalloc(sizeof(Ipp32f) * len);
    pCplx = (Ipp32fc *)ippMalloc(sizeof(Ipp32fc) * len);

    // Set a const. value to the real vector...
    ippstate = ippsSet_32f( (Ipp32f)1.234, pReal, len );

    // Create just a simple complex vector ( 1.234, 0.000 )...
    ippstate = ippsRealToCplx_32f( pReal, NULL, pCplx, len );

    while( 1 ){
        // Calculate the magnitude of the complex vector...
        ippstate = ippsMagnitude_32fc( pCplx, pMagn, len );
        if( ippstate != ippStsNoErr )
            break;   // leave the loop on error so the buffers below get freed

        Sleep(10);
    }

    ippFree( pReal );
    ippFree( pMagn );
    ippFree( pCplx );
    return 0;
}



FFT-Settings: ippsFFTInitAlloc_C_32fc(&pFFTSet, order /*12*/, IPP_FFT_NODIV_BY_ANY, ippAlgHintNone);

additional data:
using ia32
lib version: 7.0.205.993 (e.g. ipps)
vector length max. 2048
fft order 12

igorastakhov
New Contributor II
You said nothing about the linking model. I guess you are using dynamic or threaded static linking, and I think all the problems with priority settings are caused by OMP, which is used in the threaded version of the library. My proposal is to link with the non-threaded static library; it will work properly with any priority.

Regards,
Igor
Joachim_Grimminger

So you mean creating a single-threaded program to calculate the FFT, with statically linked IPP libraries?
(For information: I am using Visual Studio 2008; current settings are multithreaded.)

igorastakhov
New Contributor II
Everything I said above concerns the IPP libraries only. I suggest linking statically with the non-threaded static IPP libraries. In the dynamic and threaded static libraries, the ippsMagnitude_32fc function is threaded with OMP; to my understanding, that may be the only reason for the hang when running with a non-default priority.

Regards,
Igor
SergeyKostrov
Valued Contributor II

Hi Joachim,

I hope that it is not too late; here are some results of my investigation.

First of all, it wasn't clear from your initial post how you were changing the process priority:
manually using 'Task Manager', or programmatically using the Win32 API functions
'SetPriorityClass' or 'SetThreadPriority'?

>>...In case of calling only the ippMagnitude_32fc() my system freezes...

In your test case, inside the 'while( 1 )' loop there is a call to the Win32 API function 'Sleep' with a
delay of 10 ms. Did you try commenting it out? If not, please try it. I can tell you that I reproduced a
'Complete-System-Freeze' when 'Sleep( 10 )' is called after the priority of the process was changed
to Real-Time. I had to press the Reset button to restart a computer running Windows XP.


There are no problems with the 'ippsMagnitude_32fc' IPP API function. This is a very old function,
I estimate it is at least ~12 years old, and it is hard to believe it has any problems.


Since you've changed the priority of the process to Real-Time, the call to 'Sleep( 10 )' creates some problems.

Do you think a Real-Time process will give other processes a chance to be executed?

>>...In the step-by-step case - no process freezes...

I would guess that you've done everything correctly.

>>...my calculation-process needs more time at other parts of the programm...

Let's take a look at performance numbers from my tests ( a screenshot is enclosed as well ):

Note: The number of iterations for all tests is 4,194,304.

Real-Time priority - 48.359 secs ( fastest )
High priority - 48.765 secs
Normal priority - 49.250 secs
Idle priority - 49.438 secs ( slowest )

If we calculate the performance differences from these numbers, it is clear that:

Real-Time process is ~2.18% faster than Idle process, and
Real-Time process is ~1.81% faster than Normal process.

Please take these numbers into account.
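
As a quick check of the percentages above, the arithmetic can be sketched as follows. It assumes the figures are computed as time saved relative to the slower run; `percent_faster` is a hypothetical helper name used only for this illustration.

```cpp
#include <cassert>

// Time saved by the faster run, as a percentage of the slower run's time.
double percent_faster(double fast_secs, double slow_secs)
{
    assert(slow_secs > 0.0);
    return (slow_secs - fast_secs) / slow_secs * 100.0;
}
```

With the timings listed above, percent_faster(48.359, 49.438) gives roughly 2.18 (Real-Time vs. Idle) and percent_faster(48.359, 49.250) gives roughly 1.81 (Real-Time vs. Normal).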

>>...is it possible that the ipp-functions uses their "own" priority?

I don't know.

>>...Can anyone explain the reason of the function-timing/priority - or how to
>>optmimize in this case?..

Please take a look at MSDN, because I simply don't want to repeat what Microsoft's developers
have already described. Subjects, keywords, function names, etc., are as follows:

Platform SDK DLLs, Processes, and Threads
Scheduling Priorities
Priority Boosts
Win32 API functions 'Sleep', 'SleepEx','SetThreadPriority' and 'SetPriorityClass'

Best regards,
Sergey



SergeyKostrov
Valued Contributor II

1. Performance of two IPP functions for calculating the Magnitude is as follows:

ippsMagnitude_32fc(...) is ~1.6 times faster than ippsMagnitude_32f(...)

2. Performance statistics for different IPP DLLs:

ippsw7.dll - 48.328 secs
ippsa6.dll - 48.750 secs
ippsm6.dll - 125.796 secs
ippspx.dll - 206.891 secs

3. You could improve system responsiveness by calling:

::SetThreadPriority( ::GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL );

instead of:

::SetPriorityClass( ::GetCurrentProcess(), REALTIME_PRIORITY_CLASS );
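
To see why the two calls differ, it may help to look at the base priority Windows derives from the process priority class combined with the thread priority level. The sketch below encodes the table from the MSDN "Scheduling Priorities" topic as I recall it; the enums and `base_priority` are hypothetical illustration names, not Win32 constants or APIs, so please verify the numbers against MSDN before relying on them.

```cpp
#include <cassert>

// Illustrative enums only; these are NOT the Win32 constants themselves.
enum class PriorityClass { Idle, BelowNormal, Normal, AboveNormal, High, Realtime };
enum class ThreadLevel { Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical };

// Base priority (1..31) for a priority class / thread level combination.
int base_priority(PriorityClass cls, ThreadLevel lvl)
{
    const bool realtime = (cls == PriorityClass::Realtime);
    // The two extreme thread levels saturate to fixed values.
    if (lvl == ThreadLevel::Idle)         return realtime ? 16 : 1;
    if (lvl == ThreadLevel::TimeCritical) return realtime ? 31 : 15;

    int normal = 0;  // base priority of the "normal" thread level in each class
    switch (cls) {
        case PriorityClass::Idle:        normal = 4;  break;
        case PriorityClass::BelowNormal: normal = 6;  break;
        case PriorityClass::Normal:      normal = 8;  break;
        case PriorityClass::AboveNormal: normal = 10; break;
        case PriorityClass::High:        normal = 13; break;
        case PriorityClass::Realtime:    normal = 24; break;
    }
    // Lowest..Highest map to offsets -2..+2 around the class's normal level.
    const int offset = static_cast<int>(lvl) - static_cast<int>(ThreadLevel::Normal);
    return normal + offset;
}
```

Under this table, a time-critical thread in a normal-class process sits at 15, just below the real-time range of 16 to 31, whereas a normal thread in a real-time-class process sits at 24. The priorities 16 and 22 mentioned earlier in the thread both fall inside the real-time range.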

SergeyKostrov
Valued Contributor II
My Test-Case is based on Joachim's initial Test-Case from one of the first posts and is provided AS IS.
So, you'll need to modify it if interested. Good luck!

...

RTint iIppState = 0;
Ipp32f *pfVecRe = RTnull;
Ipp32f *pfVecIm = RTnull;
Ipp32f *pfVecMg = RTnull;
Ipp32fc *pfcVecCx = RTnull;
RTint iVecSize = 2048;

//int iSizeIpp32f = sizeof( Ipp32f ); // 4 bytes
//int iSizeIpp32fc = sizeof( Ipp32fc );// 8 bytes

// Allocate memory for the Vectors
// Note: ippsMalloc_32f / ippsMalloc_32fc take a number of elements, not a size in bytes
pfVecRe = ( Ipp32f * )::ippsMalloc_32f( iVecSize );
pfVecIm = ( Ipp32f * )::ippsMalloc_32f( iVecSize );
pfVecMg = ( Ipp32f * )::ippsMalloc_32f( iVecSize );
pfcVecCx = ( Ipp32fc * )::ippsMalloc_32fc( iVecSize );

// Initialize Vector of Real Values
iIppState = ::ippsSet_32f( ( Ipp32f )2.0f, pfVecRe, iVecSize );
// Initialize Vector of Imaginary Values
iIppState = ::ippsSet_32f( ( Ipp32f )3.0f, pfVecIm, iVecSize );

// Create a Complex Vector
iIppState = ::ippsRealToCplx_32f( pfVecRe, pfVecIm, pfcVecCx, iVecSize );
//iIppState = ::ippsRealToCplx_32f( pfVecRe, RTnull, pfcVecCx, iVecSize );
//iIppState = ::ippsRealToCplx_32f( RTnull, pfVecIm, pfcVecCx, iVecSize );

RTint iNumOfIterations = 0;

CrtPrintf( RTU("Process & Thread Priority:\n") );

//CrtPrintf( RTU("IDLE\n") );
//::SetPriorityClass( ::GetCurrentProcess(), IDLE_PRIORITY_CLASS );
//CrtPrintf( RTU("NORMAL\n") );
//::SetPriorityClass( ::GetCurrentProcess(), NORMAL_PRIORITY_CLASS );
//CrtPrintf( RTU("HIGH\n") );
//::SetPriorityClass( ::GetCurrentProcess(), HIGH_PRIORITY_CLASS );
CrtPrintf( RTU("REAL TIME\n") );
::SetPriorityClass( ::GetCurrentProcess(), REALTIME_PRIORITY_CLASS );

//CrtPrintf( RTU("TIME CRITICAL\n") );
//::SetThreadPriority( ::GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL );

CrtPrintf( RTU("Processing Started\n") );

g_uiTicksStart = SysGetTickCount();
while( RTtrue )
{
    // Calculate the Magnitude of the Complex Vector ( Mag = sqrt( src.re^2 + src.im^2 ) )
    //iIppState = ::ippsMagnitude_32fc( pfcVecCx, pfVecMg, iVecSize );
    iIppState = ::ippsMagnitude_32f( pfVecRe, pfVecIm, pfVecMg, iVecSize );
    if( iIppState != 0 )
        break;

    //::Sleep( 0 );
    //::Sleep( 1 );
    //::Sleep( 10 );

    //if( iNumOfIterations++ == NUMBER_OF_TESTS_0001048576 )
    if( iNumOfIterations++ == NUMBER_OF_TESTS_0004194304 )
    //if( iNumOfIterations++ == NUMBER_OF_TESTS_0016777216 )
        break;

    //CrtPrintf( RTU("Iterations Done: %ld\r"), ( RTint )iNumOfIterations );
}
CrtPrintf( RTU("\nSuccessfully Completed Processing in %ld ticks\n"), ( RTint )( SysGetTickCount() - g_uiTicksStart ) );

::SetPriorityClass( ::GetCurrentProcess(), NORMAL_PRIORITY_CLASS );

if( pfVecRe != RTnull )
    ::ippsFree( pfVecRe );
if( pfVecIm != RTnull )
    ::ippsFree( pfVecIm );
if( pfVecMg != RTnull )
    ::ippsFree( pfVecMg );
if( pfcVecCx != RTnull )
    ::ippsFree( pfcVecCx );

pfVecRe = RTnull;
pfVecIm = RTnull;
pfVecMg = RTnull;
pfcVecCx = RTnull;
...
