Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

IPP function freezes system

I am using IPP to calculate an FFT.
The FFT process runs three times in parallel with priority 16 (REALTIME_PRIORITY_CLASS) and uses the IPP functions.
Part of the calculation is ippsMagnitude_32fc(). I replaced the magnitude calculation with a step-by-step calculation using the ippsSqr, ippsAdd and ippsSqrt functions.
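
To make the two variants concrete, here is a minimal portable C++ sketch of what each of them computes. It deliberately does not call IPP: `magnitudeDirect` only stands in for the result that ippsMagnitude_32fc returns, and `magnitudeStepwise` mirrors the square / add / square-root sequence as separate passes. The struct `Cplx32` and both function names are assumptions made for this illustration, not IPP names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Ipp32fc: one single-precision complex sample.
struct Cplx32 { float re; float im; };

// One-call variant: out[i] = sqrt(re^2 + im^2), computed in a single pass.
std::vector<float> magnitudeDirect(const std::vector<Cplx32>& in)
{
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::sqrt(in[i].re * in[i].re + in[i].im * in[i].im);
    return out;
}

// Step-by-step variant: square both parts, add them, then take the root,
// each as its own pass over the data (as with separate vector calls).
std::vector<float> magnitudeStepwise(const std::vector<Cplx32>& in)
{
    const std::size_t n = in.size();
    std::vector<float> reSq(n), imSq(n), sum(n), out(n);
    for (std::size_t i = 0; i < n; ++i) reSq[i] = in[i].re * in[i].re; // square
    for (std::size_t i = 0; i < n; ++i) imSq[i] = in[i].im * in[i].im; // square
    for (std::size_t i = 0; i < n; ++i) sum[i]  = reSq[i] + imSq[i];   // add
    for (std::size_t i = 0; i < n; ++i) out[i]  = std::sqrt(sum[i]);   // sqrt
    assert(out.size() == in.size());
    return out;
}
```

Both functions return the same values up to rounding; the difference is only in how many passes over memory are made, so this sketch says nothing about how the IPP library itself is threaded or scheduled.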

When I call only ippsMagnitude_32fc(), my system freezes (even processes with priority 22).
System load rises to 100%.

In the step-by-step case no process freezes, and the FFT calculation needs only 10% (20 ms) of the time it needs with ippsMagnitude_32fc(). However, my calculation process then needs more time in other parts of the program; it seems the process has to sit in a "waiting" mode. System load is at most 50%.

Can anyone explain the reason for this timing/priority behaviour of the function, or how to optimize in this case?

Some data:
System: WinXP
ipp-includes: ipps, ippcore, ippvm

Please provide additional data (or, better, a reproducible example): ia32 or x64, library version (6.1, 7.0.x), static (threaded or not) or dynamic linking, library letter (w7, v8, etc.), FFT order, vector length for Magnitude.

I tried to make a small example version of the FFT's magnitude calculation.
The sample program creates a complex sample vector and calculates its magnitude.
If you run this code with priority 16, it blocks other processes with higher priority, but it is not blocked itself; it keeps running at 10-12 ms. So is it possible that the IPP functions use their "own" priority?
The default thread priority should be "normal"!

#include <windows.h>   // Sleep
#include <ipps.h>      // ippsSet_32f, ippsRealToCplx_32f, ippsMagnitude_32fc
#include <ippcore.h>   // ippMalloc, ippFree

int main( int argc, char *argv[] )
{
    const int len = 2048;   // vector length (the maximum given under "additional data")

    Ipp32f *pReal;
    Ipp32f *pMagn;
    Ipp32fc *pCplx;
    int ippstate = 0;

    // Allocate memory for the arrays...
    pReal = (Ipp32f *)ippMalloc(sizeof(Ipp32f) * len);
    pMagn = (Ipp32f *)ippMalloc(sizeof(Ipp32f) * len);
    pCplx = (Ipp32fc *)ippMalloc(sizeof(Ipp32fc) * len);

    // Set a const. value to the real vector...
    ippstate = ippsSet_32f( (Ipp32f)1.234, pReal, len );

    // Create just a simple complex vector ( 1.234, 0.000 )...
    ippstate = ippsRealToCplx_32f( pReal, NULL, pCplx, len );

    while( 1 ){
        // Calculate the magnitude of the complex vector...
        ippstate = ippsMagnitude_32fc( pCplx, pMagn, len );

        Sleep( 10 );   // 10 ms delay per iteration
    }

    // Not reached while the loop above runs forever.
    ippFree( pReal );
    ippFree( pMagn );
    ippFree( pCplx );

    return 0;
}


FFT-Settings: ippsFFTInitAlloc_C_32fc(&pFFTSet, order /*12*/, IPP_FFT_NODIV_BY_ANY, ippAlgHintNone);

additional data:
using ia32
lib version: (e.g. ipps)
vector length max. 2048
fft order 12

You said nothing about the linking model. I guess you are using dynamic or threaded static linking, and I think that all the problems with priority settings come from OpenMP (OMP), which is used in the threaded version of the library. My proposal is to link with the non-threaded static library; it will work properly with any priority.

So you mean creating a single-threaded program to calculate the FFT, with statically linked IPP libraries?
(For information: I am using Visual Studio 2008; the current settings are multithreaded.)

All my words above are about the IPP libraries only. I'm suggesting that you link statically against the non-threaded static IPP libraries. The ippsMagnitude_32fc function in the dynamic and threaded-static libraries is threaded with OMP; to my understanding this may be the only reason for the hang when running with a non-default priority.

Hi Joachim,

I hope that it is not too late; here are some results of my investigation:

First of all, it wasn't clear from your initial post how you were changing the process priority:
manually using the 'Task Manager', or programmatically using the Win32 API functions
'SetPriorityClass' or 'SetThreadPriority'?

>>...In case of calling only the ippMagnitude_32fc() my system freezes...

In your test case, inside the 'while( 1 )' loop, there is a call to the Win32 API function 'Sleep' with a
delay of 10 ms. Did you try commenting it out? If not, please try it. I can tell you that I reproduced a
'Complete-System-Freeze' when 'Sleep( 10 )' is called after the priority of the process was changed
to Real-Time. I had to press the Reset button to restart a computer running Windows XP.

There are no problems with the 'ippsMagnitude_32fc' IPP API function. This is a very old function,
I estimate it is at least ~12 years old, and it is hard to believe it has any problems.

Since you've changed the priority of the process to Real-Time, the call to 'Sleep( 10 )' creates some problems.

Do you think the Real-Time process will give other processes a chance to be executed?
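
As a rough illustration of that question, the following toy model (an assumption-laden sketch, not the Windows scheduler) simulates a single CPU with strict priorities: in every tick the ready thread with the highest priority runs. It ignores priority boosts, multiple cores and time slicing. `ToyThread` and `simulate` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: one CPU, strict priorities, no boosts.
struct ToyThread {
    int priority;      // larger value = more urgent
    int sleepEvery;    // blocks for one tick after this many run ticks; 0 = never blocks
    int ranSinceSleep; // run ticks since the last block
    bool sleeping;     // true while the thread is blocked for the current tick
    long ticksRun;     // total ticks in which this thread was scheduled
};

// Simulates `ticks` scheduling decisions: in each tick the ready thread with
// the highest priority runs; all other threads wait.
void simulate(std::vector<ToyThread>& threads, long ticks)
{
    assert(ticks >= 0);
    for (long t = 0; t < ticks; ++t) {
        ToyThread* next = nullptr;
        for (ToyThread& th : threads) {
            if (th.sleeping) { th.sleeping = false; continue; } // ready again next tick
            if (next == nullptr || th.priority > next->priority) next = &th;
        }
        if (next == nullptr) continue; // every thread was blocked: CPU idles
        ++next->ticksRun;
        if (next->sleepEvery > 0 && ++next->ranSinceSleep == next->sleepEvery) {
            next->ranSinceSleep = 0;
            next->sleeping = true; // gives up the CPU for the following tick
        }
    }
}
```

In this model a priority-24 thread that never blocks leaves a priority-8 thread with zero ticks, while the same thread blocking once every three ticks lets the lower-priority thread run in exactly those gaps. Real Windows behaviour on a given machine may differ, so treat this only as a way to reason about the question.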

>>...In the step-by-step case - no process freezes...

I could guess that you've done everything correctly.

>> calculation-process needs more time at other parts of the programm...

Let's take a look at the performance numbers from my tests ( a screenshot is enclosed as well ):

Note: Number of iterations for all tests is 4,194,304;

Real-Time priority - 48.359 secs ( fastest )
High priority - 48.765 secs
Normal priority - 49.250 secs
Idle priority - 49.438 secs ( slowest )

If we calculate performance increase numbers it is clear that:

Real-Time process is ~2.18% faster than Idle process, and
Real-Time process is ~1.81% faster than Normal process.
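
For anyone who wants to re-check these figures: they match the formula (slow - fast) / slow * 100, i.e. the gain is expressed relative to the slower run. A minimal sketch, with `percentFaster` being a hypothetical helper name:

```cpp
#include <cassert>

// Percentage by which the faster run beats the slower one, measured
// relative to the slower run: (slow - fast) / slow * 100.
double percentFaster(double fastSecs, double slowSecs)
{
    assert(fastSecs > 0.0 && slowSecs >= fastSecs);
    return (slowSecs - fastSecs) / slowSecs * 100.0;
}
```

Calling `percentFaster( 48.359, 49.438 )` gives about 2.18 and `percentFaster( 48.359, 49.250 )` gives about 1.81, in line with the two numbers quoted above.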

Simply take into account these numbers.

>> it possible that the ipp-functions uses their "own" priority?

I don't know.

>>...Can anyone explain the reason of the function-timing/priority - or how to
>>optmimize in this case?..

Please take a look at MSDN because I simply don't want to repeat what Microsoft's developers
already described. Subjects, key words, function names, etc, are as follows:

Platform SDK DLLs, Processes, and Threads
Scheduling Priorities
Priority Boosts
Win32 API functions 'Sleep', 'SleepEx','SetThreadPriority' and 'SetPriorityClass'

Best regards,

1. Performance of the two IPP functions for calculating the magnitude is as follows:

ippsMagnitude_32fc(...) is ~1.6 times faster than ippsMagnitude_32f(...)

2. Performance statistics for the different IPP DLLs:

ippsw7.dll - 48.328 secs
ippsa6.dll - 48.750 secs
ippsm6.dll - 125.796 secs
ippspx.dll - 206.891 secs

3. You could improve system responsiveness by calling:

::SetThreadPriority( ::GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL );

instead of:

::SetPriorityClass( ::GetCurrentProcess(), REALTIME_PRIORITY_CLASS );
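
One possible reading of why the thread-level call is gentler: according to the priority table in the MSDN "Scheduling Priorities" topic, a time-critical thread in a normal-class process has base priority 15, just below the real-time range of 16 to 31, whereas a normal thread in a real-time-class process sits at 24. The sketch below reproduces that table as a portable lookup. The enum and function names are hypothetical and only mirror the Win32 constants; they are not the Win32 API.

```cpp
#include <cassert>

// Hypothetical names for this sketch; they mirror, but are not, the Win32
// constants (IDLE_PRIORITY_CLASS, THREAD_PRIORITY_TIME_CRITICAL, ...).
enum class PriorityClass { Idle, BelowNormal, Normal, AboveNormal, High, Realtime };
enum class ThreadLevel { Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical };

// Base priority (1..31) for a class/level pair, following the table in the
// MSDN "Scheduling Priorities" topic.
int basePriority(PriorityClass cls, ThreadLevel level)
{
    const bool realtime = (cls == PriorityClass::Realtime);
    if (level == ThreadLevel::Idle)         return realtime ? 16 : 1;
    if (level == ThreadLevel::TimeCritical) return realtime ? 31 : 15;

    int classBase = 0; // priority of a Normal-level thread in this class
    switch (cls) {
        case PriorityClass::Idle:        classBase = 4;  break;
        case PriorityClass::BelowNormal: classBase = 6;  break;
        case PriorityClass::Normal:      classBase = 8;  break;
        case PriorityClass::AboveNormal: classBase = 10; break;
        case PriorityClass::High:        classBase = 13; break;
        case PriorityClass::Realtime:    classBase = 24; break;
    }
    assert(classBase != 0);
    // Lowest..Highest map to offsets -2..+2 around the class base.
    const int offset = static_cast<int>(level) - static_cast<int>(ThreadLevel::Normal);
    return classBase + offset;
}
```

The same table also accounts for the values mentioned at the start of the thread: 16 and 22 are the idle and lowest thread levels inside the real-time class.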

My test case is based on Joachim's initial test case from one of the first posts and is provided AS IS.
So you'll need to modify it if you are interested. Good luck!


RTint iIppState = 0;
Ipp32f *pfVecRe = RTnull;
Ipp32f *pfVecIm = RTnull;
Ipp32f *pfVecMg = RTnull;
Ipp32fc *pfcVecCx = RTnull;
RTint iVecSize = 2048;

//int iSizeIpp32f = sizeof( Ipp32f ); // 4 bytes
//int iSizeIpp32fc = sizeof( Ipp32fc );// 8 bytes

// Allocate memory for the Vectors ( ippsMalloc_* takes a number of elements, not bytes )
pfVecRe = ::ippsMalloc_32f( iVecSize );
pfVecIm = ::ippsMalloc_32f( iVecSize );
pfVecMg = ::ippsMalloc_32f( iVecSize );
pfcVecCx = ::ippsMalloc_32fc( iVecSize );

// Initialize Vector of Real Values
iIppState = ::ippsSet_32f( ( Ipp32f )2.0f, pfVecRe, iVecSize );
// Initialize Vector of Imaginary Values
iIppState = ::ippsSet_32f( ( Ipp32f )3.0f, pfVecIm, iVecSize );

// Create a Complex Vector
iIppState = ::ippsRealToCplx_32f( pfVecRe, pfVecIm, pfcVecCx, iVecSize );
//iIppState = ::ippsRealToCplx_32f( pfVecRe, RTnull, pfcVecCx, iVecSize );
//iIppState = ::ippsRealToCplx_32f( RTnull, pfVecIm, pfcVecCx, iVecSize );

RTint iNumOfIterations = 0;

CrtPrintf( RTU("Process & Thread Priority:\n") );

//CrtPrintf( RTU("IDLE\n") );
//::SetPriorityClass( ::GetCurrentProcess(), IDLE_PRIORITY_CLASS );
//CrtPrintf( RTU("NORMAL\n") );
//::SetPriorityClass( ::GetCurrentProcess(), NORMAL_PRIORITY_CLASS );
//CrtPrintf( RTU("HIGH\n") );
//::SetPriorityClass( ::GetCurrentProcess(), HIGH_PRIORITY_CLASS );
CrtPrintf( RTU("REAL TIME\n") );
::SetPriorityClass( ::GetCurrentProcess(), REALTIME_PRIORITY_CLASS );

//CrtPrintf( RTU("TIME CRITICAL\n") );
//::SetThreadPriority( ::GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL );

CrtPrintf( RTU("Processing Started\n") );

g_uiTicksStart = SysGetTickCount();
while( RTtrue )
{
	// Calculate the Magnitude of the Complex Vector ( Mag = sqrt( Re^2 + Im^2 ) )
	//iIppState = ::ippsMagnitude_32fc( pfcVecCx, pfVecMg, iVecSize );
	iIppState = ::ippsMagnitude_32f( pfVecRe, pfVecIm, pfVecMg, iVecSize );
	if( iIppState != 0 )
		break;	// stop on any non-zero IPP status

	//::Sleep( 0 );
	//::Sleep( 1 );
	//::Sleep( 10 );

	//if( iNumOfIterations++ == NUMBER_OF_TESTS_0001048576 )
	if( iNumOfIterations++ == NUMBER_OF_TESTS_0004194304 )
	//if( iNumOfIterations++ == NUMBER_OF_TESTS_0016777216 )
		break;	// requested number of iterations reached

	//CrtPrintf( RTU("Iterations Done: %ld\r"), ( RTint )iNumOfIterations );
}
CrtPrintf( RTU("\nSuccessfully Completed Processing in %ld ticks\n"), ( RTint )( SysGetTickCount() - g_uiTicksStart ) );

::SetPriorityClass( ::GetCurrentProcess(), NORMAL_PRIORITY_CLASS );

if( pfVecRe != RTnull )
::ippsFree( pfVecRe );
if( pfVecIm != RTnull )
::ippsFree( pfVecIm );
if( pfVecMg != RTnull )
::ippsFree( pfVecMg );
if( pfcVecCx != RTnull )
::ippsFree( pfcVecCx );

pfVecRe = RTnull;
pfVecIm = RTnull;
pfVecMg = RTnull;
pfcVecCx = RTnull;
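
Since the test case above depends on project-specific macros and helpers (RTint, RTnull, CrtPrintf, SysGetTickCount and so on), here is a minimal portable sketch of the same benchmark structure for readers without that framework. It is not the code that produced the numbers in this thread: `magnitude` is a plain C++ stand-in for ippsMagnitude_32f, and `runBenchmark` is a hypothetical helper that times the loop with std::chrono instead of tick counts.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Portable stand-in for ippsMagnitude_32f: mg[i] = sqrt(re[i]^2 + im[i]^2).
void magnitude(const std::vector<float>& re, const std::vector<float>& im,
               std::vector<float>& mg)
{
    assert(re.size() == im.size() && re.size() == mg.size());
    for (std::size_t i = 0; i < re.size(); ++i)
        mg[i] = std::sqrt(re[i] * re[i] + im[i] * im[i]);
}

// Runs the kernel `iterations` times on vectors of `vecSize` elements filled
// with 2.0 (real) and 3.0 (imaginary) and returns the elapsed milliseconds.
// The last result vector is handed back through `mgOut`.
long long runBenchmark(std::size_t vecSize, long iterations, std::vector<float>& mgOut)
{
    std::vector<float> re(vecSize, 2.0f), im(vecSize, 3.0f);
    mgOut.assign(vecSize, 0.0f);

    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i)
        magnitude(re, im, mgOut);
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

A call such as `runBenchmark( 2048, 4194304, mg )` would mirror the vector size and iteration count used above. To compare priorities, the process or thread priority would have to be set with the Win32 calls before calling it, and an optimizing compiler may shorten the loop, so absolute timings from this sketch are not comparable to the IPP results.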

