Trouble Getting Started with FFT

Richard_Binet · ‎11-23-2010

Hi everybody.

I have just started trying to use IPP to replace a C-coded FFT library that I inherited many years ago. In the past I used Intel SPL with success, but I am having trouble with IPP.

The FFT calling function that I have used in the past is as follows.

[cpp]int	i; 
if( dom != FREQ ){
#ifndef INTEL_SPL
	for( i=0; i < npts; i++ ){
		pt[i].re = pt[i].re/npts;
		pt[i].im = pt[i].im/npts;
	} // for
	fft( );
#else
	int order = round(log((double)(npts))/log((double)2));
	 nspcFft((struct _SCplx * )pt,order,NSP_Inv);
#endif	
	dom = FREQ;
} // if
[/cpp]

The following is the code I have tried with IPP, but the FFT call does not function correctly.

[cpp]void dsp::fft_freq(  )
{
	// temp do fft time.
	int i;
	if( dom != FREQ )
	{
		if (using_ipp) 
		{
			// Do the Intel Performance Library Version
			int order =(log((double)(npts))/log((double)2));
			the_log->debug("Npts is %i, computed order is %i, check 2^%i = %i",npts,order,order,int(pow((double)2,(double)order)));

			IppStatus is;
			IppsFFTSpec_C_32fc * pSpec;    

			is = ippsFFTInitAlloc_C_32fc(&pSpec, order , IPP_FFT_DIV_INV_BY_N , ippAlgHintAccurate);

			is = ippsFFTFwd_CToC_32fc_I( (Ipp32fc*)(pt), pSpec, 0 );
//			is = ippsFFTInv_CToC_32fc_I( (Ipp32fc*)(pt), pSpec, 0 ); test, make code do nothing Doesnt work!

			// Cleanup
			ippsFFTFree_C_32fc(pSpec); 
		}
		else
		{
			for( i=0; i < npts; i++ ){
				pt[i].re = pt[i].re/npts;
				pt[i].im = pt[i].im/npts;
			} // for
			fft( );
		}
	} /// if dom != FREQ
	dom = FREQ;
}
[/cpp]

Could anyone provide some help? I'm wondering about the pt array in my class and how it maps to an equivalent array that IPP can cope with.

My DSP library is based on an old c_comp class that is used to represent complex floats:

[cpp]class c_comp {				// Complex type

public:
	float re,im;
	c_comp( float _re, float _im = 0.0 );
	c_comp();
	
	// etc other functions

}; // c_comp
[/cpp]

In the DSP class the waveform is stored as a dynamically allocated array of c_comp pointed to by the variable pt.

The array of floats as stored in memory is thus:

Re[0],Im[0],Re[1],Im[1],...Re[npts-1],Im[npts-1].

Is this compatible with IPP, and what is the equivalent type? Is it 32fc, as I have used above?
Do I have to be careful about how this array is allocated with respect to byte alignment?

Any help would be greatly appreciated.
Richard.

igorastakhov · ‎11-24-2010

Hi Richard,

It is not clear from your post what is wrong - could you be more specific?
Just comparing the NSP and IPP code, I see that you don't use "round" on log/log when determining "order" for the IPP FFT, so the order passed to the IPP FFT can be smaller than expected.
Also, you used the INV transform for NSP, while the Fwd call is the one left uncommented for IPP.
Ipp32fc is declared in ippdefs.h:

typedef struct {
    Ipp32f re;
    Ipp32f im;
} Ipp32fc;
so it's just the same structure as you expect.
You need to care about src/dst byte alignment from the performance point of view only. For the best performance you should align your arrays on 16-byte boundaries, but alignment doesn't affect function correctness: you'll get a correct result with any alignment.
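
A minimal sketch of how the layout and alignment points above could be checked in plain C++, without linking IPP. `Complex32` is a hypothetical local stand-in for the `Ipp32fc` typedef quoted above, the `c_comp` here is a reduced copy of the class from the first post, and `is_aligned16` is an illustrative helper name, not an IPP function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for Ipp32fc as quoted above: two 32-bit floats, re then im.
struct Complex32 { float re; float im; };

// Reduced copy of the c_comp class from the first post (assumed: no other data members).
class c_comp {
public:
    float re, im;
    c_comp(float _re, float _im = 0.0f) : re(_re), im(_im) {}
    c_comp() : re(0.0f), im(0.0f) {}
};

// Compile-time checks that both types place re and im at the same offsets
// and have the same size, so an array of one has the interleaved layout of the other.
static_assert(std::is_standard_layout<c_comp>::value, "c_comp must be standard layout");
static_assert(sizeof(c_comp) == sizeof(Complex32), "size mismatch");
static_assert(offsetof(c_comp, re) == offsetof(Complex32, re), "re offset mismatch");
static_assert(offsetof(c_comp, im) == offsetof(Complex32, im), "im offset mismatch");

// True if p sits on a 16-byte boundary. Per the reply above this affects speed only.
inline bool is_aligned16(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

If extra data members or virtual functions were ever added to c_comp, the static_assert lines would fail to compile, which is the point of keeping them next to the cast.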

regards,
Igor

Richard_Binet · ‎11-24-2010

Hi Igor,

Thanks for your reply.

You are quite correct that my post was not clear. I am going slightly crazy trying to determine what is wrong. I was mainly trying to rule out the cast of my array to an Ipp32fc pointer as the cause. Thanks for clearing that up.

Your comment regarding the order calculation is also valid; I have been trying to rule out that I have determined the order incorrectly.

Running the code at present shows npts = 65536, order is 16, and the check 2^16 = 65536. Is this correct? I cannot seem to find a definition of what order is in the IPP documentation.
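
For these FFT calls the order is the base-2 logarithm of the transform length, so npts = 65536 does correspond to order 16. A minimal sketch of computing it with integers only, which avoids the truncation risk of casting log(npts)/log(2) to int; `fft_order` is a hypothetical helper name, not part of IPP.

```cpp
#include <cassert>

// Returns log2(npts) when npts is a positive power of two, or -1 otherwise.
// Integer-only, so it cannot come out one too small the way a truncated
// floating-point log()/log(2) can.
int fft_order(int npts)
{
    if (npts <= 0 || (npts & (npts - 1)) != 0)
        return -1;                  // not a power of two
    int order = 0;
    while ((1 << order) < npts)     // smallest order with 2^order == npts
        ++order;
    return order;
}
```

The -1 return gives the caller a chance to reject lengths that are not a power of two before any FFT spec is allocated.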

My real question is that the FFT function appears not to be functioning correctly. As a test, I called a forward FFT followed immediately by an inverse FFT, which should not affect the waveform at all. When this code runs, the waveform is badly affected.

is = ippsFFTFwd_CToC_32fc_I( (Ipp32fc*)(pt), pSpec, 0 );

is = ippsFFTInv_CToC_32fc_I( (Ipp32fc*)(pt), pSpec, 0 );
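
A minimal sketch of this round-trip check in plain C++, so the pass/fail criterion is a number rather than a visual impression of the waveform. It uses a naive O(N^2) DFT in place of IPP (an assumption made so the example is self-contained), with the inverse divided by N, the same convention that IPP_FFT_DIV_INV_BY_N selects. `dft` and `max_abs_error` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Naive O(N^2) DFT. With inverse=true the kernel is conjugated and the
// result is divided by N, so dft(dft(x, false), true) should reproduce x.
std::vector<cfloat> dft(const std::vector<cfloat>& in, bool inverse)
{
    const std::size_t n = in.size();
    const double pi = 3.14159265358979323846;
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cfloat> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> acc(0.0, 0.0);     // accumulate in double
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi * double((j * k) % n) / double(n);
            acc += std::complex<double>(in[j]) *
                   std::complex<double>(std::cos(angle), std::sin(angle));
        }
        if (inverse)
            acc /= double(n);
        out[k] = cfloat(float(acc.real()), float(acc.imag()));
    }
    return out;
}

// Largest absolute difference between two equally sized complex arrays.
float max_abs_error(const std::vector<cfloat>& a, const std::vector<cfloat>& b)
{
    assert(a.size() == b.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::abs(a[i] - b[i]));
    return worst;
}
```

The same `max_abs_error` measurement could be applied to a copy of pt taken before the forward call and to pt after the inverse call. With single-precision data a small nonzero error is expected, so the comparison needs a tolerance rather than exact equality.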

I am wondering what I am doing wrong. I think I have correctly initialised everything, including the library, which I initialise with the following call:
ippStaticInit();

Could anyone confirm that I have adequately initialised the FFT functions with the call to ippsFFTInitAlloc_C_32fc?

Thanks,
Richard

Richard_Binet · ‎11-25-2010

Hi

Here is something interesting that I have discovered.

If I do not call ippStaticInit, so that the Intel libraries use the C-optimised version as the help suggests, the code executes correctly.

Any ideas anyone?

(I am trying to compile a win32 application on Win7 running X64 OS.)

Thanks,
Richard.

Vladimir_Dudnik · ‎11-25-2010

Hi Richard,

What you are saying is that the IPP FFT functions do not work correctly. We do not see that in our testing. Could you please provide a test case to reproduce your situation?

Regards,
Vladimir

Richard_Binet · ‎11-25-2010

Hi Vladimir,

I have found that I can get correct operation from the library if I use the following in the IPP initialisation routine.

[cpp]ippStaticInit();
ippSetNumThreads(1);
[/cpp]

Given that I can now use the library correctly, I think this problem is related to some multithreading issue.

Unfortunately my DSP library is used across multiple programs and requires the use of multithreading. I was under the impression that the test program I was using did not have multiple threads, although for compatibility with other programs it is linked against the Visual C multithreaded libraries. It is likely that I was wrong, and I will need to spend a couple of days trying to find out whether some other threads are being started in the application.

I will report back what I find. Thanks for your help Vladimir and Igor.

Vladimir_Dudnik · ‎11-25-2010

By the way, which IPP version do you use, and what is your target OS/CPU? We will try to reproduce your issue on our side. What are the actual parameters you pass to the FFT functions (for both initialization and compute)?

Vladimir

Richard_Binet · ‎11-29-2010

Vladimir,

I am unsure what information you are interested in specifically; however, I have re-written my code for the inverse FFT, which will hopefully make it clearer. I have looked through my code, and I am not running any other multithreaded elements of code at present, so I am pretty sure it should work fine.

I am linking against the static libraries, so I thought I did not need to specify a target CPU. Is this correct? Can you elaborate further?

On my development machine I am running a Windows 7 Pro X64 build.
The compiler I am using is Visual Studio 2010 Pro.
The target build is X86. (However, I am running on an X64 machine. Eventually I plan to move to X64, but I need to get a few things in place first.)
The CPU is reported in Windows as Intel Core2 Extreme CPU Q9300 @ 2.53GHz, 4 cores, 4 logical processors.

Thanks for any help you can provide.

Richard.

[cpp]	// Temporary Code
	ippStaticInit();
	int threads = 0;
	if (threads >0)
	{
		ippSetNumThreads (threads);
		the_log->text("Ipp Library Has Been Initialised with %i threads",threads );
	} 
	else
		the_log->text("Ipp Library Has Been Initialised without setting number of threads" );

	const IppLibraryVersion* lib = ippsGetLibVersion();
	the_log->text(" Version %s %s %d.%d.%d.%d",
			lib->Name, lib->Version, lib->major,
			lib->minor, lib->majorBuild, lib->build);
	int order =(log((double)(npts))/log((double)2));
	the_log->debug("Npts is %i, computed order is %i, check 2^%i = %i",npts,order,order,int(pow((double)2,(double)order)));

	IppStatus is;
	IppsFFTSpec_C_32fc * pSpec;    
	is = ippsFFTInitAlloc_C_32fc(&pSpec, order , IPP_FFT_DIV_INV_BY_N , ippAlgHintAccurate);
	is = ippsFFTFwd_CToC_32fc_I( (Ipp32fc*)(pt), pSpec, 0 );
	ippsFFTFree_C_32fc(pSpec); 
	dom = TIME;
	return;
[/cpp]

I have attached the debug output from three runs of this code, recompiled with threads initialised to 0, 1 and 2 respectively. The only value of threads that works correctly is 1.

Output of the program with threads initialised to 0, 1 and 2:

Without setting number of threads. Code fails to execute correctly.
-------------------------------------------------------------------
Ipp Library Has Been Initialised without setting number of threads
Version ippsp8_t.lib 7.0 build 205.23 7.0.205.1024
Npts is 65536, computed order is 16, check 2^16 = 65536

Setting number of threads to 1. Code executes correctly.
-------------------------------------------------------------------
Ipp Library Has Been Initialised with 1 threads
Version ippsp8_t.lib 7.0 build 205.23 7.0.205.1024
Npts is 65536, computed order is 16, check 2^16 = 65536

Setting number of threads to 2. Code fails to execute correctly.
-------------------------------------------------------------------
Ipp Library Has Been Initialised with 2 threads
Version ippsp8_t.lib 7.0 build 205.23 7.0.205.1024
Npts is 65536, computed order is 16, check 2^16 = 65536

Ying_H_Intel · ‎12-02-2010

Hi Richard,

Interesting to see the result. As I understand it, when you apply the forward FFT and then the inverse, you can't get back the exact original pt values, right?

What is the exact error between the original pt and the values from the inverse?
Or can you attach the code showing how you evaluate whether it executes correctly, so that we can reproduce the problem?
(For example, there is a small, similar FFT test case at http://software.intel.com/en-us/forums/showthread.php?t=68506, where the problem is the real/complex array storage format.)

Regards,
Ying