How can I increase the FFT efficiency of MKL on MIC?

王_子_ · ‎07-12-2016

I want to use the fftw3 library on MIC, but I cannot find an appropriate way to accelerate the FFT.

I use the following code, but it runs slower than the CPU version.

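/* Offload a single-precision 2D backward FFT to coprocessor 1. */
#pragma offload target(mic:1) in(in:length(nx*ny)) \
			      out(out:length(nx*ny))
{
	/* The plan is created on the coprocessor on every offload. */
	fftwf_plan temp = fftwf_plan_dft_2d(nx, ny, in, out, FFTW_BACKWARD, FFTW_ESTIMATE);
	fftwf_execute(temp);
	fftwf_destroy_plan(temp);	/* release the plan after use */
}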

So I want to ask two questions:

1. The FFT implementation does not seem to use OpenMP. If I want it to, what should I do? Simply adding "#pragma omp parallel for" does not compile.

2. Can FFTW make full use of the MIC's computational resources, or do I need to try a different FFT approach?

Thanks!

Evgueni_P_Intel · ‎07-12-2016

Hi 王子,

You may want to consider using Intel MKL, which provides an implementation of the FFTW interfaces optimized for the Intel MIC architecture -- https://software.intel.com/en-us/intel-mkl

Evgueni.

McCalpinJohn · ‎07-12-2016

As I explained in response to your prior posting (https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/658660), threads are not going to help here -- the FFT performance will be limited by the data transfer time to/from the Xeon Phi.
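To make the transfer-bound argument concrete, here is a rough back-of-envelope sketch in C. It is only a model, not a measurement: the function names are made up for illustration, the 5*N*log2(N) flop count is the usual rule-of-thumb estimate for a complex FFT, and the PCIe bandwidth and FFT throughput you pass in are assumptions you would have to replace with numbers measured on your own system.

```c
/* Rough cost model for offloading one single-precision complex 2D FFT.
   All rates passed in are assumptions; nothing here is measured. */

/* Smallest k such that 2^k >= n. Exact for power-of-two sizes and rounded
   up otherwise, which slightly overestimates the compute cost. */
int ceil_log2(unsigned long long n) {
    int k = 0;
    unsigned long long p = 1;
    while (p < n) {
        p <<= 1;
        k++;
    }
    return k;
}

/* Rule-of-thumb flop count for a complex FFT over N = nx*ny points. */
double fft2d_flops(int nx, int ny) {
    unsigned long long n = (unsigned long long)nx * (unsigned long long)ny;
    return 5.0 * (double)n * (double)ceil_log2(n);
}

/* Bytes crossing PCIe: the input array in and the output array out,
   at 8 bytes per single-precision complex element. */
double offload_bytes(int nx, int ny) {
    return 2.0 * 8.0 * (double)nx * (double)ny;
}

/* Transfer time in seconds at an assumed bandwidth in GB/s. */
double transfer_seconds(int nx, int ny, double gb_per_s) {
    return offload_bytes(nx, ny) / (gb_per_s * 1e9);
}

/* Compute time in seconds at an assumed FFT throughput in GFLOP/s. */
double compute_seconds(int nx, int ny, double gflops) {
    return fft2d_flops(nx, ny) / (gflops * 1e9);
}
```

For a 1024 x 1024 transform, with an assumed 6 GB/s over PCIe and an assumed 100 GFLOP/s on the coprocessor, this model gives about 2.8 ms for the transfers against about 1.0 ms for the FFT itself, so in that scenario making the FFT faster with more threads would not shorten the dominant part.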

For thread-level parallelization internal to the MKL routines, you can use the MKL_NUM_THREADS environment variable. This allows you independent control of the thread-level parallelism of your code (using OMP_NUM_THREADS) and the thread-level parallelism used by the MKL routines. This should work for the MKL FFTW3 interfaces as well.
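Because the two variables are read independently, it can be handy to log what each one is set to before starting a run. The sketch below is a hypothetical helper, not part of MKL or OpenMP: it only reads the environment, the function names are invented for illustration, and the fallback value is an assumption of this example rather than the default either runtime would actually pick.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a positive decimal thread count from text.
   Returns fallback when text is NULL, empty, malformed, or out of range. */
int parse_thread_count(const char *text, int fallback) {
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0') {
        return fallback;
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 1 || value > INT_MAX) {
        return fallback;
    }
    return (int)value;
}

/* Look up one environment variable (for example "OMP_NUM_THREADS" or
   "MKL_NUM_THREADS") and parse it as a thread count. */
int threads_from_env(const char *name, int fallback) {
    return parse_thread_count(getenv(name), fallback);
}
```

Calling threads_from_env("OMP_NUM_THREADS", 1) and threads_from_env("MKL_NUM_THREADS", 1) side by side shows the two settings separately; the fallback of 1 is only what this helper reports when a variable is unset.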

McCalpinJohn · ‎07-12-2016

There are also examples of using OpenMP parallelism with the MKL FFT interfaces at

https://software.intel.com/en-us/node/471392#FE027A11-6E44-42DF-8A56-5075E24CA22A