Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Parallelizing 2D FFT

Rizandi_P_
Beginner
329 Views

I'm trying to parallelize 2D FFT on a square NxN matrix. My serial 1-D FFT code is:

void FFT::fft( Complex* primal, Complex* dual, UINT stride, UINT indexOffset )
{
	const int absP = logf(N)/logf(2.0f);

	// Bottom level of iteration tree
	for (int i = 0; i < N; i++)
		c = primal[reverseIndex(i)*stride + indexOffset];

	// There are absP levels above the bottom
	for (int p = 1; p <= absP; p++) 
	{
		// Complex root of unity
		const int unityStep = 0x1 << p;
		const double theta = 2.0f * PI / unityStep; // INVERSE
		const Complex unityRoot(cos(theta), sin(theta));

		// Each higher level doubles the step size
		for (int offset = 0; offset < N; offset += unityStep) 
		{
			Complex omega = Complex(1.0f,0.0f);

			// Combine within a step segment (note only iterate over half step)
			for (int k = 0; k < unityStep/2; k++) 
			{
				const Complex u = c[offset + k];

				const Complex t = omega * c[offset + k + unityStep/2];
				omega *= unityRoot;

				c[offset + k] = u + t;
				c[offset + k + unityStep/2] = u - t;
			}
		}
	}

	for (int j = 0; j < N; j++)
		dual[j*stride + indexOffset] = c;
}

Which I call on every row, then on every collumn to get the 2D fourier transform. Since this operation is independent for each collumn & row, this should be easy to do in parallel. The fft operation uses a single global buffer c for temporary values, I just need M buffers for M threads to do their data juggling independently so I can

// fft on every row
	tbb::parallel_for( 0, N, 1, [&](int i)
	{
		fft->fft(matrix0, matrix0, 1, i * N);
	});
	// fft on every collumn
	tbb::parallel_for( 0, N, 1, [&](int j)
	{
		fft->fft(matrix0, matrix0, N, j);
	});

but I don't know how to get some sort of "thread index" so each thread can index its own temporary buffer.

 

0 Kudos
1 Reply
Vladimir_P_1234567890
329 Views
0 Kudos
Reply