2412 Discussions

## Parallelizing 2D FFT Beginner
127 Views

I'm trying to parallelize 2D FFT on a square NxN matrix. My serial 1-D FFT code is:

```void FFT::fft( Complex* primal, Complex* dual, UINT stride, UINT indexOffset )
{
const int absP = logf(N)/logf(2.0f);

// Bottom level of iteration tree
for (int i = 0; i < N; i++)
c = primal[reverseIndex(i)*stride + indexOffset];

// There are absP levels above the bottom
for (int p = 1; p <= absP; p++)
{
// Complex root of unity
const int unityStep = 0x1 << p;
const double theta = 2.0f * PI / unityStep; // INVERSE
const Complex unityRoot(cos(theta), sin(theta));

// Each higher level doubles the step size
for (int offset = 0; offset < N; offset += unityStep)
{
Complex omega = Complex(1.0f,0.0f);

// Combine within a step segment (note only iterate over half step)
for (int k = 0; k < unityStep/2; k++)
{
const Complex u = c[offset + k];

const Complex t = omega * c[offset + k + unityStep/2];
omega *= unityRoot;

c[offset + k] = u + t;
c[offset + k + unityStep/2] = u - t;
}
}
}

for (int j = 0; j < N; j++)
dual[j*stride + indexOffset] = c;
}```

Which I call on every row, then on every collumn to get the 2D fourier transform. Since this operation is independent for each collumn & row, this should be easy to do in parallel. The fft operation uses a single global buffer c for temporary values, I just need M buffers for M threads to do their data juggling independently so I can

```// fft on every row
tbb::parallel_for( 0, N, 1, [&](int i)
{
fft->fft(matrix0, matrix0, 1, i * N);
});
// fft on every collumn
tbb::parallel_for( 0, N, 1, [&](int j)
{
fft->fft(matrix0, matrix0, N, j);
});```

but I don't know how to get some sort of "thread index" so each thread can index its own temporary buffer.  