Question about strides in real-to-complex FFT

Sidney_Cadot · ‎09-17-2010

Dear all,

I'm having trouble getting strides properly working using DFTI, in particular when doing multiple real-to-complex FFTs, and trying to transpose the FFT values in the output.

Given that this is my matrix X (stored row-major, I'm using C++):

X: 101.000 102.320 102.483 101.152
X: 202.320 206.381 205.759 201.671
X: 302.483 305.759 306.163 302.859

The forward FFT's of each of the three rows are these (they are expected and correct):

FX: ( 406.954 0.000) ( -1.483 -1.168) ( 0.011 0.000)
FX: ( 816.132 0.000) ( -3.439 -4.710) ( 0.026 0.000)
FX: ( 1217.264 0.000) ( -3.681 -2.900) ( 0.028 0.000)

However, when I try to have MKL transpose the result within the FFT call, I cannot get this working. The best I can get is this:

FXT: ( 406.954 0.000) ( 816.132 0.000) ( 1217.264 0.000)
FXT: ( -1.483 0.000) ( -3.439 -1.168) ( -3.681 -4.710)
FXT: ( 0.011 -2.900) ( 0.026 0.000) ( 0.028 0.000)

The first row is correct; furtermore, the real values of the second and third rows are also correct. However, the imaginary values of the second and third row are wrong; apparantly, they have moved one column to the right!

In all probability I am doing something wrong; I would be very grateful if anyone can help to point out my mistake.

I have included my code that generates the output shown above below.

Any help would be greatly appreciated.

Best regards, Sidney Cadot

#include
#include
#include

#include

int main()
{
const unsigned NH = 3;
const unsigned NS = 4;
const unsigned NF = NS / 2 + 1;

float X[NH * NS];
std::complex FX [NH * NF];
std::complex FXT[NF * NH];

for (unsigned int i = 0; i < NH; ++i)
{
for (unsigned j = 0; j < NS; ++j)
{
X[i * NS + j] = 100 * (i + 1) + sin(i * j / double(NS) * 2.0 * M_PI ) + exp(sin(i) + sin(j));
}
}

for (unsigned int i = 0; i < NH; ++i)
{
printf("X:");
for (unsigned j = 0; j < NS; ++j)
{
printf(" %10.3f", X[i * NS + j]);
}
printf("n");
}

printf("n");

// Do FFT

{
MKL_LONG status;

// create

DFTI_DESCRIPTOR_HANDLE fft;

status = DftiCreateDescriptor(&fft, DFTI_SINGLE, DFTI_REAL, 1, NS);
assert(status == 0);

// configure

status = DftiSetValue(fft, DFTI_NUMBER_OF_TRANSFORMS, NH);
assert(status == 0);

status = DftiSetValue(fft, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
assert(status == 0);

status = DftiSetValue(fft, DFTI_INPUT_DISTANCE, NS);
assert(status == 0);

status = DftiSetValue(fft, DFTI_OUTPUT_DISTANCE, 2 * NF);
assert(status == 0);

MKL_LONG output_strides[] = {0, 1};

status = DftiSetValue(fft, DFTI_OUTPUT_STRIDES, output_strides);
assert(status == 0);

status = DftiCommitDescriptor(fft);
assert(status == 0);

// execute

status = DftiComputeForward(fft, X, FX);
assert(status == 0);

// dispose

status = DftiFreeDescriptor(&fft);
assert(status == 0);
}

for (unsigned int i = 0; i < NH; ++i)
{
printf("FX:");
for (unsigned j = 0; j < NF; ++j)
{
printf(" (%10.3f %10.3f)", std::real(FX[i * NF + j]), std::imag(FX[i * NF + j]));
}
printf("n");
}

printf("n");

// Do FFT with implicit transpose of output data

{
MKL_LONG status;

// create

DFTI_DESCRIPTOR_HANDLE fft;

status = DftiCreateDescriptor(&fft, DFTI_SINGLE, DFTI_REAL, 1, NS); // OK
assert(status == 0);

// configure

status = DftiSetValue(fft, DFTI_NUMBER_OF_TRANSFORMS, NH); // OK
assert(status == 0);

status = DftiSetValue(fft, DFTI_PLACEMENT, DFTI_NOT_INPLACE); // ok
assert(status == 0);

status = DftiSetValue(fft, DFTI_INPUT_DISTANCE, NS); // OK
assert(status == 0);

status = DftiSetValue(fft, DFTI_OUTPUT_DISTANCE, 2); // OK
assert(status == 0);

MKL_LONG output_strides[] = {0, NH};

status = DftiSetValue(fft, DFTI_OUTPUT_STRIDES, output_strides);
assert(status == 0);

status = DftiCommitDescriptor(fft);
assert(status == 0);

// execute

status = DftiComputeForward(fft, X, FXT);
assert(status == 0);

// dispose

status = DftiFreeDescriptor(&fft);
assert(status == 0);
}

// Done

for (unsigned int i = 0; i < NF; ++i)
{
printf("FXT:");
for (unsigned j = 0; j < NH; ++j)
{
printf(" (%10.3f %10.3f)", std::real(FXT[i * NH + j]), std::imag(FXT[i * NH + j]));
}
printf("n");
}

return 0;
}

Dmitry_B_Intel · ‎09-17-2010

Hi,

Looks like you expect CCE layout of the result, but in fact you've got screwed 2D CCS layout.
If you add DftiSetValue(fft,DFTI_CONJUGATE_EVEN_STORAGE,DFTI_COMPLEX_COMPLEX) after descriptor is created, everything should work as expected.

Thanks
Dima

Sidney_Cadot · ‎09-17-2010

This does work (at least, when I also set DFTI_OUTPUT_DISTANCE to 1) -- thanks. I am not quite sure why though, the section describing the data layout options is hard to comprehend.

Thanks,

Sidney