Solved: Unexpected Results of 3D Not Inplace FFT

Andy_K_ · ‎03-24-2016

The following is my C code snipped illustrating a problem.

I set dim1/dim2/dim3 to all be the next higher power of 2 based on the lengths
of the actual input data. My test that works has
all dims set to 1024 (not by design, it just happened that way). I have
the data padded with 0s in all locations for the pad.

The routine works correctly for the above case. I compare the values
created in the DftiComputeBackward call to the input data, and the differences
are at 10 to the minus 8.

However, when I increase just the dim2 amount to 2048, the output is virtually identical,
but the phase is 180 degrees off, i.e., the sign of the values are flipped.

The same holds true when I increase just dim3 to 2048, identical output except for
the signs.

Again, I am looking at the 'traceout' array created in DftiComputeBackward.

Thanks for any suggestions;

void doit( int dim2, int dim1, int dim3, float *tracein, float *traceout, complex *in_fft_c) {

float back_scale;

MKL_LONG istat;
MKL_LONG size[3];
MKL_LONG rs[4], cs[4];

DFTI_DESCRIPTOR_HANDLE dfti;

/* set sizes (includes pad) */
size[0] = dim1;
size[1] = dim2;
size[2] = dim3;

/* set scale for inverse fft */
back_scale = 1.0/(dim1*dim2*dim3);

/* set strides for out of place real and complex transforms */
rs[3] = 1;
rs[2] = dim3;
rs[1] = dim2*dim3;
rs[0] = 0;

cs[3] = 1;
cs[2] = dim3/2+1;
cs[1] = dim2*(dim3/2+1);
cs[0] = 0;

istat = DftiCreateDescriptor( &dfti, DFTI_SINGLE, DFTI_REAL, 3, size );
istat += DftiSetValue( dfti, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX );
istat += DftiSetValue( dfti, DFTI_PLACEMENT, DFTI_NOT_INPLACE );
istat += DftiSetValue( dfti, DFTI_OUTPUT_STRIDES , cs );
istat += DftiCommitDescriptor( dfti );

istat += DftiComputeForward ( dfti, tracein , in_fft_c );

istat += DftiSetValue( dfti, DFTI_INPUT_STRIDES , cs );
istat += DftiSetValue( dfti, DFTI_OUTPUT_STRIDES , rs );
istat += DftiSetValue (dfti, DFTI_BACKWARD_SCALE, back_scale);
istat += DftiCommitDescriptor( dfti );

istat += DftiComputeBackward ( dfti, in_fft_c, traceout );

istat += DftiFreeDescriptor(&dfti);

}

Andy_K_ · ‎03-29-2016

I figured this out on my own.

The code snippet I posted was essentially correct. The problem was unexpectedly in the following line;

back_scale = 1.0/(dim1*dim2*dim3);

Since dim1, dim2, and dim3 are ints, the divisor was also computed as a 32 bit signed int and it overflowed (1024*1024*2048), and back_scale became a minus 1.

The input and output strides were correct, and the input strides did not need specified for the compute_forward.

View solution in original post

Andy_K_ · ‎03-24-2016

Further info, I input a completely different set of data where it was a 256x256x1024 transform, and the output sign was not flipped.

Evgueni_P_Intel · ‎03-24-2016

For the forward transform, the following line is missing:

istat += DftiSetValue( dfti, DFTI_INPUT_STRIDES , rs );

Unless you want to transpose data implicitly, the strides are in the reversed order -- please use the following:

rs[3] = dim1*dim2;
rs[2] = dim1;
rs[1] = 1;
rs[0] = 0;

cs[3] = dim2*(dim1/2+1);
cs[2] = dim1/2+1;
cs[1] = 1;
cs[0] = 0;

Andy_K_ · ‎03-28-2016

That didn't work. Perhaps I have my data ordered incorrectly, and the strides are more complicated.

From my original post, the 'tracein' array is composed of three logical entities. Let's call them A,B,C using the previously noted variables dim1, dim2, dim3 for the lengths. The memory storage is

(offset) (C array element)

0 = A[0],B[0],C[0]

1 = A[0],B[0],C[1]

2 = A[0],B[0],C[2]

...

dim3-1 = A[0],B[0],C[dim3-1]

dim3 = A[0],B[1],C[0]

dim3+1 = A[0],B[1],C[1]

and so on. The A[1],B[0],C[0] element would occur at memory offset dim2*dim3

I usually can figure this out by looking at examples, but I can't seem to find one that fits my case, especially for NOT_INPLACE transforms.

Someone posted a question on 2d FFT which is I think similar to my question.

Once again, thanks in advance.

Andy_K_ · ‎03-29-2016

I figured this out on my own.

The code snippet I posted was essentially correct. The problem was unexpectedly in the following line;

back_scale = 1.0/(dim1*dim2*dim3);

Since dim1, dim2, and dim3 are ints, the divisor was also computed as a 32 bit signed int and it overflowed (1024*1024*2048), and back_scale became a minus 1.

The input and output strides were correct, and the input strides did not need specified for the compute_forward.

Evgueni_P_Intel · ‎03-29-2016

Thank you very much for root-causing this issue.