Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
- Unexpected Results of 3D Not Inplace FFT

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Andy_K_

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-24-2016
10:55 AM

45 Views

The following is my C code snipped illustrating a problem.

I set dim1/dim2/dim3 to all be the next higher power of 2 based on the lengths

of the actual input data. My test that works has

all dims set to 1024 (not by design, it just happened that way). I have

the data padded with 0s in all locations for the pad.

The routine works correctly for the above case. I compare the values

created in the DftiComputeBackward call to the input data, and the differences

are at 10 to the minus 8.

However, when I increase just the dim2 amount to 2048, the output is virtually identical,

but the phase is 180 degrees off, i.e., the sign of the values are flipped.

The same holds true when I increase just dim3 to 2048, identical output except for

the signs.

Again, I am looking at the 'traceout' array created in DftiComputeBackward.

Thanks for any suggestions;

void doit( int dim2, int dim1, int dim3, float *tracein, float *traceout, complex *in_fft_c) {

float back_scale;

MKL_LONG istat;

MKL_LONG size[3];

MKL_LONG rs[4], cs[4];

DFTI_DESCRIPTOR_HANDLE dfti;

/* set sizes (includes pad) */

size[0] = dim1;

size[1] = dim2;

size[2] = dim3;

/* set scale for inverse fft */

back_scale = 1.0/(dim1*dim2*dim3);

/* set strides for out of place real and complex transforms */

rs[3] = 1;

rs[2] = dim3;

rs[1] = dim2*dim3;

rs[0] = 0;

cs[3] = 1;

cs[2] = dim3/2+1;

cs[1] = dim2*(dim3/2+1);

cs[0] = 0;

istat = DftiCreateDescriptor( &dfti, DFTI_SINGLE, DFTI_REAL, 3, size );

istat += DftiSetValue( dfti, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX );

istat += DftiSetValue( dfti, DFTI_PLACEMENT, DFTI_NOT_INPLACE );

istat += DftiSetValue( dfti, DFTI_OUTPUT_STRIDES , cs );

istat += DftiCommitDescriptor( dfti );

istat += DftiComputeForward ( dfti, tracein , in_fft_c );

istat += DftiSetValue( dfti, DFTI_INPUT_STRIDES , cs );

istat += DftiSetValue( dfti, DFTI_OUTPUT_STRIDES , rs );

istat += DftiSetValue (dfti, DFTI_BACKWARD_SCALE, back_scale);

istat += DftiCommitDescriptor( dfti );

istat += DftiComputeBackward ( dfti, in_fft_c, traceout );

istat += DftiFreeDescriptor(&dfti);

}

Link Copied

Accepted Solutions

Andy_K_

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-29-2016
01:11 PM

45 Views

I figured this out on my own.

The code snippet I posted was essentially correct. The problem was unexpectedly in the following line;

back_scale = 1.0/(dim1*dim2*dim3);

Since dim1, dim2, and dim3 are ints, the divisor was also computed as a 32 bit signed int and it overflowed (1024*1024*2048), and back_scale became a minus 1.

The input and output strides were correct, and the input strides did not need specified for the compute_forward.

5 Replies

Andy_K_

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-24-2016
11:10 AM

45 Views

**not** flipped.

Evgueni_P_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-24-2016
11:12 AM

45 Views

For the forward transform, the following line is missing:

istat += DftiSetValue( dfti, DFTI_INPUT_STRIDES , rs );

Unless you want to transpose data implicitly, the strides are in the reversed order -- please use the following:

rs[3] = dim1*dim2;

rs[2] = dim1;

rs[1] = 1;

rs[0] = 0;

cs[3] = dim2*(dim1/2+1);

cs[2] = dim1/2+1;

cs[1] = 1;

cs[0] = 0;

Andy_K_

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-28-2016
06:08 AM

45 Views

That didn't work. Perhaps I have my data ordered incorrectly, and the strides are more complicated.

From my original post, the 'tracein' array is composed of three logical entities. Let's call them A,B,C using the previously noted variables dim1, dim2, dim3 for the lengths. The memory storage is

(offset) (C array element)

0 = A[0],B[0],C[0]

1 = A[0],B[0],C[1]

2 = A[0],B[0],C[2]

...

dim3-1 = A[0],B[0],C[dim3-1]

dim3 = A[0],B[1],C[0]

dim3+1 = A[0],B[1],C[1]

and so on. The A[1],B[0],C[0] element would occur at memory offset dim2*dim3

I usually can figure this out by looking at examples, but I can't seem to find one that fits my case, especially for NOT_INPLACE transforms.

Someone posted a question on 2d FFT which is I think similar to my question.

Once again, thanks in advance.

Andy_K_

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-29-2016
01:11 PM

46 Views

I figured this out on my own.

The code snippet I posted was essentially correct. The problem was unexpectedly in the following line;

back_scale = 1.0/(dim1*dim2*dim3);

Since dim1, dim2, and dim3 are ints, the divisor was also computed as a 32 bit signed int and it overflowed (1024*1024*2048), and back_scale became a minus 1.

The input and output strides were correct, and the input strides did not need specified for the compute_forward.

Evgueni_P_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-29-2016
10:00 PM

45 Views

Thank you very much for root-causing this issue.

For more complete information about compiler optimizations, see our Optimization Notice.