Solved: Storage in memory of CCE format for 3d fft in C

saleslim_cz · ‎02-24-2009

Hi,
can you please explain me how is the data stored in fourier domain for inplace 3D FFT in C? I understand that the only available format is CCE, which is not completely clearly explained in the reference manual (for 10.1 it is page 2821). I would like to know where should I find the real and complex part of coefficients and how is the data stored in memory.

My code for FFT:
MKL_LONG retval = DFTI_NO_ERROR;
MKL_LONG sizes[3] = {data->NZ, data->NY, data->NX - 2};
MKL_LONG strides_in[4] = {0, data->NXY, data->NX, 1};
MKL_LONG strides_out[4] = {0, data->NXY / 2, data->NX / 2,1};

retval |= DftiCreateDescriptor(&ds1, DFTI_SINGLE, DFTI_REAL, 3, sizes);

double scale, total;
total = sizes[0] * sizes[1] * sizes[2];
scale = 1;
retval |= DftiSetValue(ds1, DFTI_FORWARD_SCALE, scale);
retval |= DftiSetValue(ds1, DFTI_PACKED_FORMAT, DFTI_CCE_FORMAT);
retval |= DftiSetValue(ds1, DFTI_INPUT_STRIDES, strides_in);
retval |= DftiSetValue(ds1, DFTI_OUTPUT_STRIDES, strides_out);
retval |= DftiSetValue(ds1, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
retval |= DftiSetValue(ds1, DFTI_REAL_STORAGE, DFTI_REAL_REAL);
retval |= DftiCommitDescriptor(ds1);
if(retval != DFTI_NO_ERROR)
return -1;

retval |= DftiComputeForward( ds1, (p));

retval |= DftiFreeDescriptor(&ds1);
if(retval != DFTI_NO_ERROR)
return -1;

The sizes are for example: NX=514, NY=512, NZ=64, NXY = NX*NY

Thank you,
J.

Dmitry_B_Intel · ‎02-24-2009

Hi J.,

Suppose we have NX*NY*NZ block of real numbers that we want to apply real-to-CCE to. Let's assume that on input, real element X(nx,ny,nz) is located somewhere in array 'double *ds1'. This 'somewhere' can be expressed precisely as
#define X(nx,ny,nz) ds1[strides_in[0] + nx*strides_in[1] + ny*strides_in[2] + nz*strides_in[3]]

After we apply the transform, we'll get the result which is NX*NY*(NZ/2+1) complex numbers. Assume we have them, denoted as Y,stored in another array 'complex *ds2', then we can describe this as:
#define Y(nx,ny,nz) ds2[stride_out[0] + nx*strides_out[1] + ny*strides_out[2] + nz*strides_out[3]].
Please noticedifferent pointer type for the frequency domain: it is complex, not real. Also, nz in the backward domain runs from 0 to NZ/2-1, because CCE only keeps ~half of the frequency domain data (you can restore the remaining data using CCE symmetry).

For out-of-place transform (defined by DFTI_PLACEMENT configuration value) the values of strides_in andstrides_outcan be defined independently. For in-place transform, they should be correlated, to ensure in-placeness. Typically this is attained bymaking surethat &X(i,j,0) == &Y(i,j,0), for all i,j (the rows numbered by the last index are transformed real-to-CCE, that is NZ real numbers to NZ/2+1 complex numbers).

I hope this will help.
Dima

View solution in original post

Dmitry_B_Intel · ‎02-24-2009

Hi J.,

Suppose we have NX*NY*NZ block of real numbers that we want to apply real-to-CCE to. Let's assume that on input, real element X(nx,ny,nz) is located somewhere in array 'double *ds1'. This 'somewhere' can be expressed precisely as
#define X(nx,ny,nz) ds1[strides_in[0] + nx*strides_in[1] + ny*strides_in[2] + nz*strides_in[3]]

After we apply the transform, we'll get the result which is NX*NY*(NZ/2+1) complex numbers. Assume we have them, denoted as Y,stored in another array 'complex *ds2', then we can describe this as:
#define Y(nx,ny,nz) ds2[stride_out[0] + nx*strides_out[1] + ny*strides_out[2] + nz*strides_out[3]].
Please noticedifferent pointer type for the frequency domain: it is complex, not real. Also, nz in the backward domain runs from 0 to NZ/2-1, because CCE only keeps ~half of the frequency domain data (you can restore the remaining data using CCE symmetry).

For out-of-place transform (defined by DFTI_PLACEMENT configuration value) the values of strides_in andstrides_outcan be defined independently. For in-place transform, they should be correlated, to ensure in-placeness. Typically this is attained bymaking surethat &X(i,j,0) == &Y(i,j,0), for all i,j (the rows numbered by the last index are transformed real-to-CCE, that is NZ real numbers to NZ/2+1 complex numbers).

I hope this will help.
Dima

saleslim_cz · ‎02-24-2009

Quoting - Dmitry Baksheev (Intel)

Hi J.,

Suppose we have NX*NY*NZ block of real numbers that we want to apply real-to-CCE to. Let's assume that on input, real element X(nx,ny,nz) is located somewhere in array 'double *ds1'. This 'somewhere' can be expressed precisely as
#define X(nx,ny,nz) ds1[strides_in[0] + nx*strides_in[1] + ny*strides_in[2] + nz*strides_in[3]]

After we apply the transform, we'll get the result which is NX*NY*(NZ/2+1) complex numbers. Assume we have them, denoted as Y,stored in another array 'complex *ds2', then we can describe this as:
#define Y(nx,ny,nz) ds2[stride_out[0] + nx*strides_out[1] + ny*strides_out[2] + nz*strides_out[3]].
Please noticedifferent pointer type for the frequency domain: it is complex, not real. Also, nz in the backward domain runs from 0 to NZ/2-1, because CCE only keeps ~half of the frequency domain data (you can restore the remaining data using CCE symmetry).

For out-of-place transform (defined by DFTI_PLACEMENT configuration value) the values of strides_in andstrides_outcan be defined independently. For in-place transform, they should be correlated, to ensure in-placeness. Typically this is attained bymaking surethat &X(i,j,0) == &Y(i,j,0), for all i,j (the rows numbered by the last index are transformed real-to-CCE, that is NZ real numbers to NZ/2+1 complex numbers).

I hope this will help.
Dima

Thank you very much, that is what I needed to know.
With regards,
Jan