
hello_world

Beginner

08-06-2013
10:05 AM

84 Views

MKL DFT descriptor generation question

Hi there,

I have a question about the DFTI descriptor.

The data is a 1K x 1K array of complex numbers in row-major order. For each row of 1K elements, I would like to compute size-16 FFTs with stride 64. That is, I do not want to compute one size-1024 FFT per row, only size-16 FFTs.

For example, one size-16 FFT takes elements 0, 64, 128, 192, ..., 960, another takes elements 1, 65, 129, ..., 961, and so on.

And the same computation is applied on all the 1K rows.
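
To make the access pattern in the question concrete, here is a minimal sketch of the index arithmetic. The constant and function names (`kRowLength`, `kFftSize`, `kStride`, `transform_indices`) are hypothetical and chosen only for illustration; the sizes are the ones given in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Problem dimensions taken from the question.
constexpr std::size_t kRowLength = 1024;  // elements per row
constexpr std::size_t kFftSize   = 16;    // points per transform
constexpr std::size_t kStride    = 64;    // gap between points of one transform

// Flat row-major indices touched by transform `j` (0..63) of row `row`.
// Hypothetical helper, not part of any library.
std::vector<std::size_t> transform_indices(std::size_t row, std::size_t j) {
    assert(j < kStride);  // each row holds kStride interleaved transforms
    std::vector<std::size_t> idx(kFftSize);
    for (std::size_t k = 0; k < kFftSize; ++k) {
        idx[k] = row * kRowLength + j + k * kStride;
    }
    return idx;
}
```

Each row therefore contains 64 interleaved transforms whose starting elements are adjacent, and the whole array contains 1024 x 64 of them.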

I had a look at the reference manual, but I am not sure whether a descriptor can express this.

Specifically, I do not know how to set the following arguments:

1) num_of_transforms, 2) stride, 3) dist.

Thanks!

Jing


3 Replies

SergeyKostrov

Valued Contributor II


08-06-2013
06:10 PM

84 Views

Dmitry_B_Intel

Employee


08-06-2013
06:52 PM

84 Views

Hi Jing,

The following lines should guide you to the desired computation:

[cpp]

MKL_LONG size = 16;
MKL_LONG strides[] = { 0, 64 };
MKL_LONG ntransforms = 64;
DftiCreateDescriptor(&h, ..., 1, size); // = I would like to compute size-16 FFT
DftiSetValue(h, DFTI_INPUT_STRIDES, strides ); // = with stride 64
DftiSetValue(..., DFTI_NUMBER_OF_TRANSFORMS, ntransforms ); // compute 64 ffts of one row
DftiCommitDescriptor(...);
for (rowno=0;rowno<1024;++rowno)
    DftiComputeForward(h,&data[rowno*rowsize]);

[/cpp]
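
As a way to check results without depending on MKL, the following is a minimal sketch of what a batched, strided, in-place forward transform computes for each row. It is a naive O(n^2) reference and not the library's implementation. The function names are hypothetical, and the `distance` of 1 between the starts of successive transforms is an assumption: the snippet above does not set a distance. In MKL terms that parameter would correspond to `DFTI_INPUT_DISTANCE`, and the reference manual should be consulted on whether it and `DFTI_OUTPUT_STRIDES` also need to be set for an in-place batched transform.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive in-place forward DFT over a batch of strided sequences.
//   data     : first element of the first sequence
//   n        : transform length (16 in the thread)
//   stride   : gap between successive points of one sequence (64)
//   count    : number of sequences in the batch (64 per row)
//   distance : gap between the first points of successive sequences
//              (assumed to be 1; the reply does not state it)
void strided_dft_batch(cplx* data, std::size_t n, std::size_t stride,
                       std::size_t count, std::size_t distance) {
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t t = 0; t < count; ++t) {
        cplx* seq = data + t * distance;
        for (std::size_t k = 0; k < n; ++k) {
            cplx sum(0.0, 0.0);
            for (std::size_t m = 0; m < n; ++m) {
                // Forward transform uses the exp(-2*pi*i*k*m/n) convention.
                const double angle = -two_pi * static_cast<double>((k * m) % n)
                                     / static_cast<double>(n);
                sum += seq[m * stride] * std::polar(1.0, angle);
            }
            out[k] = sum;
        }
        // Write back only after the whole sequence has been read.
        for (std::size_t k = 0; k < n; ++k) seq[k * stride] = out[k];
    }
}

// Apply the batch to every row of a rows x row_length row-major array,
// mirroring the per-row loop in the reply.
void transform_all_rows(std::vector<cplx>& data, std::size_t rows,
                        std::size_t row_length, std::size_t n) {
    assert(row_length % n == 0);
    assert(data.size() == rows * row_length);
    const std::size_t stride = row_length / n;  // 1024 / 16 = 64 in the thread
    for (std::size_t row = 0; row < rows; ++row) {
        strided_dft_batch(&data[row * row_length], n, stride, stride, 1);
    }
}
```

For the sizes in the thread, the call would be `transform_all_rows(data, 1024, 1024, 16)`. Comparing its output with the MKL result on a small input is one way to confirm that the descriptor settings describe the intended layout.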

Thanks

Dima

hello_world

Beginner


08-06-2013
07:20 PM

84 Views

Hi Dima,

Thanks for your reply. I had thought of that, but expected the performance of a for loop to be really bad. I just ran the code according to your guideline, and the performance is far worse than computing 1024*64 size-16 FFTs on contiguous data (unit stride). Since the FLOP count is relatively small, I had hoped that batched execution could exploit memory and cache for stride (0, 64) as well as it does for stride (0, 1).

Do you have any suggestions to tune the performance?

Thanks!!

Jing

