John_Kornak

Beginner


11-23-2010
05:04 PM


DftiSetValue and DFTI_NUMBER_OF_TRANSFORMS

Hi MKL experts

I hope someone can help me understand how to set DFTI_NUMBER_OF_TRANSFORMS.

For example, suppose I have a loop that computes an FFT of the same dimensions on each iteration:

StatusSmall = DftiCreateDescriptor(FFTid, DFTI_SINGLE, DFTI_COMPLEX, 2, lgth)

StatusSmall = DftiCommitDescriptor(FFTid)

DO lps=1,10

StatusBig = DftiComputeForward(FFTid, DFTarray)

DFTarray = SomeFunction(DFTarray)

END DO

Do I set DFTI_NUMBER_OF_TRANSFORMS to 1 or 10?

If not 10, then for what kind of situation do you need to set DFTI_NUMBER_OF_TRANSFORMS to a number other than 1?

Thanks

John

6 Replies

Dmitry_B_Intel

Employee


11-23-2010
09:09 PM


With DFTI_NUMBER_OF_TRANSFORMS = N, the function DftiComputeForward performs N transforms in a single call. Based on your example, the idea can be illustrated as follows:

[fortran]
complex :: multiple_dftarray(lgth(1),lgth(2),10)
...
status = DftiSetValue(fftid, DFTI_NUMBER_OF_TRANSFORMS, 10)
status = DftiSetValue(fftid, DFTI_INPUT_DISTANCE, lgth(1)*lgth(2))
...
statusbig = DftiComputeForward(fftid, multiple_dftarray) ! does 10 two-d transforms
do lps = 1, 10
  dft_array = somefunction( multiple_dftarray(:,:,lps) )
enddo
[/fortran]
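To make the roles of the two settings concrete, here is a minimal sketch in plain Python (not MKL) of what a batched call does conceptually. The helper names `dft` and `batched_dft` are hypothetical, the transform is a naive 1-D DFT rather than the 2-D FFT in the example, and `howmany` and `distance` only mimic DFTI_NUMBER_OF_TRANSFORMS and DFTI_INPUT_DISTANCE. The point is the layout: transform number t starts `t * distance` elements into one flat buffer.

```python
import cmath

def dft(x):
    """Naive O(n^2) forward DFT of a sequence of complex numbers."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def batched_dft(flat, n, howmany, distance):
    """Transform `howmany` length-n signals stored in one flat buffer.

    Signal t starts at flat[t * distance]. This mirrors the roles of
    DFTI_NUMBER_OF_TRANSFORMS (howmany) and DFTI_INPUT_DISTANCE (distance);
    it is an illustration, not how MKL is implemented.
    """
    out = list(flat)
    for t in range(howmany):
        start = t * distance
        out[start:start + n] = dft(flat[start:start + n])
    return out

# Three length-4 signals packed back to back, so distance == n.
n, howmany = 4, 3
flat = [complex(t + 1, j) for t in range(howmany) for j in range(n)]
batched = batched_dft(flat, n, howmany, distance=n)

# Reference: transform each signal separately in a loop.
looped = []
for t in range(howmany):
    looped.extend(dft(flat[t * n:(t + 1) * n]))

# The two approaches should agree element by element.
max_diff = max(abs(a - b) for a, b in zip(batched, looped))
print(max_diff)
```

For the 2-D case in the example, the same reasoning gives a distance of lgth(1)*lgth(2), since each two-d slice of the array occupies that many contiguous elements in Fortran's column-major order.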

Thanks

Dima

John_Kornak

Beginner


11-24-2010
09:22 AM


Thanks Dima,

One follow-up question: is this multiple-DFT approach more efficient than doing the 10 two-d transforms separately, or does it just provide more compact syntax?

Cheers

John

Dmitry_B_Intel

Employee


11-26-2010
07:42 AM


If you do a bunch of small transforms, where function call overhead makes up a noticeable part of the transform time, doing the whole bunch in a single call is more efficient. For large transforms, doing them separately may be more efficient, because each separate transform is itself computed in parallel, whereas a bunch of transforms is often parallelized in a one-transform-per-thread fashion.

Thanks

Dima

John_Kornak

Beginner


11-26-2010
02:08 PM


That makes a lot of sense :)

Do you have a rough intuition of what would be small enough for the function call overhead to become significant?

Thanks

John

Dmitry_B_Intel

Employee


11-29-2010
08:13 PM


For example, if the data for one transform fits into L1 cache (32 KB), then threading overhead will likely outweigh the speedup from parallelization. The data includes the precomputed trigonometric tables used to perform the FFT.
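As a rough way to apply this rule of thumb, the sketch below estimates the footprint of one transform and compares it with the 32 KB figure. The function names are hypothetical, 8 bytes per element assumes single-precision complex data (DFTI_SINGLE, DFTI_COMPLEX), and `table_bytes` is a caller-supplied guess because the size of MKL's internal trigonometric tables is not given in this thread. Actual L1 sizes vary by processor.

```python
def transform_footprint_bytes(dims, bytes_per_element=8, table_bytes=0):
    """Estimated bytes touched by one transform: data plus tables.

    bytes_per_element=8 corresponds to single-precision complex.
    table_bytes is an assumption supplied by the caller, not an MKL value.
    """
    count = 1
    for d in dims:
        count *= d
    return count * bytes_per_element + table_bytes

def fits_in_l1(dims, l1_bytes=32 * 1024, **kwargs):
    """True if the estimated footprint is no larger than the L1 size."""
    return transform_footprint_bytes(dims, **kwargs) <= l1_bytes

# Compare a few two-d transform sizes against the 32 KB threshold.
for dims in [(32, 32), (64, 64), (256, 256)]:
    size = transform_footprint_bytes(dims)
    print(dims, size, "bytes, fits in 32 KB L1:", fits_in_l1(dims))
```

Under these assumptions, a 64 x 64 single-precision complex transform already fills 32 KB with its data alone, so any nonzero table size pushes it over the threshold.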

Thanks

Dima

John_Kornak

Beginner


11-30-2010
11:30 AM


Thanks again Dima,

I really appreciate your help.

Best

John
