Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

FFT - Multi Thread

Simon_B_
Novice
706 Views

Hi,

I am feeling very stupid asking this question :)

Without using the IPP multi-threaded libraries, how would I perform a multi-threaded FFT, for example 32fc complex-to-complex? I'm running very large FFTs and my FFT thread is at almost 100% CPU.

6 Replies
Igor_A_Intel
Employee

Hi Simon,

Could you clarify which FFT domain you are using: 2D (ippi) or 1D (ipps)? The techniques for threading on top of the IPP libraries are different for 1D and 2D. If your concern is the thread safety of IPP functions, all of them are thread-safe. If you are running very large FFTs, you are better off using the MKL library: it uses IPP FFTs internally but provides additional optimization for large FFT orders, and threading is not deprecated in MKL.

regards, Igor

Simon_B_
Novice

Hi Igor,

I did ask in ignorance :)

So far I haven't used the multithreaded IPP libraries, and I see they will soon be removed. My only bottleneck is a massive complex FFT using ippsFFTFwd_CToC_32fc. If I understand some of the IPP documentation correctly, I should handle multi-threading myself so that I can assign two, four or more cores to the FFT.

The question is: how would I use two, four or more cores for FFT using IPP?

I could always have many background threads and distribute the work, but I'm trying to run in near real-time and would prefer to use just the one thread.

Gennady_F_Intel
Moderator

You can manage the number of threads by calling the ippSetNumThreads(nThreads) function.

Simon_B_
Novice

Hi,

According to my documentation, ippSetNumThreads sets the number of threads in the multithreading environment, but the multi-threaded FFT functions are going to be removed; currently they are deprecated.

So if I do not use the multithreading functions, what are my options?

Igor_A_Intel
Employee

Hi Simon,

ippsFFTs (1D) are threaded internally only for a limited number of configurations: when the CPU has only 2 hardware threads sharing the same cache with no hyperthreading (for example, Core 2 Duo), and only for FFT orders 12-17. In all other cases 1D FFTs run in single-threaded mode. MKL provides a threaded solution for FFTs of greater orders, starting from ~19-20 and higher. To execute ippsFFT in several threads you would have to split your transform into several smaller FFTs, run them in parallel in different threads, and then perform the final butterfly yourself (radix-2 for 2 threads, radix-4 for 4, etc.), so it is not an easy task. It is better to use the existing solution from the MKL library. ippSetNumThreads has no influence on ippsFFTs (except for the case nThreads=1), because the internal threading is hardcoded for the conditions I've mentioned above.

regards, Igor.
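
The manual decomposition described above (two half-size FFTs in parallel, then a final radix-2 butterfly) might look roughly like the sketch below. It is only an illustration under stated assumptions: `fft_single` is a hypothetical stand-in written in plain C++ for a single-threaded library FFT such as ippsFFTFwd_CToC_32fc, `fft_two_threads` is a made-up name, and the length is assumed to be a power of two. No IPP or MKL calls are used.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

using cf = std::complex<float>;
const double kPi = 3.14159265358979323846;

// Hypothetical stand-in for a single-threaded library FFT (NOT IPP code):
// in-place iterative radix-2 forward FFT; a.size() must be a power of two.
void fft_single(std::vector<cf>& a) {
    const std::size_t n = a.size();
    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    // Butterfly stages of length 2, 4, ..., n.
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double ang = -2.0 * kPi / static_cast<double>(len);
        for (std::size_t i = 0; i < n; i += len) {
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cf w(static_cast<float>(std::cos(ang * k)),
                           static_cast<float>(std::sin(ang * k)));
                const cf u = a[i + k];
                const cf v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
            }
        }
    }
}

// Forward FFT of x using two threads: split into even/odd samples, transform
// each half concurrently, then combine with one radix-2 butterfly pass.
std::vector<cf> fft_two_threads(const std::vector<cf>& x) {
    const std::size_t n = x.size();
    assert(n >= 2 && (n & (n - 1)) == 0);  // power-of-two length assumed
    const std::size_t h = n / 2;

    std::vector<cf> even(h), odd(h);
    for (std::size_t i = 0; i < h; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }

    // One half runs on a worker thread, the other on the calling thread.
    std::thread worker([&odd] { fft_single(odd); });
    fft_single(even);
    worker.join();

    // Final butterfly: X[k] = E[k] + w^k O[k], X[k + n/2] = E[k] - w^k O[k].
    std::vector<cf> out(n);
    for (std::size_t k = 0; k < h; ++k) {
        const double ang = -2.0 * kPi * static_cast<double>(k) / static_cast<double>(n);
        const cf w(static_cast<float>(std::cos(ang)), static_cast<float>(std::sin(ang)));
        const cf v = w * odd[k];
        out[k] = even[k] + v;
        out[k + h] = even[k] - v;
    }
    return out;
}
```

In this sketch the even/odd split and the final butterfly both run serially and each touch the whole array, so the speed-up should be expected to stay below 2x. Going to four threads would mean either a radix-4 combine or applying the same split one level deeper, which is part of why the MKL route is probably the less painful one.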

Simon_B_
Novice

Igor,

Thanks, now I know exactly what I must do. I will either use MKL or CUDA, but don't really want to enforce the use of CUDA at the moment. If the user has CUDA then I can use it, but where there's no CUDA and spare cycles I'll use MKL.

Thanks for IPP - great product.
