Hi experts:
I want to compute multiple FFTs in parallel. My code looks similar to the following:
ippsFFTGetSize_C_32fc(....)
ippsFFTInit_C_32fc(...FFTSpec, Buffer)
parallel_for(0, chunks, [=](size_t i){
    ippsFFTFwd_CToC_32fc(...FFTSpec, Buffer);
});
However, the results are incorrect. I suspect that FFTSpec and Buffer hold state during the FFT operation, so the parallel FFT calls conflict with each other.
Could you please tell me the actual cause?
And is there a way to run multiple FFTs in parallel? (I do not want to put ippsFFTInit_C_32fc inside the loop because it is time-consuming.)
In that case, to avoid thread oversubscription, you can call ippSetNumThreads(1).
Hi,
I have already disabled the internal IPP OpenMP threading.
It does not make sense to me, if I use FFTSpec
1. The program might not know the size until it runs the parallel ippsFFTFwd_CToC_32fc calls; that is, the size is dynamic and cannot be known beforehand.
2. The number of threads created is limited by the number of cores, and each thread runs multiple ippsFFTFwd_CToC_32fc calls serially. This suggests that the array only needs as many elements as there are cores, with each thread having its own FFTSpec and Buffer. But how can I control that?
Your comments are highly appreciated.
If all FFTs have the same order, one FFTSpec is enough. In IPP terminology (described in the manual), a Spec is always const, while a State (for example, in FIR and IIR filters) stores temporary function state to support stream processing. So for correct threading you should create one common FFTSpec and a set of unique buffers, one for each thread. The buffers are used for temporary storage after each butterfly, while the Spec contains only the precalculated twiddle factors and the bit-reversal table.
Regards, Igor
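The split described above, one read-only Spec shared by all threads plus one scratch buffer per thread, can be sketched in plain C++ without IPP. Everything in this sketch is a stand-in assumed for illustration: `DftSpec`, `make_spec`, `dft_forward`, and `run_parallel` are hypothetical names, the transform is a naive O(n²) DFT rather than IPP's FFT, and `std::thread` takes the place of TBB.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <thread>
#include <vector>

using cf = std::complex<float>;

// Hypothetical stand-in for an FFT spec: precomputed once, read-only afterwards.
struct DftSpec {
    std::size_t n;
    std::vector<cf> twiddles;  // twiddles[k] = exp(-2*pi*i*k/n)
};

DftSpec make_spec(std::size_t n) {
    DftSpec spec{n, std::vector<cf>(n)};
    const double pi = 3.14159265358979323846;
    for (std::size_t k = 0; k < n; ++k) {
        const double a = -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        spec.twiddles[k] = cf(static_cast<float>(std::cos(a)),
                              static_cast<float>(std::sin(a)));
    }
    return spec;
}

// Naive DFT. The spec is only read; the scratch buffer is written,
// so every thread must pass its own scratch buffer.
void dft_forward(const DftSpec& spec, const cf* src, cf* dst,
                 std::vector<cf>& scratch) {
    assert(scratch.size() >= spec.n);
    for (std::size_t k = 0; k < spec.n; ++k) {
        cf acc(0.0f, 0.0f);
        for (std::size_t j = 0; j < spec.n; ++j)
            acc += src[j] * spec.twiddles[(j * k) % spec.n];
        scratch[k] = acc;
    }
    // Copy out last so that src == dst (in-place use) is safe.
    for (std::size_t k = 0; k < spec.n; ++k) dst[k] = scratch[k];
}

// Transforms every signal in place, using one shared spec and
// one private scratch buffer per thread.
void run_parallel(const DftSpec& spec, std::vector<std::vector<cf>>& signals,
                  unsigned num_threads) {
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&spec, &signals, t, num_threads] {
            std::vector<cf> scratch(spec.n);  // private to this thread
            for (std::size_t i = t; i < signals.size(); i += num_threads)
                dft_forward(spec, signals[i].data(), signals[i].data(), scratch);
        });
    }
    for (auto& w : workers) w.join();
}
```

Sharing a single scratch buffer between the threads in `run_parallel` would reproduce the kind of corrupted output described in the original question, because concurrent calls would overwrite each other's intermediate results.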
Thank you for the information.
Igor, so for one FFT order we only need to create one FFTSpec, which is good, and I still need to create a set of buffers, one for each thread. The problem is that I do not want to create one buffer per chunk, because I do not know the number of chunks beforehand. It would be great if I only needed as many buffers as there are threads (that is, CPU cores). Do you know how to do that with Intel TBB?
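One way to get exactly one buffer per worker, independent of the number of chunks, is to start a fixed number of workers and let each one pull chunk indices from a shared counter. The sketch below does this in standard C++ rather than TBB. `transform_chunk` and `process_chunks` are hypothetical names, and the per-chunk work is a placeholder (it reverses the chunk through the scratch buffer) standing in for an FFT call that needs scratch memory. In TBB, `tbb::enumerable_thread_specific` provides comparable per-thread storage, but that variant is not shown here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder for the real per-chunk work: reverses the chunk via scratch memory.
void transform_chunk(std::vector<float>& chunk, std::vector<float>& scratch) {
    assert(scratch.size() >= chunk.size());
    std::copy(chunk.begin(), chunk.end(), scratch.begin());
    const auto len = static_cast<std::ptrdiff_t>(chunk.size());
    std::reverse_copy(scratch.begin(), scratch.begin() + len, chunk.begin());
}

// Processes all chunks with a fixed number of workers.
// Returns how many scratch buffers were allocated (one per worker).
std::size_t process_chunks(std::vector<std::vector<float>>& chunks,
                           std::size_t max_chunk_len,
                           unsigned num_workers = 0) {
    if (num_workers == 0)
        num_workers = std::max(1u, std::thread::hardware_concurrency());

    std::atomic<std::size_t> next{0};     // index of the next unclaimed chunk
    std::atomic<std::size_t> buffers{0};  // counts allocated scratch buffers
    std::vector<std::thread> workers;

    for (unsigned w = 0; w < num_workers; ++w) {
        workers.emplace_back([&] {
            std::vector<float> scratch(max_chunk_len);  // one per worker
            ++buffers;
            // Claim chunks one at a time until none are left.
            for (std::size_t i = next.fetch_add(1); i < chunks.size();
                 i = next.fetch_add(1)) {
                transform_chunk(chunks[i], scratch);
            }
        });
    }
    for (auto& t : workers) t.join();
    return buffers.load();
}
```

With this layout the number of buffers depends only on the worker count, so the number of chunks does not need to be known when the buffers are allocated; only an upper bound on the chunk length is assumed.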