Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

problem with IPP threaded FFT

Bo_Fang
Beginner
784 Views
Hi all,
Based on my experiments with the threaded FFT and on previous related threads in this forum, I observed that the threaded FFT does not work when the FFT length is larger than 2^19 or when more than two cores are used.
My question is: do you plan to address this in the next version? I am using IPP 7.0 for Linux.
Thank you
flyree
0 Kudos
8 Replies
SergeyKostrov
Valued Contributor II
784 Views
Quoting Bo Fang
...I observed that the threaded FFT does not work when the FFT length is larger than 2^19 or when more than two cores are used.
My question is: do you plan to address this in the next version? I am using IPP 7.0 for Linux...


It could be a memory-related issue if you're on 32-bit Linux (or on 32-bit Windows). Could you, for example,
in the case of 32-bit Linux, allocate a memory block larger than 1.5 GB?
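One way to run the check suggested above is a small probe like the following. This is only a sketch: `can_allocate` is a hypothetical helper name, not part of IPP. Note also that on Linux a successful `malloc` mainly shows that the address space is available, which is the relevant limit on a 32-bit process; it does not prove that much physical memory is free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if a block of `bytes` bytes can be
 * allocated, 0 if malloc fails. */
int can_allocate(size_t bytes) {
    assert(bytes > 0);
    unsigned char *p = malloc(bytes);
    if (p == NULL) {
        return 0;
    }
    /* Touch the block through a volatile pointer so the compiler
     * cannot optimize the malloc/free pair away. */
    *(volatile unsigned char *)p = 1;
    free(p);
    return 1;
}
```

Calling `can_allocate((size_t)1536 * 1024 * 1024)` probes a 1.5 GB block.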

0 Kudos
Bo_Fang
Beginner
784 Views
Hi Sergey,
Thank you for your reply. Actually I am on a 64-bit Linux machine.
flyree
0 Kudos
Ying_H_Intel
Employee
784 Views
Hi Flyree,

Could you please give some details about which FFT function you are calling, and on what kind of OS and hardware?

Yes, there is a length limitation on the FFT (around 2^27, for example). The value is mainly based on the application's memory limit (the total memory the OS allows one application to use, generally 2 GB).
I recall there were some discussions about this in the forums. Here are some related links:
http://software.intel.com/en-us/articles/mkl-ipp-choosing-an-fft/
http://software.intel.com/en-us/forums/showthread.php?t=75734&o=a&s=lr
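To see how a length around 2^27 relates to a 2 GB per-process limit, the arithmetic for the data buffers can be sketched as below. The helper names are made up for illustration, and the figures cover only the input and output arrays; any work buffers the library allocates internally are not counted, so the real footprint is larger.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one buffer holding 2^order elements of elem_size bytes each. */
uint64_t fft_buffer_bytes(unsigned order, uint64_t elem_size) {
    assert(order < 48 && elem_size <= 16); /* keep the product in range */
    return ((uint64_t)1 << order) * elem_size;
}

/* Bytes for separate input and output buffers (out-of-place transform).
 * Library work buffers are deliberately excluded. */
uint64_t fft_inout_bytes(unsigned order, uint64_t elem_size) {
    return 2 * fft_buffer_bytes(order, elem_size);
}
```

For 32-bit floats (4 bytes), order 27 gives 512 MiB per buffer and 1 GiB for input plus output, which is already half of a 2 GB limit. Order 19 needs only 2 MiB per buffer.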

But regarding the threading limitation (to 2 threads), I don't recall it. Could you please point me to a thread that mentions it?

Best Regards,
Ying
0 Kudos
Bo_Fang
Beginner
784 Views
Hi Ying,

Here are two links pointing out the length and core limitations.

http://software.intel.com/en-us/forums/showthread.php?t=73301
In my case, I invoke ippsFFTFwd_RToPack_32f and ippsFFTInv_PackToR_32f, along with the initialization and cleanup functions related to these two in my code. My experiment is on Linux (Fedora 14, 64-bit) with 24 processors. The length of FFT is 24. I agree that the memory limitation seems to be the major reason. If I make the length smaller, I can see some performance improvement. But I didn't observe any difference among different thread counts such as 2, 4, or 8...
Thank you
flyree
0 Kudos
SergeyKostrov
Valued Contributor II
784 Views
...Yes, there is a length limitation on the FFT (around 2^27, for example). The value is
mainly based on the application's memory limit (the total memory the OS allows one
application to use, generally 2 GB)...

I also remember that there was a similar discussion a while ago. But the 2 GB limitation applies only to 32-bit platforms.
64-bit platforms allow allocating very large blocks of memory (terabytes), and IPP shouldn't have memory constraints
in that case.

Best regards,
Sergey
0 Kudos
SergeyKostrov
Valued Contributor II
784 Views


I admit these two threads are describing the same issues. There was another (more recent) thread with a short
discussion about some memory constraints of the IPP library.

Best regards,
Sergey

0 Kudos
Ying_H_Intel
Employee
784 Views
Hi Sergey, flyree,

Thanks for the clarification. So according to those discussions, it is true that there are memory constraints and threading constraints in the 1D FFT.

Regarding flyree's question: do you have a plan for this matter in the next version?
No, there is no plan for this matter in the next IPP version. One reason is the explanation given in U73301 and U82974. Another reason is that, as our users adopt more and more multi-threading methods, we plan to remove the internal threading of IPP functions so that users can apply whatever multi-threading suits their requirements.
And if you'd like to use FFT with a large length and more threads, I would recommend another library, the Math Kernel Library: http://software.intel.com/en-us/articles/intel-mkl/. It is ready for larger problems and for multi-core (and cluster) systems.
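As a rough illustration of what user-level threading can look like, the sketch below spreads a batch of independent transforms over OpenMP threads, with each thread calling a single-threaded routine. Everything here is an assumption for illustration: `transform_batch` and `transform_fn` are hypothetical names, and `wht_inplace` is a Walsh-Hadamard transform used as a trig-free stand-in so the example builds without IPP. In real code the callback would wrap a single-threaded FFT call instead. This pattern helps when there are many signals to process; it does not split one large FFT across cores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: transforms one signal of `len` floats in place. */
typedef void (*transform_fn)(float *signal, size_t len);

/* Stand-in for a single-threaded FFT call: an in-place Walsh-Hadamard
 * transform (radix-2 butterflies without twiddle factors).
 * `n` must be a power of two. This is NOT an FFT. */
void wht_inplace(float *x, size_t n) {
    for (size_t h = 1; h < n; h *= 2) {
        for (size_t i = 0; i < n; i += 2 * h) {
            for (size_t j = i; j < i + h; ++j) {
                float a = x[j];
                float b = x[j + h];
                x[j] = a + b;
                x[j + h] = a - b;
            }
        }
    }
}

/* Applies `fn` to each of `count` signals of `len` floats stored back to
 * back in `data`. Signals are independent, so iterations can run in
 * parallel. Without OpenMP enabled the pragma is ignored and the loop
 * simply runs serially with the same result. */
void transform_batch(float *data, size_t count, size_t len,
                     transform_fn fn, int nthreads) {
    long i;
    assert(nthreads > 0);
    #pragma omp parallel for num_threads(nthreads) schedule(static)
    for (i = 0; i < (long)count; ++i) {
        fn(data + (size_t)i * len, len);
    }
}
```

With GCC, building with `-fopenmp` enables the parallel loop. The thread count is then chosen by the application rather than by the library, which is the point of moving threading outside the primitives.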

Best Regards,
Ying
0 Kudos
SergeyKostrov
Valued Contributor II
784 Views
...we plan to remove the internal threading of IPP functions so that users can apply whatever multi-threading suits their requirements...

This is an absolutely wise decision, and I hope that more time will be spent on making IPP more reliable and better. Only Primitives will need to be
tested, instead of Primitives and Multi-threading.

Best regards,
Sergey
0 Kudos