I am using IPP 7.0 to write code that performs a series of 2D image filtering operations on images of various sizes (mostly 128 by 128). I have two versions of the code: one that leaves the threading up to IPP (internal to the DFT call, for example), and another in which I handle the threading myself at the application level. I have found that IPP's internal threading does not provide any significant speedup for small DFT operations (like 128 by 128), so I really want to use application-level threading. When I do this, however, the final images are not correct. If I run the two threads serially, things work fine, but when I run them in parallel the result is wrong. I suspect that I need to provide more information than I have here, but I'm not sure what type of information would be most useful. The IPP functions I am using are as follows:
I tried simplifying the code by doing only the DFT, zeroing out the DC component, and then doing the InvDFT. I still got the same result: two threads running in parallel give the wrong answer, while two threads running serially give the correct answer. In this simple case, the only IPP functions I use are as follows:
Please let me know what other information I can provide that might help in diagnosing this.
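To make the simplified test concrete, here is a minimal sketch of the serial-versus-parallel comparison described above. It is not my actual code and does not call IPP: a naive 1D DFT stands in for the IPP forward and inverse calls, and every name in it (`naive_dft`, `remove_dc`, `make_input`, `serial_matches_parallel`) is hypothetical. The property it illustrates is that each thread works only on its own input, output, and scratch data, in which case the parallel result should match the serial one exactly.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <thread>
#include <vector>

using Signal = std::vector<std::complex<double>>;

// Naive O(n^2) DFT, a stand-in for the real library calls (assumption).
// sign = -1 gives the forward transform, sign = +1 the inverse (scaled by 1/n).
Signal naive_dft(const Signal& in, int sign) {
    const std::size_t n = in.size();
    assert(n > 0);
    const double pi = std::acos(-1.0);
    Signal out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += in[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[k] = (sign > 0) ? sum / static_cast<double>(n) : sum;
    }
    return out;
}

// Forward DFT, zero the DC term, inverse DFT.
// Every buffer here is local to the call, so nothing is shared between threads.
Signal remove_dc(const Signal& in) {
    Signal spectrum = naive_dft(in, -1);
    spectrum[0] = 0.0;
    return naive_dft(spectrum, +1);
}

// Hypothetical test input: a sine wave riding on a constant offset.
Signal make_input(std::size_t n, double offset) {
    Signal s(n);
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = offset + std::sin(0.1 * static_cast<double>(i));
    }
    return s;
}

// Runs the same two jobs serially and then on two threads, and compares.
bool serial_matches_parallel() {
    const Signal a = make_input(128, 3.0);
    const Signal b = make_input(128, -7.0);

    // Serial reference results.
    const Signal ref_a = remove_dc(a);
    const Signal ref_b = remove_dc(b);

    // Parallel run: each thread reads its own input and writes its own output.
    Signal par_a, par_b;
    std::thread ta([&] { par_a = remove_dc(a); });
    std::thread tb([&] { par_b = remove_dc(b); });
    ta.join();
    tb.join();

    return par_a == ref_a && par_b == ref_b;
}
```

Calling `serial_matches_parallel()` should return `true` for this sketch. If the same comparison fails in the real code, that would suggest some piece of state (a scratch buffer, for instance) is being shared between the two threads.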
Thanks for your help!