I use TBB to do filtering with the IPP function "ippsFIR_32fc"; each thread works on a portion of the data. But the results are quite strange: I can see a lot of glitches (very large values) in the output data.
The code is as follows:
parallel_for(tbb::blocked_range
void operator() (const blocked_range
{
Int begin = r.begin();
Int end = r.end();
Int nIters = end - begin;
ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);
}
If I replace the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multi-threaded copy works fine.
Another question: for the floating-point FIR functions, I did not see this variant: complex input data with real filter coefficients. I do see complex input data with complex filter coefficients, and real input data with real filter coefficients.
Note: I have already used the function 'ippSetNumThreads(1)' to set the number of IPP internal OpenMP threads to 1.
Could you please help me?
Also, if you're using IPP with TBB, I'd recommend linking with the unthreaded static libs instead of the DLLs. I don't think there's a way to completely disable OpenMP when linking with the DLLs, and this is why you need to call ippSetNumThreads(1).
Peter
I don't see anything wrong with the IPP call. It looks correct, and you are using a version of the API that can be multi-threaded (any of the ippsFIR APIs with SrcDst parameters cannot be multi-threaded). But what is that tile size in the parallel_for? inv.Length/1.5? Try taking the default tile size and report back with the results.
...
Could you please help me?
Hi, I could look at the problem, and here are two questions:
Could you post a small test-case?
What is your TBB version?
Best regards,
Sergey
Hello,
You can read about the "state structures" in IPP here, under "state structures that are modified during operation":
http://software.intel.com/sites/products/documentation/hpc/ipp/ippi/ippi_ch2/ch2_function_context_structures.html
So each thread should have its own state structure. From the code you posted here, it looks like "m_stateP" is shared by multiple tasks, which may produce incorrect results.
Thanks,
chao
Hi Chao, It means that an array of filter states has to be used instead, like:
...
IppsFIRState_32f *pState[
...
Since for every TBB thread you can get a thread ID, or a similar unique ID, it is possible to map the pState variables to the proper processing thread.
Best regards,
Sergey
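
To make the per-thread state idea concrete, here is a minimal sketch in plain C++. It does not use IPP or TBB: FirState, makeState, firFilter and filterChunked are hypothetical stand-ins written only for illustration, and the chunks are processed in an ordinary loop. The assumption is that an FIR state is essentially the taps plus a delay line of the most recent inputs. Each chunk gets its own state, and that state's delay line is seeded with the input samples just before the chunk, so the chunked output should match a single sequential pass. The samples are complex and the taps are real, just to match the data type in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Sample = std::complex<float>;

// Hypothetical stand-in for a filter state: real taps plus a delay line.
// delay[0] is the most recent input, delay[k] the one k samples earlier.
struct FirState {
    std::vector<float> taps;
    std::vector<Sample> delay;
};

// Build a private state for a chunk starting at `start`. The delay line is
// seeded with the samples preceding the chunk (zeros before the signal start).
FirState makeState(const std::vector<float>& taps,
                   const std::vector<Sample>& in, std::size_t start) {
    assert(!taps.empty());
    FirState s{taps, std::vector<Sample>(taps.size() - 1, Sample(0.0f, 0.0f))};
    for (std::size_t k = 0; k < s.delay.size() && k < start; ++k)
        s.delay[k] = in[start - 1 - k];
    return s;
}

// Filter n samples. The state is modified on every call, which is why it
// must not be shared between tasks that run concurrently.
void firFilter(const Sample* src, Sample* dst, std::size_t n, FirState& s) {
    for (std::size_t i = 0; i < n; ++i) {
        Sample acc = s.taps[0] * src[i];
        for (std::size_t k = 1; k < s.taps.size(); ++k)
            acc += s.taps[k] * s.delay[k - 1];
        // Shift the delay line and store the newest input at the front.
        for (std::size_t k = s.delay.size(); k > 1; --k)
            s.delay[k - 1] = s.delay[k - 2];
        if (!s.delay.empty()) s.delay[0] = src[i];
        dst[i] = acc;
    }
}

// Process the input in independent chunks. Each iteration touches only its
// own state and its own slice of the output, so the loop body is the kind of
// work that could be handed to one task of a parallel loop.
std::vector<Sample> filterChunked(const std::vector<float>& taps,
                                  const std::vector<Sample>& in,
                                  std::size_t chunk) {
    assert(chunk > 0);
    std::vector<Sample> out(in.size());
    for (std::size_t begin = 0; begin < in.size(); begin += chunk) {
        const std::size_t n = std::min(chunk, in.size() - begin);
        FirState local = makeState(taps, in, begin);  // one state per chunk
        firFilter(in.data() + begin, out.data() + begin, n, local);
    }
    return out;
}
```

Calling filterChunked(taps, in, in.size()) gives the sequential reference, and a smaller chunk size gives the split version for comparison. With a real library the same two points would apply, assuming its init routine lets you supply the initial delay line: one state per task, and each state primed with the samples before its block. If the state is private but starts from a zero delay line, the first tapsLen - 1 outputs of every block would still differ from the sequential result.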
