void operator() (const blocked_range<Int>& r) const
{
    Int begin = r.begin();
    Int end = r.end();
    Int nIters = end - begin;
    ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);
}
If I replace the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multi-threaded copy works fine.
Another question: for floating-point functions, I did not see this type of FIR: complex input data with real filter coefficients. I do see complex input data with complex filter coefficients, or real input data with real filter coefficients.
Note: I have already used the function 'ippSetNumThreads(1)' to set the number of IPP internal OpenMP threads to 1.
Could you please help me?
I don't see anything wrong with the IPP call. It looks correct, and you are using a version of the API that can be multi-threaded (any of the ippsFIR APIs with SrcDst parameters cannot be multi-threaded). But what is that tile size in the parallel for? inv.Length/1.5? Try taking the default tile size and report back with the results.
The IPP documentation describes the "state structures" as "state structures that are modified during operation".
So each thread should have its own state structure. From the code you posted here, it looks like "m_stateP" is shared by multiple tasks, which may produce incorrect results.
Hi Chao, it means that an array of filter states has to be used instead, like:
Since for every TBB thread you can get a ThreadID, or a similar unique ID, it is possible to map the pState variables to a proper processing thread.
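To illustrate the "one state per task" idea discussed above, here is a minimal sketch in plain C++ with std::thread. It does not call IPP or TBB: FirState, firFilter and parallelFir are hypothetical stand-ins, with FirState playing the role of the IPP state structure (taps plus a delay line that is modified on every call). The sketch also assumes that each chunk's delay line is preloaded with the input samples that precede the chunk, so that the chunked output matches a single serial pass; whether and how you do that with the real IPP init function should be checked against the IPP documentation.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using Sample = std::complex<float>;

// Hypothetical stand-in for an IPP FIR state structure. The delay line is
// written on every filter call, so one instance must never be shared
// between concurrently running tasks.
struct FirState {
    std::vector<Sample> taps;
    std::vector<Sample> delay;  // delay[0] is the most recent past input

    // chunkStart points at the first sample of this chunk; `available` is
    // the number of valid input samples that precede it.
    FirState(const std::vector<Sample>& t, const Sample* chunkStart,
             std::size_t available)
        : taps(t), delay(t.size() > 1 ? t.size() - 1 : 0, Sample(0.0f, 0.0f)) {
        for (std::size_t k = 0; k < delay.size() && k < available; ++k)
            delay[k] = *(chunkStart - 1 - k);
    }
};

// Direct-form FIR over n samples, updating the delay line in `st`.
void firFilter(const Sample* src, Sample* dst, std::size_t n, FirState& st) {
    for (std::size_t i = 0; i < n; ++i) {
        Sample acc(0.0f, 0.0f);
        for (std::size_t k = 0; k < st.taps.size(); ++k)
            acc += st.taps[k] * (k == 0 ? src[i] : st.delay[k - 1]);
        if (!st.delay.empty()) {
            // Shift the history by one and store the newest input.
            for (std::size_t k = st.delay.size() - 1; k > 0; --k)
                st.delay[k] = st.delay[k - 1];
            st.delay[0] = src[i];
        }
        dst[i] = acc;
    }
}

// Splits the input into chunks and filters each chunk on its own thread,
// giving every chunk a private FirState.
std::vector<Sample> parallelFir(const std::vector<Sample>& in,
                                const std::vector<Sample>& taps,
                                std::size_t numChunks) {
    std::vector<Sample> out(in.size());
    if (numChunks == 0) numChunks = 1;
    const std::size_t chunk = (in.size() + numChunks - 1) / numChunks;

    // Build all states before starting threads so the vector never
    // reallocates while a worker holds a reference into it.
    std::vector<FirState> states;
    for (std::size_t begin = 0; begin < in.size(); begin += chunk)
        states.emplace_back(taps, in.data() + begin, begin);

    std::vector<std::thread> workers;
    for (std::size_t c = 0; c < states.size(); ++c) {
        const std::size_t begin = c * chunk;
        const std::size_t n = std::min(chunk, in.size() - begin);
        workers.emplace_back(firFilter, in.data() + begin, out.data() + begin,
                             n, std::ref(states[c]));
    }
    for (auto& w : workers) w.join();
    return out;
}
```

In the TBB version, the same idea would mean indexing an array of states by chunk (or by a thread-unique ID) inside operator() instead of using the single shared m_stateP.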