I know that at some point an FFT-based convolution would be faster, but I don't need to go so much above 64 & I don't wanna enter the complexity of an FFT-based one.
Using IPP 6.x btw.
Are you running the test on a multi-core machine? Could you tell us the CPU usage and rough performance data in the two cases?
It is true that ippsConv_32f changes the algorithm from direct convolution to FFT-based convolution at some point. For example:
if(( lenDst < 512)||( MIN(Src1Len,Src2Len) <64 ))
because the FFT-based convolution is faster when the data length is large enough. But as you can see, the "critical point" is a value found by empirical testing. Around the critical point, the performance advantage may fluctuate.
Do you always compute the convolution with a large source buffer and a second one fixed at 64, and so want to use the direct algorithm?
Just for your reference, direct convolution is also supported by MKL (IPP's sister library).
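To make the switch-over concrete, here is a minimal sketch in plain C of the length test quoted above together with a reference direct convolution. This is not IPP's actual code: the helper names use_direct_conv and direct_conv are made up for illustration, and I am assuming lenDst = Src1Len + Src2Len - 1, which is the usual output length of a full linear convolution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the length test quoted in the post.
   The constants 512 and 64 are the empirical values mentioned there;
   dst_len = src1_len + src2_len - 1 is an assumption (full convolution). */
static int use_direct_conv(size_t src1_len, size_t src2_len)
{
    size_t dst_len = src1_len + src2_len - 1;
    size_t min_len = src1_len < src2_len ? src1_len : src2_len;
    return dst_len < 512 || min_len < 64;
}

/* Reference direct (time-domain) linear convolution, y = x (*) h.
   y must have room for x_len + h_len - 1 floats. Cost is O(x_len * h_len),
   which is why an FFT-based method wins once both lengths are large. */
static void direct_conv(const float *x, size_t x_len,
                        const float *h, size_t h_len,
                        float *y)
{
    assert(x_len > 0 && h_len > 0);
    size_t y_len = x_len + h_len - 1;

    /* Clear the output, then accumulate every x[i] * h[k] product. */
    for (size_t n = 0; n < y_len; ++n)
        y[n] = 0.0f;
    for (size_t i = 0; i < x_len; ++i)
        for (size_t k = 0; k < h_len; ++k)
            y[i + k] += x[i] * h[k];
}
```

With this test, a kernel of 63 taps stays on the direct path for any source length, while a kernel of 64 taps switches to the other branch as soon as the output reaches 512 samples, which would match a jump appearing right at 64.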
For example, y=x(*)h, MKL call is like
status = vslsConvExec1D(task, h,inch, x,incx, y,incy);
For more information, please see <<>>>
However I can also measure the huge jump in CPU usage even when there's 1 single voice processed (meaning, no multithreading), by a factor of almost 5 (& it's pretty stable, not random spikes at all).
But: my bad, I had only tested within the compiler (Delphi) so far. Testing outside debugger, it's actually pretty good!
I know that Delphi's debugger does something lengthy when threads are created (already noticed this in the past), so I'm assuming that threads are being created when the jump occurs.
So, false alarm, sry.