3 Replies

But given that you're already a proficient SSE2 programmer, I suspect you've already figured out how to rearrange your data using the most efficient means.

I've tried to rewrite at least parts of my code to avoid having to convert the data, but it turns out that having the left and right channels interleaved also benefits performance. So I gain some performance by not having to convert the data, but in some cases I lose more to the less optimal data layout than I gain from skipping the conversion.

(How is that possible? In some cases I have to combine the output of many different FFTs, and when the left and right channels are stored separately I have to read and write at twice as many different locations. So basically the number of reads and writes doubles, and the number of cache misses probably increases as well.)

Fortunately, I did find something useful: if I take an FFT of interleaved data and want to change the amplitudes at certain frequencies in the same way on both channels (a bandpass, for example), I only have to mirror the behavior for the top end, so at 2^N-x I do the same thing that I do at x. This helped me increase the performance of certain very simple filters.
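
To make the mirroring concrete, here is a minimal sketch of how I read the trick. It is not the actual SSE2 code. It assumes that "interleaved" means each left/right sample pair is treated as one complex number (left in the real part, right in the imaginary part), and it uses a naive DFT in plain Python so it stays self-contained. The names `dft` and `mirrored_bandpass` are made up for this illustration, and `n` stands for the FFT length written as 2^N above.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT, for illustration only; real code would use an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def mirrored_bandpass(left, right, lo, hi):
    """Keep bins lo..hi and their mirrors n-hi..n-lo on both channels at once."""
    n = len(left)
    # Assumption: left goes into the real part, right into the imaginary part.
    z = [complex(l, r) for l, r in zip(left, right)]
    spectrum = dft(z)
    for k in range(n):
        mirrored = (n - k) % n  # bin n-k is the mirror image of bin k
        keep = lo <= k <= hi or lo <= mirrored <= hi
        if not keep:
            spectrum[k] = 0j
    y = dft(spectrum, inverse=True)
    # The real part is the filtered left channel, the imaginary part the right.
    return [v.real for v in y], [v.imag for v in y]


# Example: each channel holds a tone in bin 2 plus one unwanted higher tone.
n = 16
left = [math.cos(2 * math.pi * 2 * t / n) + math.cos(2 * math.pi * 5 * t / n)
        for t in range(n)]
right = [math.sin(2 * math.pi * 2 * t / n) + math.sin(2 * math.pi * 6 * t / n)
         for t in range(n)]
out_left, out_right = mirrored_bandpass(left, right, lo=1, hi=3)
```

After the call, `out_left` and `out_right` should contain only the bin-2 tones. As far as I can tell, the channels do not leak into each other because a real gain that is identical at x and at n-x corresponds to a real impulse response, so the real and imaginary parts are filtered independently.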

Now what I was actually looking for - but I guess I should ask at a maths forum or something - is a more generic solution for other types of filtering as well.

