I would not have thought this possible! Are there tools or compiler features that do this SSE code conversion automatically? I'll be amazed if there are.
As for tools that auto-vectorize existing compiled code, that's still a research topic.
The VC++ 2008 compiler has an option to make code use SSE automatically, yet it gives me at most a 10% speedup. Is that an acceptable speedup for automated vectorization? It seems a bit underwhelming to me. The manual method using intrinsics gives me more than a 3x speedup for those same sections of code.
Has anyone else had better experience with automated vectorization using the Intel or gcc compilers? Could anyone post some numbers on the performance improvement from auto-vectorization, for example on an array of single-precision floats?
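For anyone who wants to post numbers, here is a minimal sketch of the kind of comparison being asked about, assuming a simple element-wise add over arrays of single-precision floats. The function names add_scalar and add_sse are made up for illustration, and this is not the code measured in the posts above. The plain loop is what an auto-vectorizer would have to work with; the second version does the same job by hand with SSE intrinsics from xmmintrin.h.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, ... */

/* Plain loop: a candidate for the compiler's auto-vectorizer.
   (Hypothetical helper, named for this example only.) */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Hand-written SSE: four floats per iteration.
   Unaligned loads/stores are used so no alignment is assumed. */
void add_sse(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)  /* scalar tail for the leftover 0..3 elements */
        out[i] = a[i] + b[i];
}
```

Timing both functions on the same arrays, with the same compiler and flags, gives a like-for-like figure. Expect the result to depend heavily on the compiler version, optimization flags, data alignment, and whether the loop is limited by memory bandwidth rather than arithmetic.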
Somewhat surprisingly to me, the effectiveness of auto-vectorization versus intrinsics is still a very hot topic.