Anyone have a suggestion for what SIMD instructions are best for this on an AVX chip?
I've got two 4-element vectors, A and B, of packed doubles.
For corresponding elements of A and B, I want to know whether or not A is numerically less than B.
But then, I want to know if any of those four comparisons had an answer of "true". So essentially I want to do a logical "or" across the register containing the results of those four comparisons. Is there an efficient way to do this?
I'm new to SIMD programming, but what you wrote looks promising. So is the basic idea as follows?
The "vcmppd" operation will set every bit of each element in the destination register to 1 (comparison true) or 0 (comparison false), and that includes the sign bit. Then "vmovmskpd" gathers the sign bits from all the packed elements (two for a 128-bit register, four for a 256-bit one), which is how we get the results of all four comparisons into a single scalar register. Then we just test that register for all zeroes?
Also, do you happen to know if I can use Intel C++ intrinsics to pull off what you've written? I'm already deviating from simple C++ by using intrinsics. If possible I'd like to avoid another non-C++ construct: assembly.
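For what it's worth, here is a minimal sketch of how I imagine this would look with intrinsics. It assumes _mm256_cmp_pd and _mm256_movemask_pd are the intrinsic counterparts of "vcmppd" and "vmovmskpd". The helper name any_less_than and the AVX_TARGET macro are my own, made up for illustration, and I picked the _CMP_LT_OQ predicate (ordered, non-signaling less-than) as a guess at what fits best. Please correct me if this is off.

```cpp
#include <immintrin.h>

// Hypothetical convenience macro: lets GCC/Clang compile this one function
// for AVX even without -mavx on the command line. Other compilers need
// their own AVX switch (assumption: e.g. /arch:AVX on MSVC).
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: returns true if a[i] < b[i] for at least one i in 0..3.
AVX_TARGET inline bool any_less_than(const double* a, const double* b) {
    __m256d va = _mm256_loadu_pd(a);  // unaligned load of 4 doubles
    __m256d vb = _mm256_loadu_pd(b);

    // Each 64-bit element becomes all ones where a[i] < b[i], else all zeros.
    // With the ordered predicate, a NaN in either operand compares false.
    __m256d lt = _mm256_cmp_pd(va, vb, _CMP_LT_OQ);

    // Collect the sign bit of each element into the low 4 bits of an int.
    int mask = _mm256_movemask_pd(lt);

    // Nonzero mask: at least one comparison was true.
    return mask != 0;
}
```

If I ever need "all four comparisons true" instead, I assume comparing mask against 0xF would do it.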
Thanks again for your help.