I'm looking for a (reasonably efficient) way to move data between arrays of SIMD types and arrays of standard data types (int, long, char, etc.). Before I really sat down and thought about what was going on, I tried using "memcpy":
#include <immintrin.h>
#include <stdio.h>
#include <string.h>

class T {
public:
    T() = default;
    // Print the eight 32-bit lanes of vec in hex.
    void dump() {
        unsigned int *ptr = (unsigned int *)&vec;
        for (int i = 0; i < 8; i++) printf("%08X ", *(ptr + i));
        printf("\n");
    }
protected:
    __m256i vec;
};

int main(int argc, char *argv[]) {
    unsigned int x[24];
    for (int i = 0; i < 24; i++) x[i] = i;
    T y[3];
    memcpy(y, x, 24 * 4);  // 24 words of 4 bytes into 3 vectors of 32 bytes
    for (int i = 0; i < 24; i++) printf("%08X%c", x[i], ((i % 8) == 7) ? '\n' : ' ');
    for (int i = 0; i < 3; i++) y[i].dump();
}
(Sorry for the simplistic example: I had to retype it so hopefully it's right).
I've done this with more complicated templates and different types (__m512i, long, etc.), and it seems to work, but I can't find anything C++/standards-wise that says it always will. The closest I've come is the use of memcpy on trivially copyable types, but that only seems to apply when the types are the same.
I'd appreciate some education on why this should/shouldn't work, and if not, if there's a variation that's better.
I understand that the Intel SDLT might be an alternative, but in this particular case, I'm not allowed to pursue that.
When the class is POD (plain old data), you should be safe.
However, you must also ensure that the compiler does not insert padding between the elements of an array of class T. Your example above does not have this problem.
You can insert an assert to ensure that the packing and size are correct:
assert((char *)&y[1] - (char *)&y[0] == sizeof(__m256i)); // whatever your assumption is
Enable the assert in debug builds or in a diagnostic release build.
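The same assumptions can also be checked at compile time. The following is only a sketch: `Packed256`, `lane` and `load_words` are hypothetical names that do not appear in the original post, and it assumes an x86 compiler that provides `<immintrin.h>`. It checks that the wrapper is trivially copyable, standard layout and exactly the size of `__m256i`, and it reads lanes back with memcpy rather than through a cast pointer.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper mirroring class T from the question.
class Packed256 {
public:
    Packed256() = default;

    // Return 32-bit lane i (0..7) by copying its bytes out of the vector,
    // instead of casting the vector's address to unsigned int*.
    std::uint32_t lane(std::size_t i) const {
        assert(i < sizeof(vec) / sizeof(std::uint32_t));
        std::uint32_t out;
        std::memcpy(&out,
                    reinterpret_cast<const unsigned char *>(&vec) + i * sizeof(out),
                    sizeof(out));
        return out;
    }

protected:
    __m256i vec;
};

// Compile-time versions of the layout assumptions discussed above.
static_assert(std::is_trivially_copyable<Packed256>::value,
              "memcpy into Packed256 requires a trivially copyable type");
static_assert(std::is_standard_layout<Packed256>::value,
              "Packed256 must not have a vtable or mixed-access members");
static_assert(sizeof(Packed256) == sizeof(__m256i),
              "array elements must be exactly one vector wide");

// Copy an array of 32-bit words into an array of wrappers.
// The byte counts of the two arrays must match or compilation fails.
template <std::size_t N, std::size_t M>
void load_words(Packed256 (&dst)[N], const std::uint32_t (&src)[M]) {
    static_assert(N * sizeof(Packed256) == M * sizeof(std::uint32_t),
                  "source and destination must have the same byte size");
    std::memcpy(dst, src, sizeof(src));
}
```

With `std::uint32_t x[24]` filled with 0..23 and `Packed256 y[3]`, calling `load_words(y, x)` should leave `y[1].lane(0)` equal to 8, because memcpy preserves the byte order of the source array.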
Jim Dempsey
Craig,
As a follow-up, add the assert as indicated in #2...
... then add a virtual member function to T.
The assert should trigger an error.
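A rough sketch of that experiment, using hypothetical types `PlainVec` and `VirtualVec` and a hypothetical helper `raw_copy_ok` (none of these are from the posts above), and again assuming an x86 compiler with `<immintrin.h>`. On common ABIs the virtual function adds a hidden vtable pointer, so the class grows beyond `sizeof(__m256i)`; independent of the ABI, a class with a virtual function is no longer trivially copyable.

```cpp
#include <immintrin.h>
#include <type_traits>

// Plain wrapper: a single vector member and nothing else.
struct PlainVec {
    __m256i vec;
};

// Same wrapper with a virtual member function added.
struct VirtualVec {
    virtual void dump() const {}
    __m256i vec;
};

// True only when T can be treated as a raw block of vector-sized bytes:
// trivially copyable, standard layout, and exactly one __m256i in size.
template <typename T>
constexpr bool raw_copy_ok() {
    return std::is_trivially_copyable<T>::value &&
           std::is_standard_layout<T>::value &&
           sizeof(T) == sizeof(__m256i);
}

static_assert(raw_copy_ok<PlainVec>(), "plain wrapper is safe to memcpy");
static_assert(!raw_copy_ok<VirtualVec>(), "virtual wrapper is not");
```

Guarding the memcpy path with a check like `raw_copy_ok<T>()` turns the runtime assert into a build error the moment someone adds a virtual function to the class.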
Jim Dempsey