Some AVX512F intrinsic functions take __mmask8 masks, but 8-bit mask instructions such as KMOVB and KANDB require AVX512DQ. How can I make sure the compiler uses 16-bit mask instructions (KMOVW, KANDW) rather than 8-bit mask instructions when I have AVX512F but not AVX512DQ? Is it safe to write __mmask8, or is this the way to do it:
__mmask16 a, b, c;
__m512i x, y, z;
...
a = b & c; // use KANDW, not KANDB
z = _mm512_mask_mov_epi64(x, (__mmask8)a, y);
I could write _mm512_kand or _kand_mask16 to make this explicit, but the KMOV is implicit anyway, and I want to prevent the compiler from using KMOVB if I write __mmask8.
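To make the pattern concrete, here is a minimal sketch of what I have in mind: do all mask arithmetic at 16-bit width and narrow to 8 bits only at the point where an intrinsic asks for __mmask8. The names mask16_t, mask8_t, and_low8 and blend_epi64 are hypothetical helpers of mine, not part of any header. The fallback typedefs assume that the mask types are plain unsigned integers, which is how the compiler headers I have looked at define them. Nothing here proves which mask instruction a given compiler emits; that would have to be checked in the generated assembly.

```c
#include <stdint.h>

#if defined(__AVX512F__)
#include <immintrin.h>
typedef __mmask16 mask16_t;   /* the real intrinsic mask types */
typedef __mmask8  mask8_t;
#else
/* Assumption: the mask types behave like plain unsigned integers,
   so the mask arithmetic can be compiled without AVX512F. */
typedef uint16_t mask16_t;
typedef uint8_t  mask8_t;
#endif

/* Hypothetical helper: AND two masks at 16-bit width, then keep
   only the low 8 bits for an intrinsic that takes __mmask8. */
static inline mask8_t and_low8(mask16_t b, mask16_t c)
{
    mask16_t a = (mask16_t)(b & c);   /* 16-bit mask operation */
    return (mask8_t)(a & 0xFF);       /* narrow only at the end */
}

#if defined(__AVX512F__)
/* Hypothetical helper: per 64-bit lane, take y where the combined
   mask bit is set and x elsewhere. Only the cast result is 8-bit. */
static inline __m512i blend_epi64(__m512i x, mask16_t b, mask16_t c, __m512i y)
{
    return _mm512_mask_mov_epi64(x, and_low8(b, c), y);
}
#endif
```

Compiling this with AVX512F enabled and AVX512DQ disabled, then looking at the disassembly for KMOVB or KANDB, would show whether a particular compiler honors the intent.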
Note that I am writing generic code and I want it to work with all compilers: Intel, Microsoft, GNU and Clang. But since Intel has defined the instruction set and the intrinsics, I suppose that this is the authoritative place to ask.