Intel® oneAPI DPC++/C++ Compiler

AVX512 intrinsics get miscompiled.

Mysticial

Compiler: Intel(R) oneAPI DPC++/C++ Compiler 2025.0.0 (2025.0.0.20241008)

OS: Windows (via Visual Studio 17.11.5)

 

The following code is incorrectly compiled:

 

#include <stdint.h>
#include <immintrin.h>
#include <iostream>
using namespace std;

#define NO_INLINE __declspec(noinline)


NO_INLINE void print(__m512i x){
    const uint64_t* ptr = (const uint64_t*)&x;
    for (size_t c = 0; c < 8; c++){
        cout << ptr[c] << endl;
    }
}

NO_INLINE void test(void* R){
    const char* ptr = (const char*)R;

    __m512i r0, r1, r2;

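    // r0 = { R[4..7], R[28..31] } (byte offsets 32 and 224)
    r0 = _mm512_castsi256_si512(_mm256_loadu_si256((const __m256i*)(ptr +  32)));
    r0 = _mm512_inserti64x4(r0, _mm256_loadu_si256((const __m256i*)(ptr + 224)), 1);

    // r1 = { R[16..19], R[40..43] } (byte offsets 128 and 320)
    r1 = _mm512_castsi256_si512(_mm256_loadu_si256((const __m256i*)(ptr + 128)));
    r1 = _mm512_inserti64x4(r1, _mm256_loadu_si256((const __m256i*)(ptr + 320)), 1);

    // Mask 0x33: lanes 0,1,4,5 take r0 permuted by imm8 = 78 (swaps the two
    // 128-bit pairs inside each 256-bit half); lanes 2,3,6,7 keep r1.
    r2 = _mm512_mask_permutex_epi64(r1, 0x33, r0, 78);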

    print(r2);
}

int main(){
    uint64_t R[48];
    for (size_t c = 0; c < 48; c++){
        R[c] = c;
    }
    test(R);
}

 

Compiler Flags:

 

/GS /W3 /Gy /Zc:wchar_t /Zi /O2 /D "NDEBUG" /D "_CONSOLE" /D "__INTEL_LLVM_COMPILER=20250000" /D "_UNICODE" /D "UNICODE" /Qipo /Zc:forScope /Oi /MD /Fa"x64\ICX2025\" /EHsc /nologo /Fo"x64\ICX2025\" /Fp"x64\ICX2025\ICX2025 Miscompile.pch" /Qxcannonlake 

 

 Expected Output: (MSVC, ICC, and ICX with optimizations disabled all produce this output)

 

6
7
18
19
30
31
42
43
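
As a cross-check on the expected output, here is a minimal scalar sketch of what `_mm512_mask_permutex_epi64` computes, following the documented semantics of the intrinsic. The names `Vec8`, `mask_permutex_epi64_model`, `equal`, and `repro_result` are hypothetical helpers invented for this illustration; they are not part of any Intel header. The lane values are the ones the reproducer loads when `R[c] = c`.

```cpp
#include <cstdint>

// Hypothetical stand-in for __m512i: eight 64-bit lanes, lane 0 first.
struct Vec8 { std::uint64_t v[8]; };

// Scalar model of _mm512_mask_permutex_epi64(src, k, a, imm8):
// each 256-bit half of `a` is permuted independently using the 2-bit
// fields of imm8; lanes whose mask bit is clear are copied from `src`.
constexpr Vec8 mask_permutex_epi64_model(Vec8 src, unsigned k, Vec8 a, unsigned imm8) {
    Vec8 dst{};
    for (int i = 0; i < 8; ++i) {
        const int half = i & 4;                         // 0 for lanes 0..3, 4 for lanes 4..7
        const int sel  = (imm8 >> (2 * (i & 3))) & 3;   // source lane within that half
        dst.v[i] = ((k >> i) & 1u) ? a.v[half + sel] : src.v[i];
    }
    return dst;
}

// Lane-wise comparison helper.
constexpr bool equal(Vec8 x, Vec8 y) {
    for (int i = 0; i < 8; ++i) {
        if (x.v[i] != y.v[i]) return false;
    }
    return true;
}

// The reproducer's call, with the values r0 and r1 hold when R[c] = c.
constexpr Vec8 repro_result() {
    return mask_permutex_epi64_model(
        Vec8{{16, 17, 18, 19, 40, 41, 42, 43}},  // r1 (src)
        0x33,
        Vec8{{4, 5, 6, 7, 28, 29, 30, 31}},      // r0 (a)
        78);
}

// Checked at compile time: the model yields 6 7 18 19 30 31 42 43.
static_assert(equal(repro_result(), Vec8{{6, 7, 18, 19, 30, 31, 42, 43}}),
              "scalar model matches the expected output");
```

The model uses no intrinsics, so it compiles on any C++14 compiler and can serve as a reference when comparing optimized and unoptimized builds.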

 

Actual Output: (ICX2025 with the above flags)

 

18
19
6
7
42
43
30
31

 

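Looking at the assembly, I suspect the compiler is trying to optimize away the mask in the `_mm512_mask_permutex_epi64` intrinsic by folding the masked permute into a single `vpermi2q` instruction, but it gets the fold wrong and swaps the two source operands. That would be consistent with the output: the pairs that should come from r0 (6, 7 and 30, 31) and the pairs that should come from r1 (18, 19 and 42, 43) end up in each other's positions.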
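
The suspicion can be checked on paper with a scalar sketch of the two-table permute (`_mm512_permutex2var_epi64`, which maps to `vpermi2q`/`vpermt2q`). Everything here is an illustrative assumption: `Vec8`, `permutex2var_epi64_model`, `fold_index`, and `equal` are hypothetical helpers, and the index vector is simply one that a correct fold could use, not something read out of the compiler. With the tables in the right order the model gives the expected output; with the two tables swapped it gives exactly the output ICX produced.

```cpp
#include <cstdint>

// Hypothetical stand-in for __m512i: eight 64-bit lanes, lane 0 first.
struct Vec8 { std::uint64_t v[8]; };

// Scalar model of _mm512_permutex2var_epi64(a, idx, b):
// bit 3 of each index picks the table (0 = a, 1 = b), bits 2:0 pick the lane.
constexpr Vec8 permutex2var_epi64_model(Vec8 a, Vec8 idx, Vec8 b) {
    Vec8 dst{};
    for (int i = 0; i < 8; ++i) {
        const std::uint64_t lane = idx.v[i] & 7;
        dst.v[i] = (idx.v[i] & 8) ? b.v[lane] : a.v[lane];
    }
    return dst;
}

// Index vector that folds mask_permutex(src, k, a, imm8) into
// permutex2var(a, idx, src): masked-in lanes read the permuted lane of `a`
// (table 0), masked-out lanes read their own lane of `src` (table 1).
constexpr Vec8 fold_index(unsigned k, unsigned imm8) {
    Vec8 idx{};
    for (int i = 0; i < 8; ++i) {
        const unsigned sel = (imm8 >> (2 * (i & 3))) & 3;
        idx.v[i] = ((k >> i) & 1u) ? (i & 4) + sel : 8u + i;
    }
    return idx;
}

// Lane-wise comparison helper.
constexpr bool equal(Vec8 x, Vec8 y) {
    for (int i = 0; i < 8; ++i) {
        if (x.v[i] != y.v[i]) return false;
    }
    return true;
}

// Values from the reproducer when R[c] = c.
constexpr Vec8 r0{{4, 5, 6, 7, 28, 29, 30, 31}};
constexpr Vec8 r1{{16, 17, 18, 19, 40, 41, 42, 43}};
constexpr Vec8 idx = fold_index(0x33, 78);   // {2, 3, 10, 11, 6, 7, 14, 15}

// Correct operand order reproduces the expected output.
static_assert(equal(permutex2var_epi64_model(r0, idx, r1),
                    Vec8{{6, 7, 18, 19, 30, 31, 42, 43}}),
              "correct fold");

// Swapping the two tables reproduces the miscompiled output.
static_assert(equal(permutex2var_epi64_model(r1, idx, r0),
                    Vec8{{18, 19, 6, 7, 42, 43, 30, 31}}),
              "operands swapped");
```

This does not prove what the optimizer actually does; it only shows that an operand swap in such a fold is sufficient to explain the observed lane order.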

Sravani_K_Intel
Moderator

Thanks for reporting this bug with the latest 2025.0 compiler. It will be investigated further and escalated to the development team.
