Intel® oneAPI Math Kernel Library

MKL 2025 vdSqr AVX/SSE4_2

rgr21

Summary:

v?Sqr in MKL 2025 does not work with SSE4_2 when operating in place (input == output). The vector is incorrectly left unchanged.

Repro:

#include <stdio.h>
#include <mkl.h>

int main() {
    int n = 4;
    double input[] = {0.0, 1.0, 2.0, 3.0};
    double output[n];

    // This works fine
    vdSqr(n, input, output);
    for (int i = 0; i < n; i++) {
        if (output[i] != i * i) {
            printf("Output incorrect at %d=%f\n", i, output[i]);
        }
        if (input[i] != i) {
            printf("Input incorrect at %d=%f\n", i, input[i]);
        }
    }

    // This is broken with MKL_ENABLE_INSTRUCTIONS=SSE4_2, input is left unchanged
    // (MKL_ENABLE_INSTRUCTIONS=AVX falls back to SSE4_2, so broken there too)
    vdSqr(n, input, input);
    for (int i = 0; i < n; i++) {
        if (input[i] != i*i) {
            printf("Input incorrect at %d=%f\n", i, input[i]);
        }
    }

    return 0;
}

$ gcc -I/opt/intel/oneapi/mkl/2025.0/include test.c -L/opt/intel/oneapi/mkl/2025.0/lib/intel64 -lmkl_rt -lpthread -lm -ldl -o test
$ export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2025.0/lib:/opt/intel/oneapi/compiler/2025.0/lib
$ MKL_ENABLE_INSTRUCTIONS=AVX2 ./test
$ MKL_ENABLE_INSTRUCTIONS=AVX ./test
Intel oneMKL WARNING: Support of Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library will use Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions instead.
Input incorrect at 2=2.000000
Input incorrect at 3=3.000000
$ MKL_ENABLE_INSTRUCTIONS=SSE4_2 ./test
Input incorrect at 2=2.000000
Input incorrect at 3=3.000000
$
$
$ export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/lib
$ gcc -I/opt/intel/oneapi/mkl/2024.0/include test.c -L/opt/intel/oneapi/mkl/2024.0/lib/intel64 -lmkl_rt -lpthread -lm -ldl -o test
$ MKL_ENABLE_INSTRUCTIONS=AVX2 ./test
$ MKL_ENABLE_INSTRUCTIONS=AVX ./test
Intel MKL WARNING: Support of Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library will use Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions instead.
$ MKL_ENABLE_INSTRUCTIONS=SSE4_2 ./test
$

CPU: Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz. OS: Ubuntu 22.04, with MKL pulled from your upstream (I wouldn't think this matters).

Docs here indicate in place operation is supported: https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2025-0/vector-mathematical-functions.html

"All the VM mathematical functions can perform in-place operations, where the input and output arrays are at the same memory locations."

I am not especially interested in SSE4_2 itself. I was upgrading from a much older version of MKL, and some older code used AVX for testing as a lowest common denominator. AVX-only support has since been deprecated. Rather than mkl_cbwr_set(MKL_CBWR_AVX) returning an error (as the equivalent call does when running on CPUs from other manufacturers, for example), it prints a warning to stdout and falls back to SSE4_2, which then exposes this bug.

A very non-thorough scan indicates that the v?Sqr functions are wrong, but at least v?Sqrt and v?Pow work as I expect.

Shiquan_Su
Moderator

Thanks for reporting this issue. We are looking into it. Can you test your example with our icx compiler and compare the result with GCC?

rgr21

Building with icx gives the same result.

$ icx --version
Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2025.0/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2025.0/bin/compiler/../icx.cfg

(This makes sense to me: the instruction set is chosen at run time rather than at compile time.)
