Summary:
In MKL 2025, the v?Sqr functions do not work with SSE4_2 when operating in place (input == output). The vector is incorrectly left unchanged.
Repro:
#include <stdio.h>
#include <mkl.h>

int main() {
    int n = 4;
    double input[] = {0.0, 1.0, 2.0, 3.0};
    double output[n];

    // This works fine
    vdSqr(n, input, output);
    for (int i = 0; i < n; i++) {
        if (output[i] != i * i) {
            printf("Output incorrect at %d=%f\n", i, output[i]);
        }
        if (input[i] != i) {
            printf("Input incorrect at %d=%f\n", i, input[i]);
        }
    }

    // This is broken with MKL_ENABLE_INSTRUCTIONS=SSE4_2, input is left unchanged
    // (MKL_ENABLE_INSTRUCTIONS=AVX falls back to SSE4_2, so broken there too)
    vdSqr(n, input, input);
    for (int i = 0; i < n; i++) {
        if (input[i] != i * i) {
            printf("Input incorrect at %d=%f\n", i, input[i]);
        }
    }
    return 0;
}
$ gcc -I/opt/intel/oneapi/mkl/2025.0/include test.c -L/opt/intel/oneapi/mkl/2025.0/lib/intel64 -lmkl_rt -lpthread -lm -ldl -o test
$ export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2025.0/lib:/opt/intel/oneapi/compiler/2025.0/lib
$ MKL_ENABLE_INSTRUCTIONS=AVX2 ./test
$ MKL_ENABLE_INSTRUCTIONS=AVX ./test
Intel oneMKL WARNING: Support of Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library will use Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions instead.
Input incorrect at 2=2.000000
Input incorrect at 3=3.000000
$ MKL_ENABLE_INSTRUCTIONS=SSE4_2 ./test
Input incorrect at 2=2.000000
Input incorrect at 3=3.000000
$
$
$ export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/lib
$ gcc -I/opt/intel/oneapi/mkl/2024.0/include test.c -L/opt/intel/oneapi/mkl/2024.0/lib/intel64 -lmkl_rt -lpthread -lm -ldl -o test
$ MKL_ENABLE_INSTRUCTIONS=AVX2 ./test
$ MKL_ENABLE_INSTRUCTIONS=AVX ./test
Intel MKL WARNING: Support of Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library will use Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions instead.
$ MKL_ENABLE_INSTRUCTIONS=SSE4_2 ./test
$
CPU: Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz. I wouldn't think it matters, but the OS is Ubuntu 22.04, with packages pulled from your upstream.
The docs here indicate that in-place operation is supported: https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2025-0/vector-mathematical-functions.html
"All the VM mathematical functions can perform in-place operations, where the input and output arrays are at the same memory locations. "
I am not especially interested in SSE4_2. I was upgrading from a much older version of MKL, and some older code used AVX for testing as a lowest common denominator. AVX has since been deprecated. Rather than mkl_cbwr_set(MKL_CBWR_AVX) returning an error (as the equivalent call does when running on CPUs from other manufacturers, for example), it prints a warning to stdout and falls back to SSE4_2, which then exposes this bug.
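A possible stopgap while the SSE4_2 path is affected: since the out-of-place call in the repro returns correct results, the in-place call could be routed through a scratch buffer. This is only a minimal sketch. `vm_unary_fn`, `demo_sqr` and `apply_via_scratch` are hypothetical names, not part of MKL, and the function-pointer type assumes the LP64 interface where MKL_INT is a plain int. `demo_sqr` is a plain C stand-in so the sketch compiles without MKL; in real code one would pass vdSqr instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: same shape as a VM unary function such as
   vdSqr(n, a, r), assuming MKL_INT is int (LP64 interface). */
typedef void (*vm_unary_fn)(int n, const double *a, double *r);

/* Plain C stand-in for a VM-style squaring function (not MKL). */
void demo_sqr(int n, const double *a, double *r) {
    for (int i = 0; i < n; i++) {
        r[i] = a[i] * a[i];
    }
}

/* Applies f to x "in place" by computing into a scratch buffer and
   copying the result back, so f itself is only called out of place.
   Returns 0 on success, -1 if the allocation fails (x is untouched). */
int apply_via_scratch(vm_unary_fn f, int n, double *x) {
    assert(n >= 0 && x != NULL);
    if (n == 0) {
        return 0;
    }
    double *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    f(n, x, tmp);                            /* out-of-place call */
    memcpy(x, tmp, (size_t)n * sizeof *tmp); /* copy result back */
    free(tmp);
    return 0;
}
```

The cost is one extra allocation and copy per call, so this is only meant as a temporary measure for the affected code paths.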
A very non-thorough scan indicates that the v?Sqr functions are wrong, but at least v?Sqrt and v?Pow work as I expect.
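A scan like this could be made more systematic with a small harness that runs a function both out of place and in place on the same data and counts the elements that differ. The following is a rough sketch under assumptions: `unary_vm_fn`, `ref_sqr`, `broken_sqr` and `count_inplace_mismatches` are hypothetical names, the function-pointer type assumes MKL_INT is int (LP64), and the two stand-in functions are plain C so the sketch builds without MKL. `broken_sqr` only mimics the symptom described above; it says nothing about how MKL behaves internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: same shape as VM unary functions such as
   vdSqr(n, a, r), assuming MKL_INT is int (LP64 interface). */
typedef void (*unary_vm_fn)(int n, const double *a, double *r);

/* Plain C reference stand-in; swap in vdSqr, vdSqrt, ... when testing MKL. */
void ref_sqr(int n, const double *a, double *r) {
    for (int i = 0; i < n; i++) {
        r[i] = a[i] * a[i];
    }
}

/* Deliberately broken stand-in: leaves the vector unchanged when called
   in place, imitating the reported symptom. */
void broken_sqr(int n, const double *a, double *r) {
    if (a == r) {
        return;
    }
    ref_sqr(n, a, r);
}

/* Runs f out of place and in place on copies of src and returns how many
   elements differ bitwise, or -1 if an allocation fails. */
int count_inplace_mismatches(unary_vm_fn f, const double *src, int n) {
    assert(n >= 0);
    size_t bytes = (size_t)n * sizeof(double);
    double *out = malloc(bytes ? bytes : 1);
    double *inplace = malloc(bytes ? bytes : 1);
    if (out == NULL || inplace == NULL) {
        free(out);
        free(inplace);
        return -1;
    }
    memcpy(inplace, src, bytes);
    f(n, src, out);         /* out-of-place result, used as the baseline */
    f(n, inplace, inplace); /* in-place result under test */
    int bad = 0;
    for (int i = 0; i < n; i++) {
        /* Bitwise comparison so that NaN results still compare equal. */
        if (memcmp(&out[i], &inplace[i], sizeof(double)) != 0) {
            bad++;
        }
    }
    free(out);
    free(inplace);
    return bad;
}
```

With the input {0.0, 1.0, 2.0, 3.0} from the repro, `broken_sqr` yields two mismatches (indices 2 and 3, since 0 and 1 are unchanged by squaring), which matches the output shown above.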
Thanks for reporting this issue, we are looking into it. Can you test your example with our compiler icx compared to GCC?
This gives the same result.
$ icx --version
Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2025.0/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2025.0/bin/compiler/../icx.cfg
(This makes sense to me: the instruction set is chosen at runtime rather than at compile time.)
