Re: MKL in OneAPI 2024.1

EMagee · ‎06-05-2024

I have run into an issue with the MKL library from OneAPI 2024.1.0 on a couple of Linux platforms. I am getting incorrect results from one of the functions (vdTGamma, specifically): the results differ between OneAPI 2023.2.0 and 2024.1.0, and I have verified that the results from 2023.2.0 are correct. I have attached a CPP file (with a CMake.txt) to demonstrate. To build the example:

> export MKLROOT=<location of oneapi 2023.2>/mkl/latest
> export COMPILERLIBDIR=<location of oneapi 2023.2>/compiler/latest
> mkdir build
> cd build
> cmake ..
> make
> ./vdTGamma
vdTGamma test/example program

Argument  vdTGamma   expected
==================================
-1.3499   +2.9314    +2.9314
+3.0349   +2.0660    +2.0660
+0.7254   +1.2596    +1.2596
-0.0631   16.4914    16.4914
+0.7147   +1.2754    +1.2754
-0.2050   5.7068     5.7068
-0.1241   8.7741     8.7741
+1.4897   +0.8859    +0.8859
+1.4090   +0.8868    +0.8868
+1.4172   +0.8865    +0.8865
Error! Maximum error is -0.0000

Now using 2024:

> export MKLROOT=<location of oneapi 2024.1>/mkl/latest
> export COMPILERLIBDIR=<location of oneapi 2024.1>/compiler/latest
> rm -rf *
> cmake ..
> make
> ./vdTGamma
vdTGamma test/example program
Argument  vdTGamma   expected
==============================
-1.3499   +2.6115    +2.9314
+3.0349   +2.0660    +2.0660
+0.7254   +1.2596    +1.2596
-0.0631   inf        16.4914
+0.7147   +1.2754    +1.2754
-0.2050   inf        5.7068
-0.1241   inf        8.7741
+1.4897   +0.8859    +0.8859
+1.4090   +0.8868    +0.8868
+1.4172   +0.8865    +0.8865
Error! Maximum error is   -inf

Has anyone else seen this? My code and CMake file are attached.
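For readers without the attachment, here is a minimal sketch of the kind of comparison the tables above report. It is not the attached file: the helper names reference_tgamma and max_relative_error are hypothetical, and the reference column is assumed to come from std::tgamma (a later log in this thread labels that column std::tgamma). The MKL call is left as a comment so the sketch builds without MKL.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Reference values from the C++ standard library, one per argument.
std::vector<double> reference_tgamma(const std::vector<double>& args) {
    std::vector<double> out(args.size());
    std::transform(args.begin(), args.end(), out.begin(),
                   [](double x) { return std::tgamma(x); });
    return out;
}

// Largest relative error of `got` against `expected`.
// Any non-finite entry in `got` (inf or NaN) where the reference is finite
// is reported as an infinite error, so a result like -inf cannot hide.
double max_relative_error(const std::vector<double>& got,
                          const std::vector<double>& expected) {
    const double tiny = std::numeric_limits<double>::min();
    double worst = 0.0;
    const std::size_t n = std::min(got.size(), expected.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (!std::isfinite(expected[i])) continue;  // skip poles in the reference
        if (!std::isfinite(got[i])) {
            return std::numeric_limits<double>::infinity();
        }
        const double scale = std::max(std::fabs(expected[i]), tiny);
        worst = std::max(worst, std::fabs(got[i] - expected[i]) / scale);
    }
    return worst;
}

// Intended use with MKL (assumes <mkl.h> is included and MKL is linked):
//   std::vector<double> got(args.size());
//   vdTGamma(static_cast<MKL_INT>(args.size()), args.data(), got.data());
//   double err = max_relative_error(got, reference_tgamma(args));
```

Using a relative rather than absolute error keeps the check meaningful for both the small values near 0.89 and the larger ones near 16.5.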

Gennady_F_Intel · ‎06-06-2024

What kind of CPU are you running your code on?

I see no problem when building against oneMKL 2024.1 (the current one) and 2023.2 and running on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz.

I slightly updated your example by adding a call to the mkl_get_version() routine to print the version of MKL in use:

Compiling: icpx -std=c++11 -qmkl vdTGammaTest.cp
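The modified source is not shown in the thread, so the following is only a sketch of how such a version banner could be produced. mkl_get_version and the MKLVersion fields (MajorVersion, MinorVersion, UpdateVersion, Processor) are part of the MKL support functions; the helper names format_version and describe_mkl, and the __has_include guard that lets the sketch compile on a machine without MKL, are assumptions added for illustration.

```cpp
#include <string>

// Detect the MKL header when the compiler supports __has_include.
#if defined(__has_include)
#  if __has_include(<mkl.h>)
#    include <mkl.h>
#    define HAVE_MKL_HEADER 1
#  endif
#endif

// Format "major.minor.update", e.g. 2024.0.1, as printed in the logs here.
std::string format_version(int major, int minor, int update) {
    return std::to_string(major) + "." + std::to_string(minor) + "." +
           std::to_string(update);
}

// One-line description of the MKL build the program is using.
std::string describe_mkl() {
#ifdef HAVE_MKL_HEADER
    MKLVersion v;
    mkl_get_version(&v);  // fills in version numbers and the CPU dispatch string
    return "MKL version: " +
           format_version(v.MajorVersion, v.MinorVersion, v.UpdateVersion) +
           ", CPU: " + v.Processor;
#else
    return "MKL header not found; version unknown";
#endif
}
```

The Processor string is worth printing in a report like this one, because it shows which optimized code path MKL selected for the CPU at run time.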

Results obtained:

MKL 2023 u2:

Argument vdTGamma expected
========================
-1.3499 +2.9314 +2.9314
+3.0349 +2.0660 +2.0660
+0.7254 +1.2596 +1.2596
-0.0631 -16.4914 -16.4914
+0.7147 +1.2754 +1.2754
-0.2050 -5.7068 -5.7068
-0.1241 -8.7741 -8.7741
+1.4897 +0.8859 +0.8859
+1.4090 +0.8868 +0.8868
+1.4172 +0.8865 +0.8865

Error! Maximum error is -0.0000

************************************
MKL version: 2023.0.2, CPU: Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors
************************************

MKL 2024 u1:

Argument vdTGamma expected
===============================================================================
-1.3499 +2.9314 +2.9314
+3.0349 +2.0660 +2.0660
+0.7254 +1.2596 +1.2596
-0.0631 -16.4914 -16.4914
+0.7147 +1.2754 +1.2754
-0.2050 -5.7068 -5.7068
-0.1241 -8.7741 -8.7741
+1.4897 +0.8859 +0.8859
+1.4090 +0.8868 +0.8868
+1.4172 +0.8865 +0.8865

Error! Maximum error is -0.0000

****************************************************************
MKL version: 2024.0.1, CPU: Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors
****************************************************************

--Gennady

EMagee · ‎06-06-2024

Here are the outputs from two different systems. Let me know if you need additional information. System 1 is an HPE Cray EX with the AMD 7H12 Rome processor and the SLES OS. System 2 is a Penguin Computing TrueHPC system with the AMD 7713 Milan and the RHEL OS. NOTE: I had to add pthread, m, and dl to the link line for system 2.

System 1 with 2023, output from CMake:

-- The C compiler identification is GNU 10.3.0
-- The CXX compiler identification is GNU 10.3.0
-- Check for working C compiler: /opt/cray/pe/craype/2.7.19/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.7.19/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.19/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.19/bin/CC - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
vdTGamma: Using MKL in /opt/intel/oneapi_2023.2.0.49397/mkl/latest with ...
   MKL_LIB_DIR: /opt/intel/oneapi_2023.2.0.49397/mkl/latest/lib/intel64 ...
   COMPILER_LIB_DIR: /opt/intel/oneapi_2023.2.0.49397/compiler/latest/linux/compiler/lib/intel64_lin ...

vdTGamma test/example program

Major version:          2023.0.2
Build:                  20230613
Platform:               Intel(R) 64 architecture
Processor optimization: Intel(R) Architecture processors

Argument   vdTGamma     expected
======================================
-1.3499   +2.9314     +2.9314
+3.0349   +2.0660     +2.0660
+0.7254   +1.2596     +1.2596
-0.0631   -16.4914     -16.4914
+0.7147   +1.2754     +1.2754
-0.2050   -5.7068     -5.7068
-0.1241   -8.7741     -8.7741
+1.4897   +0.8859     +0.8859
+1.4090   +0.8868     +0.8868
+1.4172   +0.8865     +0.8865

Error! Maximum error is -0.0000

System 1 with 2024, output from CMake:

-- The C compiler identification is GNU 10.3.0
-- The CXX compiler identification is GNU 10.3.0
-- Check for working C compiler: /opt/cray/pe/craype/2.7.19/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.7.19/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.19/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.19/bin/CC - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
vdTGamma: Using MKL in /opt/intel/oneapi_2024.1.0.596/mkl/latest with ...
   MKL_LIB_DIR: /opt/intel/oneapi_2024.1.0.596/mkl/latest/lib/intel64 ...
   COMPILER_LIB_DIR: /opt/intel/oneapi_2024.1.0.596/compiler/latest/lib ...

vdTGamma test/example program

Major version:          2024.0.1
Build:                  20240215
Platform:               Intel(R) 64 architecture
Processor optimization: Intel(R) Architecture processors

Argument   vdTGamma     expected
======================================
-1.3499   +2.6115     +2.9314
+3.0349   +2.0660     +2.0660
+0.7254   +1.2596     +1.2596
-0.0631     -inf     -16.4914
+0.7147   +1.2754     +1.2754
-0.2050     -inf     -5.7068
-0.1241     -inf     -8.7741
+1.4897   +0.8859     +0.8859
+1.4090   +0.8868     +0.8868
+1.4172   +0.8865     +0.8865

Error! Maximum error is   -inf

System 2 with 2023, CMake output:

-- The C compiler identification is GNU 8.5.0
-- The CXX compiler identification is GNU 8.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
vdTGamma: Using MKL in /p/app/compilers/intel/oneapi-2023.1.0/mkl/latest with ...
   MKL_LIB_DIR: /p/app/compilers/intel/oneapi-2023.1.0/mkl/latest/lib/intel64 ...
   COMPILER_LIB_DIR: /p/app/compilers/intel/oneapi-2023.1.0/compiler/latest/linux/compiler/lib/intel64_lin ...

vdTGamma test/example program

Major version:          2023.0.1
Build:                  20230303
Platform:               Intel(R) 64 architecture
Processor optimization: Intel(R) Architecture processors

Argument   vdTGamma     expected
========================================
-1.3499   +2.9314     +2.9314
+3.0349   +2.0660     +2.0660
+0.7254   +1.2596     +1.2596
-0.0631   -16.4914     -16.4914
+0.7147   +1.2754     +1.2754
-0.2050   -5.7068     -5.7068
-0.1241   -8.7741     -8.7741
+1.4897   +0.8859     +0.8859
+1.4090   +0.8868     +0.8868
+1.4172   +0.8865     +0.8865

Error! Maximum error is -0.0000

System 2 with 2024, CMake output:

-- The C compiler identification is GNU 8.5.0
-- The CXX compiler identification is GNU 8.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
vdTGamma: Using MKL in /p/app/compilers/intel/oneapi-2024.1.0/mkl/latest with ...
   MKL_LIB_DIR: /p/app/compilers/intel/oneapi-2024.1.0/mkl/latest/lib/intel64 ...
   COMPILER_LIB_DIR: /p/app/compilers/intel/oneapi-2024.1.0/compiler/latest/lib ...

vdTGamma test/example program

Major version:          2024.0.1
Build:                  20240215
Platform:               Intel(R) 64 architecture
Processor optimization: Intel(R) Architecture processors

Argument   vdTGamma     expected
======================================
-1.3499   +2.6115     +2.9314
+3.0349   +2.0660     +2.0660
+0.7254   +1.2596     +1.2596
-0.0631     -inf     -16.4914
+0.7147   +1.2754     +1.2754
-0.2050     -inf     -5.7068
-0.1241     -inf     -8.7741
+1.4897   +0.8859     +0.8859
+1.4090   +0.8868     +0.8868
+1.4172   +0.8865     +0.8865

Error! Maximum error is   -inf
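
The -inf entries cannot be a legitimate overflow: the gamma function has poles only at zero and the negative integers, and Γ(x) = Γ(x+1)/x gives a finite value for every failing argument above (for x = -0.0631, roughly 1.04 / -0.0631, or about -16.5). A minimal sketch of that independent cross-check follows; the helper name gamma_by_recurrence is hypothetical and is not part of the attached test.

```cpp
#include <cmath>

// Gamma(x) via the recurrence Gamma(x) = Gamma(x + 1) / x, applied
// repeatedly until the argument is non-negative, then std::tgamma.
// For negative non-integer x this avoids evaluating the library's
// negative-argument path directly.
double gamma_by_recurrence(double x) {
    double divisor = 1.0;
    while (x < 0.0) {
        divisor *= x;  // accumulate x * (x + 1) * ... for each shift
        x += 1.0;
    }
    return std::tgamma(x) / divisor;
}
```

Comparing this against a vdTGamma result for the same argument separates "the reference is wrong" from "the library result is wrong" without relying on a second MKL version.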

EMagee · ‎06-06-2024

Side note: just out of curiosity, I tried vsTGamma with 2024 and it worked fine.

EMagee · ‎06-06-2024

Another side note. I was able to test this on an additional machine using 2024 and it works fine. System 3 is an HPE Cray EX4000 with the AMD 9654 Genoa processor and the SLES 15 OS. Here is the output:

vdTGamma test/example program

Major version:          2024.0.0
Build:                  20231011
Platform:               Intel(R) 64 architecture
Processor optimization: Intel(R) Architecture processors

Argument  vdTGamma   std::tgamma
==================================
-1.3499   +2.9314     +2.9314
+3.0349   +2.0660     +2.0660
+0.7254   +1.2596     +1.2596
-0.0631   -16.4914     -16.4914
+0.7147   +1.2754     +1.2754
-0.2050   -5.7068     -5.7068
-0.1241   -8.7741     -8.7741
+1.4897   +0.8859     +0.8859
+1.4090   +0.8868     +0.8868
+1.4172   +0.8865     +0.8865

Error! Maximum error is 0.0000

Gennady_F_Intel · ‎06-07-2024

Yes, I have reproduced the behavior while running this case on a Genoa CPU. We will root-cause the problem and keep this thread updated.

--Gennady

Gennady_F_Intel · ‎06-26-2024

Erik,

Yesterday the latest version of oneMKL (2024.2) was released and is available for download. The problem you reported has been fixed.

You can take 2024.2 and evaluate this fix.

Here is what I see when running this example on a Genoa CPU.

vdTGamma test/example program

Argument vdTGamma expected
===============================================================================
-1.3499 +2.9314 +2.9314
+3.0349 +2.0660 +2.0660
+0.7254 +1.2596 +1.2596
-0.0631 -16.4914 -16.4914
+0.7147 +1.2754 +1.2754
-0.2050 -5.7068 -5.7068
-0.1241 -8.7741 -8.7741
+1.4897 +0.8859 +0.8859
+1.4090 +0.8868 +0.8868
+1.4172 +0.8865 +0.8865

Error! Maximum error is -0.0000

****************************************************************
MKL version: 2024.0.2, CPU: Intel(R) Architecture processors
****************************************************************

I also attached all the needed details about the CPU, OS, linking, and obtained results; see _2024u2_Genoa.txt.