
Intel® MKL 2018 Beta Update 1 is now available

Gennady_F_Intel
Moderator

Intel® MKL 2018 Beta Update 1 is now available as part of the Intel® Parallel Studio XE 2018 Beta.

Check the Join the Intel® Parallel Studio XE 2018 Beta program post to learn how to join the Beta program and provide your feedback.

What's New in Intel® MKL 2018 Beta Update 1:

BLAS:

  • Addressed an early buffer-release issue in threaded *GEMV
  • Improved TBB *GEMM performance for small m and n when k is large (this shape regime is illustrated in the sketch below)
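
Below is a minimal sketch of the GEMM shape regime the TBB note above refers to (small m and n with a large k), called through the standard CBLAS interface in mkl.h; matrix sizes and fill values are illustrative only.

    #include <stdio.h>
    #include <mkl.h>

    int main(void) {
        /* Illustrative shape only: small m and n, large k, as in the TBB *GEMM note. */
        const MKL_INT m = 8, n = 8, k = 200000;
        float *A = (float *)mkl_malloc((size_t)m * k * sizeof(float), 64);
        float *B = (float *)mkl_malloc((size_t)k * n * sizeof(float), 64);
        float *C = (float *)mkl_calloc((size_t)m * n, sizeof(float), 64);
        if (!A || !B || !C) return 1;

        for (MKL_INT i = 0; i < m * k; ++i) A[i] = 1.0f;
        for (MKL_INT i = 0; i < k * n; ++i) B[i] = 1.0f;

        /* C = 1.0 * A * B + 0.0 * C, row-major, no transposition. */
        cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, k, 1.0f, A, k, B, n, 0.0f, C, n);

        printf("C[0][0] = %.1f (expected %d)\n", C[0], (int)k);
        mkl_free(A); mkl_free(B); mkl_free(C);
        return 0;
    }

The TBB-threaded code path is selected by linking against the TBB threading layer (for example, mkl_tbb_thread instead of mkl_intel_thread); the call itself is unchanged.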

DNN:

  • Improved performance on Intel® Xeon Phi™ processor x200 (formerly Knights Landing)
  • Improved convolution performance for Intel® Xeon Phi™ processors based on Intel® Advanced Vector Extensions 512 (Intel® AVX-512) with support for the AVX512_4FMAPS and AVX512_4VNNIW instruction groups

FFT:

  • Improved performance of 3D FFT complex-to-real and real-to-complex scaled and non-scaled problems on Intel® Xeon Phi™ processor 72** (formerly Knights Landing); a descriptor-setup sketch for this problem class follows after this list
  • Improved performance of 2D FFT complex-to-complex problems with scale on Intel® Xeon Phi™ processor 72** (formerly Knights Landing) and on Intel® Xeon processors E3-xxxx v5 (formerly Skylake)
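
The descriptor-setup sketch referenced above: a hedged illustration of a 3D real-to-complex transform with a forward scale factor, using the DFTI interface from mkl_dfti.h. The transform size and scale value are illustrative, and the compute call is left as a comment.

    #include <mkl_dfti.h>

    int main(void) {
        /* 3D real-to-complex forward transform with a non-unit scale factor. */
        MKL_LONG dims[3] = {64, 64, 64};
        DFTI_DESCRIPTOR_HANDLE h = NULL;

        if (DftiCreateDescriptor(&h, DFTI_DOUBLE, DFTI_REAL, 3, dims) != DFTI_NO_ERROR)
            return 1;

        /* Store the conjugate-even result in the complex (CCE) layout. */
        DftiSetValue(h, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
        /* A "scaled" problem: apply 1/N on the forward transform. */
        DftiSetValue(h, DFTI_FORWARD_SCALE, 1.0 / (64.0 * 64.0 * 64.0));
        /* Out-of-place transform. */
        DftiSetValue(h, DFTI_PLACEMENT, DFTI_NOT_INPLACE);

        DftiCommitDescriptor(h);
        /* DftiComputeForward(h, real_input, complex_output); */
        DftiFreeDescriptor(&h);
        return 0;
    }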

LAPACK: 

  • Completed alignment with Netlib* LAPACK 3.7.0 by integrating all new routines and bug fixes. Notable new features are:
    • Optimized factorization, solve and inverse routines with rook pivoting: ?sytrf_rk/?hetrf_rk, ?sytrs_rk/?hetrs_rk and ?sytri_3/?hetri_3
    • Added LAPACKE interfaces for all new routines including Aasen factorization and solve routines
  • Improved ?gesvd performance for tall-and-skinny matrices (see the SVD sketch after this list).
  • Improved ?gelq/?gemlq performance for short-and-wide matrices.
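
The SVD sketch referenced above: an economy-size ?gesvd call on a tall-and-skinny matrix through the standard LAPACKE interface in mkl.h. Dimensions and fill values are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mkl.h>

    int main(void) {
        /* Tall-and-skinny: many more rows than columns. */
        const lapack_int m = 10000, n = 16;
        double *a  = (double *)malloc((size_t)m * n * sizeof(double));
        double *s  = (double *)malloc((size_t)n * sizeof(double));
        double *u  = (double *)malloc((size_t)m * n * sizeof(double));  /* economy-size U */
        double *vt = (double *)malloc((size_t)n * n * sizeof(double));
        double *superb = (double *)malloc((size_t)(n - 1) * sizeof(double));
        if (!a || !s || !u || !vt || !superb) return 1;

        for (lapack_int i = 0; i < m * n; ++i) a[i] = (double)(i % 7) + 1.0;

        /* jobu = jobvt = 'S': keep only the leading min(m,n) singular vectors. */
        lapack_int info = LAPACKE_dgesvd(LAPACK_ROW_MAJOR, 'S', 'S',
                                         m, n, a, n, s, u, n, vt, n, superb);
        printf("info = %d, largest singular value = %g\n", (int)info, s[0]);

        free(a); free(s); free(u); free(vt); free(superb);
        return 0;
    }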

Vector Mathematics:

  • Improved performance of the vdPowx and vsPowx functions for certain exponent values (0, 5, 6, 7, 8, 9); a usage sketch follows below
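
The usage sketch referenced above, assuming the standard VM interface declared in mkl.h; the input values are illustrative and the exponent is taken from the list in the note.

    #include <stdio.h>
    #include <mkl.h>

    int main(void) {
        /* v?Powx raises every element of a[] to a single fixed power b. */
        const MKL_INT n = 4;
        double a[] = {1.0, 2.0, 3.0, 4.0};
        double y[4];

        /* b = 6.0 is one of the exponent values called out above. */
        vdPowx(n, a, 6.0, y);

        for (int i = 0; i < n; ++i)
            printf("%.1f^6 = %.1f\n", a[i], y[i]);
        return 0;
    }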

What's New in Intel® MKL 2018 Beta:

  • DNN:
    • Added initial convolution and inner product optimizations for Intel® Xeon Phi™ processors based on Intel® Advanced Vector Extensions 512 (Intel® AVX-512) with support for the AVX512_4FMAPS and AVX512_4VNNIW instruction groups.
    • Average pooling now has an option to include padding in the mean-value computation
  • BLAS Features:
    • Introduced optimized integer matrix-matrix multiplication routines (GEMM_S16S16S16 and GEMM_S16S16S32) to work with quantized matrices for all architectures (a hedged usage sketch appears after this list).
    • Introduced ?TRSM_BATCH to complement the batched BLAS for all architectures
  • BLAS Optimizations:
    • Optimized SGEMM, GEMM_S16S16S16 and GEMM_S16S16S32 for Intel® Xeon Phi™ processors based on Intel® Advanced Vector Extensions 512 (Intel® AVX-512) with support for the AVX512_4FMAPS and AVX512_4VNNIW instruction groups
    • Improved ?GEMM_BATCH performance for all architectures
    • Improved single and multi-threaded {D,S}SYMV performance for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and the Intel® Xeon Phi™ processor x200
  • Sparse BLAS:
    • Improved performance of CSRMV/BSRMV functionality for Intel® AVX-512 instruction set in Inspector-Executor mode
  • LAPACK:
    • Introduced factorization and solve routines based on Aasen's algorithm: ?sytrf_aa/?hetrf_aa, ?sytrs_aa/?hetrs_aa
  • Vector Mathematics:
    • Added 24 new functions: v?Fmod, v?Remainder, v?Powr, v?Exp2, v?Exp10, v?Log2, v?Logb, v?Cospi, v?Sinpi, v?Tanpi, v?Acospi, v?Asinpi, v?Atanpi, v?Atan2pi, v?Cosd, v?Sind, v?Tand, v?CopySign, v?NextAfter, v?Fdim, v?Fmax, v?Fmin, v?MaxMag, v?MinMag
  • Library Engineering:
    • Introduced support for Intel® Xeon Phi™ processors based on Intel® Advanced Vector Extensions 512 (Intel® AVX-512) that support the AVX512_4FMAPS and AVX512_4VNNIW instruction groups.
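
The usage sketch referenced in the BLAS Features item above: a small call to the new integer GEMM through its CBLAS name, cblas_gemm_s16s16s32. This is a hedged sketch: the argument order and the offset semantics (oa/ob/oc as quantization zero points, CblasFixOffset for a single offset applied to all of C) are assumptions to verify against mkl_cblas.h and the MKL Developer Reference.

    #include <stdio.h>
    #include <mkl.h>

    int main(void) {
        /* Quantized GEMM sketch: A and B hold 16-bit integers, C holds 32-bit integers.
           Argument order assumed from mkl_cblas.h; verify against your header. */
        const MKL_INT m = 2, n = 2, k = 3;
        MKL_INT16 A[] = {1, 2, 3,
                         4, 5, 6};        /* m x k, row-major */
        MKL_INT16 B[] = {1, 0,
                         0, 1,
                         1, 1};           /* k x n, row-major */
        MKL_INT32 C[4] = {0};
        const MKL_INT16 oa = 0, ob = 0;   /* offsets added to the elements of A and B */
        const MKL_INT32 oc[1] = {0};      /* one offset applied to all of C (CblasFixOffset) */

        /* C := alpha * (op(A) + oa) * (op(B) + ob) + beta * C + oc */
        cblas_gemm_s16s16s32(CblasRowMajor, CblasNoTrans, CblasNoTrans, CblasFixOffset,
                             m, n, k, 1.0f, A, k, oa, B, n, ob, 0.0f, C, n, oc);

        printf("C = [%d %d; %d %d]\n", (int)C[0], (int)C[1], (int)C[2], (int)C[3]);
        return 0;
    }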

The optimizations for the AVX512_4FMAPS and AVX512_4VNNIW instruction groups are not dispatched unless explicitly enabled with the mkl_enable_instructions function call or the MKL_ENABLE_INSTRUCTIONS environment variable.
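
A minimal sketch of the function-call route, assuming an ISA constant for these instruction groups (MKL_ENABLE_AVX512_MIC_E1 is assumed here; check mkl_service.h for the exact name in your version). To take effect, the call needs to come before other Intel MKL calls.

    #include <stdio.h>
    #include <mkl.h>

    int main(void) {
        /* Request dispatch of the AVX512_4FMAPS/AVX512_4VNNIW code paths.
           MKL_ENABLE_AVX512_MIC_E1 is assumed here; check mkl_service.h for the
           exact constant name in your version. Returns 1 if the request is accepted. */
        int ok = mkl_enable_instructions(MKL_ENABLE_AVX512_MIC_E1);
        printf("AVX-512 4FMAPS/4VNNIW dispatch %s\n", ok ? "enabled" : "not enabled");

        /* ... subsequent MKL calls can use the newly enabled code paths ... */
        return 0;
    }

The environment-variable route would be along the lines of setting MKL_ENABLE_INSTRUCTIONS=AVX512_MIC_E1 before the first MKL call; the accepted values are listed in the Intel MKL Developer Guide.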

  • Documentation: 
  • Hardware support for Intel® Xeon Phi™ coprocessors (code name Knights Corner) has been removed. Customers who continue to use and develop for Intel® Xeon Phi™ coprocessors (Knights Corner) are recommended to stay on Intel MKL 2017.