Announcing Intel MKL 2017 release

Gennady_F_Intel — Wed, 07 Sep 2016 08:22:49 GMT

Check out the latest Intel® Math Kernel Library (Intel® MKL) 2017 release!

What's new in Intel MKL 2017:

Introduced optimizations for the Intel® Xeon Phi™ processor x200 (formerly Knights Landing) self-boot platform for Windows* OS
Enabled Automatic Offload (AO) and Compiler Assisted Offload (CAO) modes for the second generation of Intel Xeon Phi coprocessor on Linux* OS
Introduced Deep Neural Networks (DNN) primitives including convolution, normalization, activation, and pooling functions intended to accelerate convolutional neural networks (CNNs) and deep neural networks on Intel® Architecture.
- Optimized for Intel® Xeon® processor E5-xxxx v3 (formerly Haswell), Intel Xeon processor E5-xxxx v4 (formerly Broadwell), and Intel Xeon Phi processor x200 self-boot platform.
- Introduced inner product primitive to support fully connected layers.
- Introduced batch normalization, sum, split, and concat primitives to provide full support for GoogLeNet and ResidualNet topologies.
BLAS:
- Introduced new packed matrix multiplication interfaces (?gemm_alloc, ?gemm_pack, ?gemm_compute, ?gemm_free) for single and double precisions.
- Improved performance over standard S/DGEMM on Intel Xeon processor E5-xxxx v3 and later processors.
Sparse BLAS:
- Improved performance of parallel BSRMV functionality for processors supporting the Intel® Advanced Vector Extensions 2 (Intel® AVX2) instruction set.
- Improved performance of sparse matrix functionality on the Intel Xeon Phi processor x200.
Intel MKL PARDISO:
- Improved performance of parallel solving step for matrices with fewer than 300000 elements.
- Added support for mkl_progress in Parallel Direct Sparse Solver for Clusters.
- Added fully distributed reordering step to Parallel Direct Sparse Solver for Clusters.
Fourier Transforms:
- Improved performance of batched 1D FFT with large batch size on processors supporting the Intel® Advanced Vector Extensions (Intel® AVX), Intel AVX2, Intel® Advanced Vector Extensions 512 (Intel® AVX512), and Intel AVX512_MIC instruction sets.
- Improved performance for small size batched 2D FFT on the Intel Xeon Phi processor x200 self-boot platform, Intel Xeon processor E5-xxxx v3, and Intel Xeon processor E5-xxxx v4.
- Improved performance for 3D FFT on the Intel Xeon Phi processor x200 self-boot platform.
LAPACK:
- Included the latest LAPACK v3.6 enhancements. New features introduced are:
  - SVD by Jacobi ([CZ]GESVJ) and preconditioned Jacobi ([CZ]GEJSV)
  - SVD via EVD allowing computation of a subset of singular values and vectors (?GESVDX)
  - Level 3 BLAS versions of generalized Schur (?GGES3), generalized EVD (?GGEV3), generalized SVD (?GGSVD3), and reduction to generalized upper Hessenberg form (?GGHD3)
  - Multiplication of a general matrix by a unitary or orthogonal matrix that possesses a 2x2 block structure ([DS]ORM22/[CZ]UNM22)
- Improved performance for large size QR (?GEQRF) on processors supporting the Intel AVX2 instruction set.
- Improved LU factorization, solve, and inverse (?GETR?) performance for very small sizes (<16).
- Improved General Eigensolver (?GEEV and ?GEEVD) performance for the case when eigenvectors are needed.
- Improved ?GETRF, ?POTRF and ?GEQRF, linear solver (?GETRS) and SMP LINPACK performance on the Intel Xeon Phi processor x200 self-boot platform.
ScaLAPACK:
- Improved performance for hybrid (MPI + OpenMP*) mode of ScaLAPACK and PBLAS.
- Improved performance of P?GEMM and P?TRSM, resulting in better scalability of the Qbox First-Principles Molecular Dynamics code.
Data Fitting:
- Introduced two new storage formats for interpolation results (DF_MATRIX_STORAGE_SITES_FUNCS_DERS and DF_MATRIX_STORAGE_SITES_DERS_FUNCS).
- Added Hyman monotonic cubic spline.
- Improved performance of Data Fitting functionality on the Intel Xeon Phi processor x200.
- Modified callback APIs to allow users to pass information about integration limits.
Vector Mathematics:
- Introduced optimizations for the Intel Xeon Phi processor x200.
- Improved performance for Intel Xeon processor E5-xxxx v3 and Intel Xeon processor E5-xxxx v4.
Vector Statistics:
- Introduced additional optimization of the SkipAhead method for MT19937 and SFMT19937.
- Improved performance of Vector Statistics functionality, including Random Number Generators and Summary Statistics, on the Intel Xeon Phi processor x200.
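
For readers unfamiliar with the SkipAhead method mentioned under Vector Statistics: it advances a random stream by n positions without generating the intermediate values, which is how one stream can be split into non-overlapping blocks for parallel workers. In Intel MKL this is exposed through vslSkipAheadStream. The sketch below only illustrates the general idea on a toy linear congruential generator; it is not MKL's implementation, MT19937 and SFMT19937 use a different jump technique, and the names lcg_step and lcg_skip_ahead as well as the constants are assumptions chosen for the example.

```python
# Toy illustration of skip-ahead on a linear congruential generator (LCG).
# Hypothetical helper names; this is not the Intel MKL API or algorithm.

def lcg_step(x, a, c, m):
    """Advance the LCG state by one step: x -> (a*x + c) mod m."""
    return (a * x + c) % m


def lcg_skip_ahead(x, n, a, c, m):
    """Advance the LCG state by n steps in O(log n) operations.

    One step is the affine map x -> a*x + c (mod m). Applying it n times
    is again an affine map x -> A*x + C, built here by repeated squaring.
    """
    A, C = 1, 0            # accumulated map, starts as the identity
    a_pow, c_pow = a, c    # map for 2^k steps, starts with k = 0
    while n > 0:
        if n & 1:
            # Compose: apply the 2^k-step map after the accumulated map.
            A = (a_pow * A) % m
            C = (a_pow * C + c_pow) % m
        # Square the 2^k-step map to get the 2^(k+1)-step map.
        c_pow = (a_pow * c_pow + c_pow) % m
        a_pow = (a_pow * a_pow) % m
        n >>= 1
    return (A * x + C) % m


if __name__ == "__main__":
    # Example constants (assumed for illustration only).
    a, c, m = 1664525, 1013904223, 2**32
    seed, n = 12345, 1000

    # Reference: step through the sequence one value at a time.
    x = seed
    for _ in range(n):
        x = lcg_step(x, a, c, m)

    # The jump must land on the same state as n sequential steps.
    assert lcg_skip_ahead(seed, n, a, c, m) == x
```

With a helper like this, worker i could start from lcg_skip_ahead(seed, i * block_size, a, c, m) and draw block_size values, so that the workers together reproduce the sequential sequence without overlap.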

Check out the online Release Notes for more information.
