Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Announcing Intel MKL 2017 release

Gennady_F_Intel
Moderator

Check out the new Intel® Math Kernel Library (Intel® MKL) 2017 release!

What's new in Intel MKL 2017:

  • Introduced optimizations for the Intel® Xeon Phi™ processor x200 (formerly Knights Landing) self-boot platform for Windows* OS
  • Enabled Automatic Offload (AO) and Compiler Assisted Offload (CAO) modes for the second generation of Intel Xeon Phi coprocessor on Linux* OS
  • Introduced Deep Neural Networks (DNN) primitives including convolution, normalization, activation, and pooling functions intended to accelerate convolutional neural networks (CNNs) and deep neural networks on Intel® Architecture.
    • Optimized for Intel® Xeon® processor E5-xxxx v3 (formerly Haswell), Intel Xeon processor E5-xxxx v4 (formerly Broadwell), and Intel Xeon Phi processor x200 self-boot platform.
    • Introduced inner product primitive to support fully connected layers.
    • Introduced batch normalization, sum, split, and concat primitives to provide full support for GoogLeNet and ResidualNet topologies.
  • BLAS:
    • Introduced new packed matrix multiplication interfaces (?gemm_alloc, ?gemm_pack, ?gemm_compute, ?gemm_free) for single and double precisions.
    • Improved performance over standard S/DGEMM on Intel Xeon processor E5-xxxx v3 and later processors.
  • Sparse BLAS:
    • Improved performance of parallel BSRMV functionality for processors supporting the Intel® Advanced Vector Extensions 2 (Intel® AVX2) instruction set.
    • Improved performance of sparse matrix functionality on the Intel Xeon Phi processor x200.
  • Intel MKL PARDISO:
    • Improved performance of parallel solving step for matrices with fewer than 300000 elements.
    • Added support for mkl_progress in Parallel Direct Sparse Solver for Clusters.
    • Added fully distributed reordering step to Parallel Direct Sparse Solver for Clusters.
  • Fourier Transforms:
    • Improved performance of batched 1D FFT with large batch size on processors supporting the Intel® Advanced Vector Extensions (Intel® AVX), Intel AVX2, Intel® Advanced Vector Extensions 512 (Intel® AVX512), and Intel AVX512_MIC instruction sets.
    • Improved performance for small size batched 2D FFT on the Intel Xeon Phi processor x200 self-boot platform, Intel Xeon processor E5-xxxx v3, and Intel Xeon processor E5-xxxx v4.
    • Improved performance for 3D FFT on the Intel Xeon Phi processor x200 self-boot platform. 
  • LAPACK:
    • Included the latest LAPACK v3.6 enhancements. New features introduced are:
      • SVD by Jacobi ([CZ]GESVJ) and preconditioned Jacobi ([CZ]GEJSV)
      • SVD via EVD allowing computation of a subset of singular values and vectors (?GESVDX)
      • Level-3 BLAS versions of generalized Schur factorization (?GGES3), generalized EVD (?GGEV3), generalized SVD (?GGSVD3), and reduction to generalized upper Hessenberg form (?GGHD3)
      • Multiplication of a general matrix by a unitary or orthogonal matrix that possesses a 2x2 block structure ([DS]ORM22/[CZ]UNM22)
    • Improved performance for large size QR (?GEQRF) on processors supporting the Intel AVX2 instruction set.
    • Improved LU factorization, solve, and inverse (?GETR?) performance for very small sizes (<16).
    • Improved General Eigensolver (?GEEV and ?GEEVD) performance for the case when eigenvectors are needed.
    • Improved ?GETRF, ?POTRF, and ?GEQRF, linear solver (?GETRS), and SMP LINPACK performance on the Intel Xeon Phi processor x200 self-boot platform.
  • ScaLAPACK:
    • Improved performance for hybrid (MPI + OpenMP*) mode of ScaLAPACK and PBLAS.
    • Improved performance of P?GEMM and P?TRSM, resulting in better scalability of the Qbox First-Principles Molecular Dynamics code.
  • Data Fitting:
    • Introduced two new storage formats for interpolation results (DF_MATRIX_STORAGE_SITES_FUNCS_DERS and DF_MATRIX_STORAGE_SITES_DERS_FUNCS).
    • Added Hyman monotonic cubic spline.
    • Improved performance of Data Fitting functionality on the Intel Xeon Phi processor x200.
    • Modified callback APIs to allow users to pass information about integration limits.
  • Vector Mathematics:
    • Introduced optimizations for the Intel Xeon Phi processor x200.
    • Improved performance for Intel Xeon processor E5-xxxx v3 and Intel Xeon processor E5-xxxx v4.
  • Vector Statistics:
    • Introduced additional optimization of SkipAhead method for MT19937 and SFMT19937.
    • Improved performance of Vector Statistics functionality, including Random Number Generators and Summary Statistics, on the Intel Xeon Phi processor x200.
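Of the BLAS changes above, the new packed GEMM interfaces (?gemm_alloc, ?gemm_pack, ?gemm_compute, ?gemm_free) are worth a closer look: they let an application pay the cost of reordering a matrix into MKL's internal layout once and reuse that packed buffer across many multiplications. Below is a minimal sketch of the call pattern only, using hypothetical Python stand-ins for the C routines and NumPy for the arithmetic, not the real MKL API:

```python
import numpy as np

# Illustrative stand-ins for MKL's ?gemm_pack / ?gemm_compute interfaces.
# MKL reorders A into an internal blocked layout; here a contiguous copy
# stands in for the packed buffer, and NumPy does the actual math.

def gemm_pack(a):
    """Pack A once, up front (stand-in for ?gemm_pack)."""
    return np.ascontiguousarray(a, dtype=np.float64)

def gemm_compute(packed_a, b):
    """Multiply using the already-packed A (stand-in for ?gemm_compute)."""
    return packed_a @ b

a = np.arange(6.0).reshape(2, 3)
packed = gemm_pack(a)                       # pay the packing cost once...
c1 = gemm_compute(packed, np.ones((3, 2)))  # ...then reuse it repeatedly
c2 = gemm_compute(packed, np.eye(3))
```

The win over plain S/DGEMM shows up when the same matrix is multiplied against many different partners, for example a fixed weight matrix applied to a stream of minibatches in a fully connected layer.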

Check out the online release notes for more information.
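On the SkipAhead optimization mentioned under Vector Statistics: skip-ahead lets each worker jump its MT19937 stream a huge, fixed distance ahead so that parallel streams drawn from one seed never overlap. A small sketch of the idea using NumPy's MT19937 bit generator, whose jumped() method plays the role of MKL's skip-ahead service routine (this is not MKL code):

```python
import numpy as np
from numpy.random import MT19937, Generator

# Skip-ahead for parallel RNG: every worker starts from the same seed, but
# each successive worker's state is jumped 2**128 steps past the previous
# one, guaranteeing non-overlapping substreams.

seed = 2017
streams = []
bitgen = MT19937(seed)
for _ in range(4):                 # four non-overlapping worker streams
    streams.append(Generator(bitgen))
    bitgen = bitgen.jumped()       # advance 2**128 steps for the next worker

draws = [g.random(3) for g in streams]  # each worker draws independently
```

Because the jump distance vastly exceeds any realistic number of draws per worker, the substreams behave as statistically independent generators, which is what makes this approach attractive for multithreaded Monte Carlo workloads.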
