oneAPI 2023.0 release is AVAILABLE NOW!
Optimized, Standards-Based Support for Powerful New Architectures
The latest oneAPI and AI 2023 tools continue to empower developers with multi-architecture performance and productivity, delivering optimized support for Intel’s upcoming portfolio of CPU and GPU architectures and advanced capabilities:
- 4th Gen Intel® Xeon® Scalable Processors (formerly codenamed Sapphire Rapids) with Intel® Advanced Matrix Extensions (Intel® AMX), Intel® QuickAssist Technology (Intel® QAT), Intel® AVX-512, bfloat16, and more
- Intel® Xeon® Processor Max Series high-bandwidth memory
- Intel® Data Center GPUs, including the Flex Series with hardware AV1 encode and the Max Series (formerly codenamed Ponte Vecchio) with datatype flexibility, Intel® Xe Matrix Extensions (Intel® XMX), vector engine, Intel® Xe Link, and other features
- Existing Intel® CPUs, GPUs, and FPGAs
The Highlights: What’s New in the 2023 oneAPI and AI Tools for Linux* and Windows*
Compilers & SYCL Support
- Intel® oneAPI DPC++/C++ Compiler improves CPU and GPU offload performance and broadens SYCL language support for improved code portability and productivity.
- Intel® oneAPI DPC++ Library expands support of the C++ standard library in SYCL kernels with additional heap and sorting algorithms and adds the ability to use OpenMP for thread-level parallelism.
- Intel® DPC++ Compatibility Tool (based on the open source SYCLomatic project) improves migration of CUDA library APIs, including those for runtime and drivers, cuBLAS, and cuDNN.
- Intel® Fortran Compiler implements coarrays, eliminating the need for external APIs such as MPI or OpenMP, expands OpenMP 5.x offloading features, adds DO CONCURRENT GPU offload, and improves optimizations for source-level debugging.
- Intel® oneAPI Math Kernel Library (oneMKL) increases CUDA library function API compatibility coverage for BLAS and FFT. For the Intel Data Center GPU Max Series, oneMKL leverages Intel® XMX to optimize matrix multiply computations for TF32, FP16, BF16, and INT8 data types. oneMKL also provides interfaces for SYCL and C/Fortran OpenMP offload programming.
- Intel® oneAPI Threading Building Blocks improves support and use of the latest C++ standard for parallel_sort, offers an improved synchronization mechanism to reduce contention when multiple task_arena calls are used concurrently, and adds support for Microsoft Visual Studio 2022 and Windows Server 2022.
- Intel® oneAPI Video Processing Library supports the industry’s only hardware AV1 codec in the Intel Data Center GPU Flex Series and Intel® Arc™ processors, expands OS support to RHEL9, CentOS Stream 9, SLES15 SP4, and Rocky 9 Linux, and adds a parallel encoding feature to sample_multi_transcode.
Analysis & Debug
- Intel® VTune™ Profiler can now identify MPI imbalance issues via its Application Performance Snapshot feature and adds support for Sapphire Rapids, Ponte Vecchio, and 13th Gen Intel® Core™ processors.
- Intel® Advisor adds automated roofline analysis for the Intel Data Center GPU Max Series to identify and prioritize memory, cache, or compute bottlenecks and understand their causes, and delivers actionable recommendations for optimizing the data-transfer reuse costs of CPU-to-GPU offloading.
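The roofline analysis mentioned above rests on a simple model: attainable performance is capped either by peak compute or by memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). The sketch below shows that generic model only. It is not Intel Advisor's implementation, and the `Roofline` struct, the function names, and any numbers plugged into them are hypothetical and chosen for illustration.

```cpp
#include <algorithm>

// Hypothetical machine description for the generic roofline model.
// Values would come from vendor specs or measurement; none are assumed here.
struct Roofline {
    double peak_gflops;         // compute ceiling, in GFLOP/s
    double peak_bandwidth_gbs;  // memory ceiling, in GB/s
};

// Attainable performance for a kernel with the given arithmetic
// intensity (FLOPs per byte): the lower of the two ceilings.
double attainable_gflops(const Roofline& machine, double flops_per_byte) {
    return std::min(machine.peak_gflops,
                    machine.peak_bandwidth_gbs * flops_per_byte);
}

// Arithmetic intensity at which the memory and compute ceilings meet.
double ridge_point(const Roofline& machine) {
    return machine.peak_gflops / machine.peak_bandwidth_gbs;
}

// Kernels to the left of the ridge point are limited by memory bandwidth.
bool is_memory_bound(const Roofline& machine, double flops_per_byte) {
    return flops_per_byte < ridge_point(machine);
}
```

For a hypothetical device with a 1000 GFLOP/s compute ceiling and a 100 GB/s memory ceiling, a kernel performing 2 FLOPs per byte sits left of the ridge point (10 FLOPs per byte) and is therefore memory bound in this model.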
AI and Analytics
- Intel® AI Analytics Toolkit can now be run natively on Windows with full parity to Linux except for distributed training (GPU support is coming in early 2023).
- Intel® oneAPI Deep Neural Network Library further supports delivery of superior CNN performance by enabling advanced features in 4th Gen Intel Xeon Scalable Processors, including Intel AMX, AVX-512, VNNI, and bfloat16.
- Intel® Distribution of Modin integrates with the new Heterogeneous Data Kernels (HDK) solution in the back end, enabling AI solutions to scale from low-compute resources to large or distributed compute resources.
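For readers unfamiliar with bfloat16, which appears in the hardware and library notes above: the format keeps the sign bit and all 8 exponent bits of an IEEE 754 single-precision float but only the top 7 mantissa bits, so a bfloat16 value is the upper 16 bits of the corresponding float32. The following is a minimal software sketch of that conversion using round-to-nearest-even. The function names are hypothetical, and this is not how oneDNN or the hardware performs the conversion; it only illustrates the bit layout.

```cpp
#include <cstdint>
#include <cstring>

// Convert a float32 to bfloat16 bits (hypothetical helper, illustration only).
// Rounds to nearest, with ties going to the even 16-bit pattern.
std::uint16_t float_to_bf16(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // safe type punning

    const bool is_nan = (bits & 0x7F800000u) == 0x7F800000u &&
                        (bits & 0x007FFFFFu) != 0;
    if (is_nan) {
        // Truncate and force a mantissa bit so the result stays NaN.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }

    // Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    const std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    bits += rounding_bias;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Expand bfloat16 bits back to float32 by zero-filling the low 16 bits.
float bf16_to_float(std::uint16_t half) {
    const std::uint32_t bits = static_cast<std::uint32_t>(half) << 16;
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the exponent field is unchanged, bfloat16 covers the same range as float32 while giving up precision: converting 3.14159f and back yields 3.140625f in this sketch.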
The Highlights: What’s New in the 2023 oneAPI for macOS*
- Intel® oneAPI Threading Building Blocks (oneTBB) improves parallel_sort support for the latest C++ standard, so the algorithm can be used with user-defined and standard library-defined objects that have modern semantics.
- oneTBB now offers an improved synchronization mechanism that reduces contention when multiple task_arena instances are used concurrently, allowing arenas to be more independent and more of them to execute concurrently without performance impact.
- The latest Intel® Integrated Performance Primitives (Intel® IPP) release adds optimizations for the zlib 1.2.13 lossless compression method in Intel® IPP Data Compression. These optimizations help improve the quality and speed of compression and decompression in various data compression applications.
- The Intel® C++ Compiler Classic has been updated to include recent versions of third-party components, which include functional and security updates.
- The Intel® Fortran Compiler Classic has been updated to include recent versions of third-party components, which include functional and security updates.