oneAPI Registration, Download, Licensing and Installation
Support for getting-started questions related to download, installation, and licensing for Intel oneAPI Toolkits and software development tools.

AVAILABLE NOW! Intel Software Developer Tools 2025.1

Devorah_H_Intel
Moderator

Intel Software Developer Tools 2025.1 Highlights

AI Tools

  • Intel® Extension for PyTorch* upgraded to version 2.6
  • Intel® Neural Compressor upgraded to version 3.2
  • ONNX runtime updated to 1.20.1
  • JAX updated to 0.5.2

Intel® oneAPI Base Toolkit

  • The Intel® oneAPI DPC++/C++ Compiler, which already offers CPU MemorySanitizer support, now extends this capability to the device side, including GPUs. This enhancement allows you to easily detect and troubleshoot issues in both CPU and device code, ensuring more reliable applications.
  • The Intel® oneAPI DPC++/C++ Compiler now supports ccache* to significantly speed up your build times. By caching previous compilations and reusing them, developers can experience faster iterations and more efficient workflows, allowing you to focus on writing high-quality code rather than waiting for builds.
  • The Intel® oneAPI DPC++/C++ Compiler's code coverage tool now includes GPU support and enhanced CPU coverage for applications using C/C++, SYCL, and OpenMP. It offers you detailed analysis and comprehensive HTML reports to identify tested and untested code sections, ultimately improving test coverage and code quality while ensuring easy integration into workflows.
  • The Intel® oneAPI DPC++/C++ Compiler’s integrated support for Altera FPGA has been removed as of the 2025.1 release. Altera* will continue to provide FPGA support through their dedicated FPGA software development tools. Existing customers can continue to use the Intel® oneAPI DPC++/C++ Compiler 2025.0 release which supports FPGA development and is available through Linux* package managers such as APT, YUM/DNF, or Zypper. Additionally, customers with an active support license can access the Intel® oneAPI DPC++/C++ Compiler 2025.0 via their customer support account.
    For more information and assistance with transitioning to the Altera development tools, please contact your Altera representative.
  • GPU kernels run up to 3x faster for algorithms including copy, transform, order-changing, generation, and set operations with Intel® oneAPI DPC++ Library (oneDPL), which also adds support for the copy, transform, and merge range-based C++20 standard algorithms.
  • Developers now have more random number distribution choices with the addition of geometric distributions to the already long list supported by the Intel oneAPI Math Kernel Library.
  • Developers can experience faster workload execution via Fast Fourier Transform performance improvements for certain cases on Intel discrete and integrated GPUs.
  • Intel® oneAPI Threading Building Blocks (oneTBB) enables parallel processing of tensors of any dimensionality with the new blocked_nd_range feature, expanding support from 3 to N dimensions.
  • Intel® VTune™ Profiler
    • Identify performance bottlenecks of AI workloads that are calling DirectML or WinML APIs.
    • Understand the overall accelerator performance by seeing GPU and NPU offload bottlenecks in one view.
    • Pinpoint the most time-consuming code sections and critical code paths for Python 3.12.
  • Easily migrate CUDA code to SYCL with the Intel® DPC++ Compatibility Tool, which now automatically migrates 158 more APIs used by popular AI and accelerated-computing apps.
  • Gaming, graphics, and content creation developers and ISVs can deliver real-time visual AI and gaming experiences when using C++ with SYCL for GPU acceleration.
    • Enhanced SYCL interoperability with Vulkan and DirectX 12 enables sharing of image texture map data directly from the GPU, eliminating extra image copying between CPU and GPU, ensuring seamless performance in image processing and advanced rendering applications, and boosting content creation productivity.
  • The Intel Distribution for GDB* rebases to GDB* 15.2, staying current and aligned with the latest enhancements for effective application debugging.
  • The Intel Distribution for GDB* now includes scheduler-locking IDE options for stepping by default in VSCode* when debugging on a Linux* machine, providing more precise control and an enhanced debugging experience.
  • Intel® Distribution for GDB* adds support for Intel® Core™ Ultra processors (Series 2) and Intel® Arc™ B-Series Graphics on Linux*, in addition to the existing Windows support, allowing developers to efficiently debug application code on these new CPUs and GPUs.
  • Boost your image processing applications with Intel® Integrated Performance Primitives' in-place MulC function, which now supports image sizes up to 512 megapixels for high-performance, low-memory-overhead image scaling.

  • Optimize your 5G signal processing with Intel® Integrated Performance Primitives' enhanced Discrete Fourier Transform (DFT), specifically tuned for the critical demands of cellular systems

  • Develop more secure and efficient applications using the Intel® Cryptography Primitives Library's SM4 and SHA-512 algorithms, optimized for Intel® Core™ Ultra 200V, Intel® Core™ Ultra 200S, and upcoming Intel® Xeon® Scalable processors.
  • Developers can enhance matrix multiplication and convolution performance with oneDNN, optimized for Intel® Xeon® processors. These improvements leverage the Intel AMX instruction set, making it ideal for datacenter AI workloads, ensuring your applications run efficiently on the latest Intel architectures.
  • Developers running AI inference on client CPUs can unlock improved performance with oneDNN on Intel Arc Graphics. This optimization maximizes the capabilities of Core Ultra processors (Series 2) and Intel Arc B-series discrete graphics, ensuring your AI applications execute with greater speed and efficiency, enhancing your development efforts.
  • Developers can optimize AI models with improved performance for Gated Multi-Layer Perceptron (Gated MLP) and Scaled Dot-Product Attention (SDPA) with implicit causal mask using oneDNN. Support for int8 or int4 compressed key and value through the Graph API enhances both speed and efficiency, enabling more powerful and responsive AI applications.
  • Developers are able to utilize collectives in more operations with added support for Average in the Allreduce and Reduce-Scatter collectives in oneCCL.
  • oneCCL gives developers more control over collective communications with extensions to the Group API supporting collective operations and a new API to split a communicator.
  • Users are able to scale up more effectively with oneCCL performance optimizations for Alltoall.

Intel® oneAPI HPC Toolkit

  • The Intel HPC Toolkit contains the updates included in the 2025.1 Intel oneAPI Base Toolkit release, plus the following:

    • The Intel® Fortran Compiler now extends its CPU MemorySanitizer support to the device side, including GPUs. This enhancement allows Fortran developers to easily detect and troubleshoot issues in both CPU and device code, ensuring more reliable and robust applications.
    • The Intel® Fortran Compiler expands its OpenMP 6.0 standard support by introducing the WORKDISTRIBUTE construct to efficiently distribute work across threads and the INTERCHANGE construct to reorder loops in a loop nest, boosting parallel performance and code optimization.
    • The Intel® Fortran Compiler enhances its Fortran 23 support by ensuring consistent kind types for integer arguments in the SYSTEM_CLOCK intrinsic and allowing PUBLIC NAMELIST groups to include PRIVATE variables, providing developers with improved conformance to the Fortran 23 language standard and greater code flexibility.
    • Intel MPI Library applications can now take advantage of new performance tuning for Intel® Xeon® 6 Processors with Performance-Cores (P-Cores), including thread split hand-off for scale-up improvements, CPU inference optimizations for DeepSpeed, and optimizations for point-to-point shared memory operations.
    • The Intel MPI Library's default pinning algorithms have been improved for better out-of-the-box resource utilization, with benefits for homogeneous and hybrid-architecture systems.

    • Intel MPI Library now supports device-initiated MPI-RMA functions on supported GPUs in advance of the MPI Standard.
    • NEW PRODUCT: Intel® SHMEM!
      • Developers have access to a complete Intel SHMEM specification detailing the programming model and supported API calls, with example programs, build and run instructions, and more.
      • Developers using Intel SHMEM can target both device and host with OpenSHMEM 1.5 and 1.6 features including point-to-point Remote Memory Access (RMA), Atomic Memory Operations (AMO), Signaling, Memory Ordering, Teams, Collectives, Synchronization operations, and strided RMA operations.
      • Developers are able to utilize Intel SHMEM and SYCL with API support on device for SYCL work-group and sub-group level extensions of RMA, Signaling, Collective, Memory Ordering, and Synchronization operations and API support on host for SYCL queue ordered RMA, Collective, Signaling, and Synchronization operations.
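The Intel SHMEM bullets above describe an OpenSHMEM-style partitioned global address space model. As a rough pseudocode sketch of that model's put-and-synchronize pattern, using the standard OpenSHMEM 1.5 host API names for illustration only (Intel SHMEM's actual identifiers and SYCL device-side extensions should be taken from the Intel SHMEM specification):

```
/* Pseudocode: classic OpenSHMEM-style RMA pattern that the Intel SHMEM
   bullets generalize to SYCL device code. Names are OpenSHMEM 1.5. */
shmem_init()                       /* join the job's processing elements (PEs) */
dest = shmem_malloc(N ints)        /* symmetric allocation, remotely accessible */
if (my_pe == 0)
    shmem_put(dest, src, N, 1)     /* RMA: write src directly into PE 1's dest  */
shmem_barrier_all()                /* complete and order the transfer            */
/* PE 1 can now read dest locally */
shmem_free(dest)
shmem_finalize()
```

Per the bullets above, Intel SHMEM lets both host code and SYCL kernels issue such RMA, atomic, signaling, and collective operations, including work-group and sub-group level variants on device.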

Download Intel® oneAPI Toolkits TODAY!
