Visual Efficiency for Intel’s GPUs
At Intel, we are reimagining the future of graphics: making it more nimble, accessible, and power-efficient. As GPU technology rapidly evolves, we are leading the charge to make premium visual experiences available on more GPUs, including low-power and mobile GPUs.
The role of GPUs is undergoing a fundamental shift. The line between discrete and integrated (built-in) GPUs is becoming increasingly blurred thanks to changes in architecture and performance and the use of advanced packaging. Long gone are the days of Intel GPUs being limited to video playback or web browsing: Intel Arc GPUs now support DirectX 12 Ultimate, ray tracing, and neural graphics acceleration across our product lines, scaling high-end gaming capabilities to many more systems.
This transformation is especially critical in a world that’s going mobile. Lightweight laptops and gaming handhelds—like the MSI Claw powered by Intel’s platform codenamed Lunar Lake—are redefining how and where users expect to game. Our mission is to deliver the immersive, high-fidelity visuals traditionally reserved for large desktop systems to these low-power, highly portable devices.
To achieve this, our research is pushing the boundaries of graphics innovation. At SIGGRAPH 2025 and HPG 2025, we are showcasing efficiency improvements for path tracing, neural graphics, and new physically based effects such as fluorescence, all aimed at lower-power GPUs. We are also developing novel compression techniques that dramatically reduce power and bandwidth consumption, key enablers for delivering rich visual experiences on low-power platforms.
Intel is leading a new paradigm: premium graphics everywhere—without compromise.
Efficiency for Heavy Visuals
Heavy visuals, such as path tracing, physically accurate spectral effects, and neural graphics, are premium effects typically available only on high-end gaming platforms. At Intel, we are building stepping stones toward providing these premium visual experiences on more form factors, including platforms with built-in GPUs. In particular, path tracing is a photorealistic rendering method that demands substantial compute power because photon paths are simulated (rendered) using the laws of physics. In low-power settings, we cannot afford to simulate many photons, so the photons to be simulated must be carefully selected (sampled). Even then, the produced images exhibit noise and need to be denoised. This year, Intel is improving both the state-of-the-art sampling and denoising methods for path tracing.
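Why sample selection matters so much can be shown with a toy one-dimensional example (this is an illustration of importance sampling in general, not Intel's sampler): when samples are drawn proportionally to the integrand, the Monte Carlo estimator's variance collapses, so far fewer "photons" are needed for the same quality.

```python
import math
import random

def estimate_uniform(n, rng):
    """Estimate the integral of sin(x) on [0, pi] (true value: 2)
    by drawing sample positions uniformly."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, math.pi)
        total += math.sin(x) * math.pi  # f(x) / p(x), with p(x) = 1/pi
    return total / n

def estimate_importance(n, rng):
    """Same integral, but sample proportionally to the integrand:
    p(x) = sin(x)/2, inverted analytically via x = acos(1 - 2u)."""
    total = 0.0
    for _ in range(n):
        u = rng.random()
        x = math.acos(1.0 - 2.0 * u)  # x is distributed like sin(x)/2
        # f(x)/p(x) = sin(x) / (sin(x)/2) = 2 for every sample:
        total += 2.0
    return total / n

rng = random.Random(7)
print(estimate_uniform(10_000, rng))  # noisy estimate near 2
print(estimate_importance(10, rng))   # exact with only 10 samples
```

The uniform estimator needs thousands of samples to get close to the true value, while the importance-sampled one is exact here because the sampling density matches the integrand perfectly; real path tracers can only approximate this match, which is what sampling research improves.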
Histogram Stratification for Spatio-Temporal Reservoir Sampling
This work, accepted to SIGGRAPH 2025, improves real-time path tracing by enhancing resampled importance sampling. Samples are organized into local histograms, and quasi-Monte Carlo sampling with antithetic patterns is employed, reducing noise with minimal overhead. Combined with blue noise, this method significantly improves visual quality, achieving up to 10× better results.
This work improves on top of the current state of the art used in AAA games like Cyberpunk 2077 and brings high-end experiences closer to our low-power hardware.
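For readers unfamiliar with resampled importance sampling, its streaming building block is weighted reservoir sampling: each pixel sees a stream of candidate light samples and keeps one of them with probability proportional to its weight, in constant memory. The sketch below shows that primitive in isolation (class and variable names are illustrative, not from the paper):

```python
import random

class Reservoir:
    """Minimal single-sample weighted reservoir: after seeing candidates
    (s_i, w_i) one at a time, it holds s_i with probability w_i / sum(w).
    This is the streaming primitive behind resampled importance sampling."""
    def __init__(self, rng):
        self.rng = rng
        self.sample = None
        self.w_sum = 0.0

    def update(self, candidate, weight):
        self.w_sum += weight
        # Replace the held sample with probability weight / w_sum.
        if self.rng.random() < weight / self.w_sum:
            self.sample = candidate

rng = random.Random(42)
weights = {"A": 1.0, "B": 2.0, "C": 4.0}
counts = {k: 0 for k in weights}
for _ in range(40_000):
    r = Reservoir(rng)
    for cand, w in weights.items():
        r.update(cand, w)
    counts[r.sample] += 1
print(counts)  # selection frequencies roughly proportional to 1 : 2 : 4
```

The histogram stratification and quasi-Monte Carlo patterns in the paper replace the plain pseudo-random draws above with better-distributed ones, which is where the noise reduction comes from.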
Intel Open Image Denoise 2 and Beyond: AI-Accelerated Ray Tracing for Everyone
Another important piece of the efficient path tracing puzzle is the denoising algorithm, which rids the image of the remaining noise after the rendering simulation. Since its launch in 2019, Intel Open Image Denoise has become one of the most widely adopted AI-based denoising solutions for path tracing, culminating in recently receiving an Academy Sci-Tech Award for its contribution to filmmaking. The success of the library can be largely attributed to its open-source nature, high performance and image quality, rich cross-vendor support, and simple API, making it unique among denoisers in the industry.
Initially conceived as a CPU denoiser, Intel Open Image Denoise 2 introduced optimized cross-vendor support for all major GPUs and enabled real-time performance on various hardware architectures. The next major version, now under development, aims to bring both quality and performance to a new level by employing a more efficient neural network architecture, temporal denoising, and many other improvements. In his HPG 2025 talk, Attila T. Afra will delve into the technical details of both current and upcoming features of the next-generation Intel Open Image Denoise.
Fluorescent Material Model for Non-Spectral Editing & Rendering
While pursuing efficiency in existing effects such as path tracing is important to us, we are also enabling new effects, such as fluorescence, for efficient rendering. This work, accepted to SIGGRAPH 2025, presents a new analytical method for rendering and editing fluorescent materials in non-spectral engines. Unlike previous approaches that rely on stored data for each material, this method uses a Gaussian-based model to represent fluorescence, allowing for real-time editing and dynamic variation. It enables realistic rendering, especially with UV input, and even supports the easy creation of plausible fluorescent materials from basic color inputs.
With this paper, we continue our efforts to simulate complex optical effects without resorting to fully spectral simulations, which are much more expensive to compute than what we propose. In contrast, our method simulates fluorescent effects in real time, including on a built-in GPU.
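To give a flavor of why a Gaussian spectrum is attractive for non-spectral engines: the fraction of emitted energy landing in each RGB band has a closed form via the error function, so no per-material spectral tables are needed. The toy below is a deliberately simplified illustration (box-shaped RGB bands, single emission Gaussian), not the paper's actual model:

```python
import math

def gaussian_band_energy(mu, sigma, lo, hi):
    """Fraction of a Gaussian emission spectrum (mean mu nm, width sigma nm)
    that falls in the wavelength band [lo, hi] nm, in closed form."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((hi - mu) / s) - math.erf((lo - mu) / s))

def fluoresce(absorbed_uv_energy, quantum_yield, mu, sigma):
    """Split re-emitted energy into coarse R/G/B wavelength bands."""
    emitted = absorbed_uv_energy * quantum_yield
    bands = {"b": (400.0, 500.0), "g": (500.0, 600.0), "r": (600.0, 700.0)}
    return {c: emitted * gaussian_band_energy(mu, sigma, lo, hi)
            for c, (lo, hi) in bands.items()}

# A UV-excited material glowing green: emission peak at 520 nm.
rgb = fluoresce(absorbed_uv_energy=1.0, quantum_yield=0.8, mu=520.0, sigma=30.0)
print(rgb)  # the green band dominates
```

Because the whole evaluation is a handful of `erf` calls, such a model can run per-shading-point in real time, which is the efficiency argument behind an analytical representation.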
CGVQM+D: Computer Graphics Video Quality Metric and Dataset
When pursuing efficiency, it is important to ensure that the perceived visual quality remains the same. Measuring the quality of visuals as perceived by humans is a very challenging but important problem: beyond maintaining the quality bar, a good measure can also suggest where computational efficiency can be gained without lowering the perceived quality.
While existing video and image quality datasets have extensively studied natural videos and traditional distortions, the perception of synthetic content and modern rendering artifacts remains underexplored. We present a novel video quality dataset focused on distortions introduced by advanced rendering techniques, including neural supersampling, novel-view synthesis, path tracing, neural denoising, frame interpolation, and variable-rate shading. Our evaluations show that existing full-reference quality metrics perform sub-optimally on these distortions, with a maximum Pearson correlation of 0.78. Additionally, we find that the feature space of pre-trained 3D CNNs aligns strongly with human perception of visual quality. We propose CGVQM, a full-reference video quality metric that significantly outperforms existing metrics while generating both per-pixel error maps and global quality scores. The dataset and metric source code will be made available to the public.
Accelerating Volumetric Path Tracing with Tetrahedral Grids
In a SIGGRAPH 2025 short talk, we present the use of tetrahedral grids constructed via the longest-edge bisection algorithm for rendering volumetric data with path tracing. The key benefits of such grids are twofold. First, they provide a highly adaptive space-partitioning representation that limits the memory footprint of volumetric assets. Second, each tetrahedral cell has exactly four neighbors within the volume (one per face) or fewer at boundaries. We leverage these properties to devise optimized algorithms and data structures to compute and path-trace adaptive tetrahedral grids efficiently on the GPU. In practice, our GPU implementation outperforms regular grids by up to 30× and renders production assets in real time at 32 samples per pixel, providing another efficient building block for running complex effects on mainstream and built-in GPUs.
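Longest-edge bisection is easiest to picture in 2D, where it splits a triangle at the midpoint of its longest edge; the tetrahedral case works analogously on edges of tetrahedra. The sketch below shows the 2D analogue only, as an intuition aid rather than the talk's actual data structure:

```python
def longest_edge_bisect(tri):
    """Split a triangle at the midpoint of its longest edge, returning two
    children. This is the 2D analogue of the longest-edge bisection used
    to build adaptive tetrahedral grids; recursing on selected children
    yields an adaptive partition of space."""
    a, b, c = tri

    def d2(p, q):  # squared edge length
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Rotate vertex labels so (a, b) is the longest edge.
    edges = [(d2(a, b), (a, b, c)), (d2(b, c), (b, c, a)), (d2(c, a), (c, a, b))]
    _, (a, b, c) = max(edges, key=lambda e: e[0])
    m = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)  # midpoint of longest edge
    return (a, m, c), (m, b, c)

def area(tri):
    (ax, ay), (bx, by), (cx, cy) = tri
    return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0

root = ((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
left, right = longest_edge_bisect(root)
print(area(left), area(right), area(root))  # the two children halve the parent
```

Each split halves a cell exactly, so refinement can be driven locally by the volume data, which is what keeps the memory footprint adaptive.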
Efficient Compression for Low-power GPUs
Hardware-accelerated Texture Set Neural Compression (TSNC) with DirectX Cooperative Vectors
In this work, we present an extension to the neural texture compression method of Weinreich and colleagues [WDOHN24]. Like them, we leverage existing block compression methods, which permit the use of hardware texture filtering, to store a neural representation of physically based rendering (PBR) texture sets (including albedo, normal maps, roughness, etc.). However, we show that low dynamic range block compression formats still make the solution viable. Thanks to this, we can achieve higher compression ratios, or higher quality at a fixed compression ratio. We improve runtime performance using a tile-based rendering architecture that leverages the hardware matrix multiplication engine found on modern GPUs. Our neural texture compression technology allows us to render 4K texture sets (9 channels per asset) with anisotropic filtering at 1080p using only 28MB of VRAM per texture set. Performance on Intel built-in and discrete GPUs is as follows:
- Intel® Arc™ 140V (Lunar Lake): 2.6ms (BC6 baseline) / 2.1ms (TSNC with Cooperative Vectors)
- Intel® Arc™ B580 (Battlemage): 0.55ms (BC6 baseline) / 0.55ms (TSNC with Cooperative Vectors)
Here, TSNC performs on par or better than the regular BC6 compression at a fraction of the texture memory footprint required.
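To put the 28MB figure in perspective, here is a back-of-the-envelope sizing of the same texture set stored conventionally. The assumptions (8 bits per channel, a full mip chain adding roughly a third) are ours, not from the paper; only the 28MB footprint comes from the text:

```python
MB = 1024 * 1024
width = height = 4096          # 4K texture set
channels = 9                   # albedo, normals, roughness, etc.

bytes_uncompressed = width * height * channels   # assuming 8 bits per channel
bytes_with_mips = bytes_uncompressed * 4 // 3    # full mip chain ~= 4/3 overhead
tsnc_bytes = 28 * MB                             # footprint quoted in the text

ratio = bytes_with_mips / tsnc_bytes
print(f"uncompressed + mips: {bytes_with_mips / MB:.0f} MB, "
      f"TSNC: 28 MB, ratio: {ratio:.1f}x")
```

Under these assumptions the conventional storage lands near 192MB, making the 28MB footprint roughly a 7× saving before even comparing against classic block-compressed formats.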
Image-GS: Content-Adaptive Image Representation via 2D Gaussians
Image-GS, another paper accepted to SIGGRAPH 2025, is an efficient, explicit, and content-adaptive image representation based on anisotropic 2D Gaussians. It delivers remarkable visual quality and memory efficiency while supporting fast random access and a natural level-of-detail stack.
While still early work, an alternative representation for textures can pave the way for the next generation of bandwidth and power improvements on built-in GPUs, allowing higher-resolution visuals and longer battery life.
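The core operation of such a representation is cheap: reconstructing a pixel means summing a few weighted anisotropic Gaussians, each a 2×2 quadratic form and one exponential. The sketch below shows that evaluation for a single splat (a simplified illustration of the idea, not the paper's implementation):

```python
import math

def eval_gaussian_2d(px, py, mu, cov, weight):
    """Evaluate one anisotropic 2D Gaussian splat at pixel (px, py):
    weight * exp(-0.5 * d^T * inv(cov) * d), where d = p - mu."""
    (a, b), (c, d) = cov                       # 2x2 covariance, b == c
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det   # inverse
    dx, dy = px - mu[0], py - mu[1]
    m = dx * (ia * dx + ib * dy) + dy * (ic * dx + id_ * dy)
    return weight * math.exp(-0.5 * m)

def reconstruct(px, py, gaussians):
    """An Image-GS-style image is a sum of weighted Gaussians."""
    return sum(eval_gaussian_2d(px, py, mu, cov, w) for mu, cov, w in gaussians)

# One elongated splat: anisotropic, wide along x and narrow along y.
splats = [((8.0, 8.0), ((9.0, 0.0), (0.0, 1.0)), 0.7)]
print(reconstruct(8.0, 8.0, splats))   # at the center: the full weight
print(reconstruct(11.0, 8.0, splats))  # 3 px away along the wide axis
print(reconstruct(8.0, 11.0, splats))  # 3 px away along the narrow axis
```

Anisotropy is what makes the representation content-adaptive: elongated splats follow edges and gradients, so fewer primitives are needed for the same fidelity, which is where the bandwidth savings come from.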
Beyond Publications
In addition to scientific papers, our team has many exciting updates, such as courses and awards, as well as our progress towards efficient real-time path tracing.
OpenPGL Talk in Path Guiding in Production and Recent Advancements
Adaptive path sampling algorithms like path guiding enable efficient rendering of complex lighting effects essential for high-fidelity imagery. We are organizing this new course together with industry leaders. It explores how these algorithms, focusing on local importance sampling, are integrated into leading production renderers such as Cycles, V-Ray, Corona, Karma, and Hyperion, highlighting both full-scene and effect-specific guiding strategies while addressing often-overlooked implementation challenges.
Fluorescence Talk in Physically Based Shading Course
The paper on Fluorescence will also be presented at the SIGGRAPH course “Physically Based Shading in Theory and Practice”, where Laurent Belcour will discuss practical aspects of the implementation and integration considerations.
SIGGRAPH Test of Time Award for Intel Embree
Intel Embree, a foundational, Oscar-winning ray tracing library, has received the ACM SIGGRAPH Test of Time Award this year. This prestigious award recognizes highly influential papers published at SIGGRAPH conferences that have made a significant impact over the past 10 years or more. The work was first published at SIGGRAPH 2014 as the paper “Embree: A Kernel Framework for Efficient CPU Ray Tracing” by Ingo Wald, Sven Woop, Carsten Benthin, Gregory Johnson, and Manfred Ernst. We congratulate the team on this amazing award, which once again emphasizes the importance and strong foundation of graphics at Intel.
Technical Oscar Award for Intel Open Image Denoise
Intel Open Image Denoise has won this year’s Academy Scientific and Technical Award with the following recognition: “Open Image Denoise is an open-source library that provides an elegant API and runs on diverse hardware, leading to broad industry adoption. Its core technology is provided by the widely adopted U-Net architecture that improves efficiency and preserves detail, raising the quality of CG imagery across the industry.”
Academy Software Foundation (ASWF) Open Source Days 2025
At this year’s ASWF Open Source Days, co-located with SIGGRAPH, Intel will showcase advancements in SYCL—an open standard for cross-vendor GPU programming—highlighting its potential for the VFX industry to write single-source, portable code that runs efficiently on GPUs from different vendors. The presentation includes a case study using Blender’s Cycles renderer with Intel’s oneAPI SYCL backend, demonstrating performance that closely matches native CUDA and HIP backends, making SYCL a compelling “write once, deploy many” solution for GPU-accelerated workloads.
Conclusion
Our mission is to bring high-quality visual experiences to more platforms, including low-power form factors. Meanwhile, we hope gamers will appreciate our mobile and handheld offerings based on Intel Core Ultra Series 2 with built-in Arc GPU, our discrete GPU flagship Arc B-Series products and our next generation platforms. We are motivated to keep moving the needle in visual efficiency to enable the best possible imagery on our discrete and built-in GPUs, including technologies previously only available on high-end GPUs, such as Texture Set Neural Compression.