CPU Overload Despite Having an iGPU: Here's Why

Vibhu-Bithar · ‎06-08-2026

How to build efficient visual analytics pipelines that leverage the iGPU and free up your CPU

Author: Vibhu Bithar, Lead Platform Architect, Health & Cities Division, Intel
Date: November 2025

The Challenge of Real-Time Video Analytics

Picture this: you’ve got multiple live RTSP camera feeds streaming 24/7, and you need to detect objects in real time. Every frame counts, and your CPU is running hot and maxed out trying to keep up.

Meanwhile, your integrated GPU (iGPU) is barely breaking a sweat. Why? Because decoding, preprocessing, and inference are often handled inefficiently on the CPU alone, leaving the iGPU underutilized.

That’s where Intel® Deep Learning Streamer (DL Streamer) comes in. It’s a powerful framework designed to orchestrate optimized pipelines for real-time computer vision analytics, using all the components of your processor, including the iGPU, NPU, and dedicated hardware decoders. This moves the specialized heavy lifting required for AI to the iGPU so your CPU stays available for other tasks.

What Is Intel® Deep Learning Streamer (DL Streamer)?

DL Streamer is an open-source, GStreamer-based framework for building real-time video analytics applications on Intel® hardware.

It breaks the complex tasks of video decoding, inference, and post-processing into modular pipeline components. DL Streamer integrates tightly with the OpenVINO™ Toolkit and supports hardware-accelerated video decode/encode via VA-API, preprocessing via OpenCV / DPC++, and optimized inferencing on Intel® CPUs, iGPUs, xPUs, and NPUs. (GitHub)

Key Benefits

Hardware acceleration: Optimized for Intel® Core™, Xeon®, Arc™, and Data Center GPU Flex Series devices¹
Seamless integration: Works with the OpenVINO™ toolkit for efficient inference
Scalable design: Supports multiple simultaneous RTSP feeds
Flexible architecture: Modular GStreamer elements for custom AI pipelines

In short, DL Streamer helps developers focus on outcomes, not boilerplate code.

Why DL Streamer Matters

Modern AI workloads demand real-time performance, low latency, and efficient hardware utilization, especially at the edge. DL Streamer helps achieve this by moving decoding, preprocessing, and inference workloads onto the iGPU.

Why You Should Care

Efficiency: Offload CPU-heavy tasks to iGPU
Performance: Reduce CPU usage and memory transfers
Simplicity: Build complex pipelines with just a few GStreamer elements
Scalability: Support multiple camera feeds with minimal tuning

The Common Problem: CPU Overload Despite Having an iGPU

Let’s explore how a simple RTSP object detection pipeline evolves, from CPU-bound to fully iGPU accelerated.

Pipeline 1: CPU-Only (Baseline)

rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 ! decodebin ! videorate ! videoconvert ! video/x-raw,format=BGR,framerate=20/1 ! videoscale ! video/x-raw,width=640,height=480 ! gvadetect device=CPU model-instance-id=detect1 inference-interval=1 model=/home/models/object_detection/ITS_CL_FP16/openvino.xml ! autovideosink sync=true

In this version, the entire decoding and inference workload runs on the CPU. It works, but your CPU does everything. Expect high utilization and latency.
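The pipeline above is a GStreamer pipeline description, so one way to try it is to hand it to the `gst-launch-1.0` command-line tool. The following is a minimal Python sketch of that, assuming GStreamer and the DL Streamer plugins are installed and on the path. The helper name `gst_launch_argv` is hypothetical, the camera URL and model path are the article's placeholders, and the sink is written as a plain `autovideosink` for simplicity.

```python
import shlex

# Placeholders taken from the article; replace with a real camera URL and model.
RTSP_URL = "rtsp://XX.XX.XX.XX:554/yourcam1"
MODEL = "/home/models/object_detection/ITS_CL_FP16/openvino.xml"

# CPU-only baseline from the article (sink simplified to plain autovideosink).
BASELINE = (
    f"rtspsrc location={RTSP_URL} ! decodebin ! videorate ! videoconvert ! "
    "video/x-raw,format=BGR,framerate=20/1 ! videoscale ! "
    "video/x-raw,width=640,height=480 ! "
    f"gvadetect device=CPU model-instance-id=detect1 inference-interval=1 model={MODEL} ! "
    "autovideosink"
)


def gst_launch_argv(description: str) -> list[str]:
    """Build an argv list that runs `description` with gst-launch-1.0."""
    # Passing a list (not a shell string) to subprocess.run means characters
    # such as '!' and '(' are never interpreted by a shell.
    return ["gst-launch-1.0", *shlex.split(description)]


argv = gst_launch_argv(BASELINE)
print(shlex.join(argv))  # shell-safe form of the command, for inspection

# To actually run it (requires GStreamer and DL Streamer):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Building the command as a list sidesteps shell quoting, which matters for caps strings that contain parentheses.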

Pipeline 2: Inference on GPU

rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 latency=15 ! decodebin ! videorate ! videoconvert ! video/x-raw,format=BGR,framerate=20/1 ! videoscale ! video/x-raw,width=640,height=480 ! gvadetect device=GPU model-instance-id=detect1 inference-interval=1 model=/home/models/object_detection/ITS_CL_FP16/openvino.xml ! autovideosink sync=true

Inferencing now runs on the iGPU, but decoding still runs on the CPU. This creates unnecessary back-and-forth between CPU and GPU memory, reducing the performance gains you’d expect.
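A quick way to see why decoding is unaffected is to compare Pipelines 1 and 2 element by element. The sketch below is illustrative only: the helper names are hypothetical, and the sink is left out because it is the same in both pipelines.

```python
MODEL = "/home/models/object_detection/ITS_CL_FP16/openvino.xml"

PIPELINE_1 = (
    "rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 ! decodebin ! videorate ! "
    "videoconvert ! video/x-raw,format=BGR,framerate=20/1 ! videoscale ! "
    "video/x-raw,width=640,height=480 ! "
    f"gvadetect device=CPU model-instance-id=detect1 inference-interval=1 model={MODEL}"
)
PIPELINE_2 = (
    "rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 latency=15 ! decodebin ! videorate ! "
    "videoconvert ! video/x-raw,format=BGR,framerate=20/1 ! videoscale ! "
    "video/x-raw,width=640,height=480 ! "
    f"gvadetect device=GPU model-instance-id=detect1 inference-interval=1 model={MODEL}"
)


def split_elements(description):
    """Split a pipeline description into its '!'-separated elements."""
    return [part.strip() for part in description.split(" ! ")]


def changed_properties(before, after):
    """List (element name, differing tokens) for elements that changed."""
    changes = []
    for old, new in zip(split_elements(before), split_elements(after)):
        if old != new:
            name = old.split()[0]
            # Symmetric difference: tokens present in one version only.
            diff = sorted(set(old.split()) ^ set(new.split()))
            changes.append((name, diff))
    return changes


for name, diff in changed_properties(PIPELINE_1, PIPELINE_2):
    print(name, diff)
# rtspsrc ['latency=15']
# gvadetect ['device=CPU', 'device=GPU']
```

Only `rtspsrc` and `gvadetect` differ. The `decodebin`, `videoconvert`, and `videoscale` stages are identical in both pipelines, so the work ahead of inference is set up exactly as in the CPU-only baseline.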

Pipeline 3: Full GPU Acceleration (Decode + Preprocess + Inference)

rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 ! decodebin3 ! videorate ! video/x-raw(memory:VAMemory),framerate=15/1 ! vapostproc ! video/x-raw(memory:VAMemory),width=1920,height=1080 ! gvadetect device=GPU model-instance-id=detect1 inference-interval=1 model=/home/models/object_detection/ITS_CL_FP16/openvino.xml pre-process-backend=va-surface-sharing ! autovideosink sync=true

In this optimized version, every stage of the RTSP stream (decode, preprocess, and inference) runs on the GPU, minimizing latency and freeing your CPU almost entirely.
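One visible difference in Pipeline 3 is that its caps filters carry the `(memory:VAMemory)` feature, which asks GStreamer to keep frames in VA-API surfaces rather than in system memory. As a rough sketch, the caps filters of Pipelines 2 and 3 can be compared as shown below. The helper `caps_memory` is hypothetical and uses simple string matching, not GStreamer's own caps parser.

```python
import re

# Matches raw-video caps and captures an optional (memory:...) feature.
CAPS_FEATURE = re.compile(r"^video/x-raw(?:\(memory:(\w+)\))?")


def caps_memory(description: str) -> list[str]:
    """Return the memory type requested by each raw-video caps filter."""
    memory_types = []
    for element in description.split(" ! "):
        match = CAPS_FEATURE.match(element.strip())
        if match:
            # Caps with no explicit feature refer to ordinary system memory.
            memory_types.append(match.group(1) or "SystemMemory")
    return memory_types


# Middle sections of Pipelines 2 and 3 from the article.
PIPELINE_2_PART = (
    "videoconvert ! video/x-raw,format=BGR,framerate=20/1 ! "
    "videoscale ! video/x-raw,width=640,height=480"
)
PIPELINE_3_PART = (
    "videorate ! video/x-raw(memory:VAMemory),framerate=15/1 ! "
    "vapostproc ! video/x-raw(memory:VAMemory),width=1920,height=1080"
)

print(caps_memory(PIPELINE_2_PART))  # ['SystemMemory', 'SystemMemory']
print(caps_memory(PIPELINE_3_PART))  # ['VAMemory', 'VAMemory']
```

A check like this can flag a caps filter that would pull frames back into system memory before inference.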

Key DL Streamer Components That Make It Work (RTSP Feed in Focus)

Each DL Streamer element plays a unique role in enabling smooth, hardware-accelerated RTSP video analytics.
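As a rough orientation, the elements that appear in Pipeline 3 can be annotated with the stage they serve. The role descriptions below are a summary based on general GStreamer and DL Streamer behavior rather than text from this article, and the `ROLES` table and `describe` helper are hypothetical names used only for illustration.

```python
# Summary of roles (an assumption-labeled paraphrase, not the article's wording).
# Of these, gvadetect is the DL Streamer element; the rest ship with GStreamer.
ROLES = {
    "rtspsrc": "ingest: receives the RTSP camera stream",
    "decodebin3": "decode: auto-selects a decoder for the stream",
    "videorate": "rate control: drops or duplicates frames to match the target frame rate",
    "vapostproc": "preprocess: scales and converts frames through VA-API",
    "gvadetect": "inference: runs the OpenVINO object detection model",
    "autovideosink": "display: picks a suitable video sink automatically",
}

MODEL = "/home/models/object_detection/ITS_CL_FP16/openvino.xml"
PIPELINE_3 = (
    "rtspsrc location=rtsp://XX.XX.XX.XX:554/yourcam1 ! decodebin3 ! videorate ! "
    "video/x-raw(memory:VAMemory),framerate=15/1 ! vapostproc ! "
    "video/x-raw(memory:VAMemory),width=1920,height=1080 ! "
    f"gvadetect device=GPU model-instance-id=detect1 inference-interval=1 model={MODEL} "
    "pre-process-backend=va-surface-sharing ! autovideosink"
)


def describe(description):
    """Return 'element: role' lines for the known elements in a pipeline."""
    lines = []
    for element in description.split(" ! "):
        name = element.split()[0]
        if name in ROLES:  # caps filters such as video/x-raw(...) are skipped
            lines.append(f"{name}: {ROLES[name]}")
    return lines


for line in describe(PIPELINE_3):
    print(line)
```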

The Result: A Truly Accelerated RTSP Pipeline

By moving decoding, preprocessing, and inference fully onto the iGPU, you minimize CPU overhead, reduce power consumption, and achieve true real-time performance, even with multiple camera streams.
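For the multi-camera case, one common approach is to run one decode branch per camera while reusing a single loaded model. The sketch below assumes, based on DL Streamer's documented `model-instance-id` behavior, that `gvadetect` elements with the same `model-instance-id` share one model instance. The function name, the second camera URL, and the use of `fakesink` in place of a display sink are illustrative assumptions, not part of the article.

```python
MODEL = "/home/models/object_detection/ITS_CL_FP16/openvino.xml"


def build_multi_stream(rtsp_urls, model=MODEL, instance_id="detect1"):
    """Build one pipeline description with a GPU branch per camera."""
    branches = []
    for url in rtsp_urls:
        branches.append(
            f"rtspsrc location={url} ! decodebin3 ! videorate ! "
            "video/x-raw(memory:VAMemory),framerate=15/1 ! vapostproc ! "
            "video/x-raw(memory:VAMemory),width=1920,height=1080 ! "
            # Same model-instance-id in every branch (assumed to share the model).
            f"gvadetect device=GPU model-instance-id={instance_id} "
            f"inference-interval=1 model={model} "
            "pre-process-backend=va-surface-sharing ! "
            # fakesink discards frames; swap in a real sink as needed.
            "fakesink"
        )
    # Independent branches in one description are separated by whitespace.
    return " ".join(branches)


cameras = [
    "rtsp://XX.XX.XX.XX:554/yourcam1",
    "rtsp://XX.XX.XX.XX:554/yourcam2",  # hypothetical second camera
]
description = build_multi_stream(cameras)
print(description)
```

Each additional camera adds a decode and preprocess branch, but all branches refer to the same `model-instance-id`.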

Why It Matters for Edge and Transportation Use Cases

At the edge, where systems operate in harsh, power- and temperature-constrained environments, every watt, millisecond, and CPU cycle counts. Whether it’s a roadside cabinet, transit hub, or industrial site, compute resources are limited, and reliability is non-negotiable. DL Streamer enables you to fully leverage Intel’s integrated GPU for decoding, preprocessing, and inference directly on the device, eliminating unnecessary CPU load and memory transfers. The result is higher power efficiency, lower thermal stress, and real-time performance without a costly discrete GPU or cloud dependency. In short, DL Streamer helps bring scalable, efficient, and resilient AI to where it matters most: the edge.

Why You Should Care

Unlock iGPU performance with minimal effort
Simplify development with DL Streamer’s modular, plug-and-play design
Reduce CPU load and power usage dramatically
Scale video analytics across multiple streams seamlessly

DL Streamer bridges the gap between your vision data and real-world, hardware-optimized AI performance.

Learn More

¹ Attribution: Facts and specifications are referenced from the official Intel® DL Streamer GitHub repository and documentation, including release notes and README files, as of 2025. DL Streamer is optimized for Intel® Core™, Xeon®, Arc™, and Data Center GPU Flex Series devices through integration with the OpenVINO™ Toolkit and hardware-accelerated GStreamer components.