Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.
6595 Discussions

GPU hang when running benchmark_app on GPU of Atom(R) x6211E

Caelan
Beginner
3,222 views

Hi,

When running the benchmark_app that is included with the OpenVINO python package, I encounter a GPU hang.

  • CPU: Intel Atom(R) x6211E Processor @ 1.30GHz
  • GPU: Intel(R) UHD Graphics (iGPU)
  • OS: Fedora 41
  • Kernel: 6.15.9-101.fc41.x86_64
  • Python: 3.12.11
  • OpenVINO: 2025.4.0
  • Compute-runtime: 24.35.30872.32

I ran benchmark_app with the following arguments:
benchmark_app -m rtmdet-tiny-coco.onnx -t 86400 -api sync -hint latency -d GPU

This results in the following error in benchmark_app:

[ ERROR ] Exception from src/inference/src/cpp/infer_request.cpp:224:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:424:
[GPU] clWaitForEvents failed with -14 code

Traceback (most recent call last):
  File "/home/zuser/.venv/lib64/python3.12/site-packages/openvino/tools/benchmark/main.py", line 624, in main
    fps, median_latency_ms, avg_latency_ms, min_latency_ms, max_latency_ms, total_duration_sec, iteration = benchmark.main_loop(requests, data_queue, batch_size, args.latency_percentile, pcseq)
                                                                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zuser/.venv/lib64/python3.12/site-packages/openvino/tools/benchmark/benchmark.py", line 181, in main_loop
    times, total_duration_sec, iteration = self.sync_inference(requests[0], data_queue)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zuser/.venv/lib64/python3.12/site-packages/openvino/tools/benchmark/benchmark.py", line 106, in sync_inference
    request.infer()
  File "/home/zuser/.venv/lib64/python3.12/site-packages/openvino/_ov_api.py", line 184, in infer
    return OVDict(super().infer(_data_dispatch(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src/inference/src/cpp/infer_request.cpp:224:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:424:
[GPU] clWaitForEvents failed with -14 code

In journalctl I can see the following:

Dec 02 13:30:56 gtx35510d kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Dec 02 13:30:56 gtx35510d kernel: i915 0000:00:02.0: [drm] benchmark_app[2131852] context reset due to GPU hang
Dec 02 13:30:56 gtx35510d kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:8ed9fff3, in benchmark_app [2131852]

The hang can take over 24 hours to occur, but it usually happens sooner.

I have attached outputs from the following:

  • benchmark_app
  • lscpu
  • clinfo
  • journalctl
  • hello_query_device

The website does not allow me to upload my .onnx but you can find it here: https://huggingface.co/bukuroo/RTMDet-ONNX/resolve/main/rtmdet-tiny-coco.onnx?download=true 

I was able to reproduce the hang on older versions of Fedora, the kernel, OpenVINO, and Compute Runtime.

Please advise.

0 Kudos
11 Replies
Wan_Intel
Moderator
3,165 views

Hi Caelan,

Thank you for reaching out to OpenVINO™ community.

 

Referring to your hardware specifications and environment, I noticed that the operating system you are using is not validated for OpenVINO™.

 

It’s recommended to use one of the following operating systems if you want to use the GPU plugin:

  • Windows 11, 64-bit
  • Windows 10, 64-bit
  • Ubuntu 24.04 long-term support (LTS), 64-bit
  • Ubuntu 22.04 long-term support (LTS), 64-bit
  • Ubuntu 20.04 long-term support (LTS), 64-bit
  • CentOS 7
  • Red Hat Enterprise Linux (RHEL) 8 and 9, 64-bit

 

For more information, please visit https://docs.openvino.ai/2025/about-openvino/release-notes-openvino/system-requirements.html


Regards,

Wan


Caelan
Beginner
3,076 views

Hi Wan,

 

Sorry for the delay; the hang only occurred after almost 24 hours.

I have reproduced the issue with the following setup:

  • OS: Ubuntu 24.04.3 LTS
  • Kernel: 6.14.0-36-generic
  • Python: 3.12.3
  • OpenVINO: 2025.4.0
  • Compute-runtime: 23.43.027642

Once again, it was reproduced using the following call:
benchmark_app -m rtmdet-tiny-coco.onnx -t 86400 -api sync -hint latency -d GPU

I've attached the new output from benchmark_app, clinfo and journalctl.

 

Thanks,

Caelan

Wan_Intel
Moderator
2,977 views

Hi Caelan,

Thank you for sharing the information with us.


We will replicate the issue from our end, and we will provide an update here shortly.



Regards,

Wan


Wan_Intel
Moderator
2,860 views

Hi Caelan,

Thank you for your patience.


Referring to your previous reply, I noticed you are using Ubuntu 24.04 LTS with Compute Runtime version 23.43.027642.


For your information, Intel Atom® x6211E Processor (Elkhart Lake – Legacy platform) has been validated on the following environment:

  • Compute-Runtime version 24.35.30872.22
  • Ubuntu 22.04 LTS with stock kernel

 

You may run the benchmark application with the environment above and see if the issue can be resolved. For more information, please refer to Support for legacy platforms.



Regards,

Wan


Caelan
Beginner
2,735 views

Hi Wan,

 

I was able to reproduce the issue on Ubuntu 22.04 LTS with Compute-Runtime 24.35.30872.22.

 

Thanks,

Caelan

Wan_Intel
Moderator
2,671 views

Hi Caelan,

Thank you for sharing the information.


We are checking on this, and we will provide an update here shortly.



Regards,

Wan


Wan_Intel
Moderator
2,175 views

Hi Caelan,

I've escalated the case to the relevant team; we will investigate the issue further and provide an update here as soon as possible.



Regards,

Wan


Caelan
Beginner
2,102 views
Wan_Intel
Moderator
1,203 views

Hi Caelan,

Thank you for your patience. We have received feedback from the relevant team.


After testing on a similar Intel Atom® system, we conclude that the error seen is related to hardware limitations on legacy Intel Atom® platforms, particularly under long-running workloads, sustained GPU inference with frequent event synchronization, and memory and resource pressure over time. Intel Atom® GPUs are not designed for prolonged, continuous GPU inference workloads, and under such conditions results like the ones observed may surface.


We were able to replicate the same underlying issue on a similar Intel Atom® system when running inference on GPU. The issue does not occur when running inference on CPU, which aligns with expectations for this class of hardware.


General recommendations we can offer are shorter benchmark durations, asynchronous inference, or reducing the number of inference requests; these reduce the pressure on the GPU runtime. However, keep in mind that the behavior seen appears to be an expected limitation of the GPU driver/runtime on older Intel Atom® systems when subjected to long-running GPU inference workloads.
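One way to apply the shorter-duration suggestion is to split a single long benchmark into several shorter runs with an idle gap between them. A minimal sketch (the driver below is illustrative, not part of OpenVINO):

```python
import subprocess
import time

def run_in_segments(cmd, segments, pause_sec=0):
    """Run `cmd` several times as shorter segments instead of one long run.

    Returns the number of segments that exited successfully; a non-zero
    exit code (e.g. after a GPU hang) stops the sequence early.
    """
    completed = 0
    for _ in range(segments):
        if subprocess.run(cmd).returncode != 0:
            break  # stop on the first failed segment
        completed += 1
        if pause_sec:
            time.sleep(pause_sec)  # brief idle gap between segments
    return completed
```

For example, `run_in_segments(["benchmark_app", "-m", "rtmdet-tiny-coco.onnx", "-t", "3600", "-api", "async", "-d", "GPU"], segments=24, pause_sec=10)` would replace one 24-hour run with 24 one-hour segments.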


Hope it helps.



Regards,

Wan


Caelan
Einsteiger
389 views

Hi Wan,

 

Thanks for the response.

For reference, I initially encountered this issue during a long-running workflow of a specific application that I have:

  1. Acquire new image
  2. Preprocess
  3. Infer on GPU
  4. Process results
  5. Loop back to 1

This looping process is meant to continue indefinitely.
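The loop above can be sketched roughly as follows; `acquire_image`, `preprocess`, `infer`, and `process_results` are stand-ins for my application's own code (`infer` wraps an OpenVINO infer request on GPU), and `max_iterations` exists only so the sketch can terminate:

```python
def run_pipeline(acquire_image, preprocess, infer, process_results,
                 max_iterations=None):
    """Indefinite acquire -> preprocess -> infer -> process loop.

    With max_iterations=None the loop runs forever, which is the mode
    in which the GPU hang eventually appears.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        frame = acquire_image()        # 1. acquire new image
        inputs = preprocess(frame)     # 2. preprocess
        outputs = infer(inputs)        # 3. infer (on GPU in my case)
        process_results(outputs)       # 4. process results
        iterations += 1                # 5. loop back to 1
    return iterations
```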

The benchmark app was the simplest way that I found to reliably reproduce the issue for discussion purposes here.

Based on your response, is the only solution that works for my application to run on CPU instead of GPU?

 

Also, I don't see any mention of limitations in the system requirements documentation: https://docs.openvino.ai/2025/about-openvino/release-notes-openvino/system-requirements.html 
Both the CPU and GPU seem to fall under the supported hardware categories mentioned.

So, how can I know which CPUs/GPUs/NPUs have long-running limitations like this?

 

Thanks,

Caelan

Wan_Intel
Moderator
180 views

Hi Caelan,

The issue seems to be a limitation of the GPU driver/runtime on older Intel Atom® systems (Elkhart Lake platform). We regret to inform you that this platform is not supported with the latest driver versions, as mentioned in the compute-runtime repository, and no bug fixes or updates will be available for this platform in future releases.


Therefore, the recommended workaround for your application would be to use the CPU plugin instead of the GPU plugin.
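As an illustrative sketch of that workaround (not official OpenVINO code), the device can be chosen at load time with a fallback, where `compile_fn(device)` stands in for something like `core.compile_model(model, device)` in your application:

```python
def compile_with_fallback(compile_fn, devices=("GPU", "CPU")):
    """Return (device, compiled_model) for the first device that works.

    `compile_fn(device)` is expected to raise RuntimeError when the
    device cannot be used, as OpenVINO's compile step does.
    """
    last_error = None
    for device in devices:
        try:
            return device, compile_fn(device)
        except RuntimeError as err:
            last_error = err  # remember the failure, try the next device
    raise RuntimeError("no usable device in %r" % (devices,)) from last_error
```

An application hitting the Elkhart Lake limitation could also pass `devices=("CPU",)` to pin inference to CPU outright.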


For details on driver versions for the supported platforms, you may refer to:


Hope it helps.



Regards,

Wan


Reply