Intel Arc GPU

PlanteAmigor — Fri, 29 May 2026 16:22:43 GMT

PyTorch XPU backward pass crash with Transformer/SDPA on Intel Arc iGPU

Environment:

CPU: Intel Core Ultra 9 285H (Arrow Lake-H)

GPU: Intel Arc iGPU (8 Xe-core, shared memory, 128GB DDR5)

OS: Linux (Ubuntu 24.04)

PyTorch: 2.12.0+xpu

Intel oneAPI XPU driver: latest

Problem:
I'm experiencing a crash during the backward pass of nn.TransformerEncoderLayer (or F.scaled_dot_product_attention) when running on Intel XPU. The forward pass works fine, but loss.backward() crashes with memory allocation errors or segfaults.

Minimal repro:
```python
import torch
import torch.nn as nn

m = nn.TransformerEncoderLayer(2048, 16, batch_first=True).to('xpu')
x = torch.randn(8, 512, 2048, device='xpu')
m(x).sum().backward()  # crash
```

Error message (the value varies between runs, roughly -7.9e16 to -5.0e17, which looks like an integer overflow):
```text
RuntimeError: Trying to create tensor with negative dimension -79243236477491020: [-79243236477491020]
```

Sometimes also:

```text
IndexError: select(): index -1 out of range for tensor of size [0] at dimension 0
```
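To check whether the negative dimension really is an overflow rather than, say, uninitialized memory read as a size, the value can be reinterpreted at the bit level. This is a minimal sketch using only the standard library; `describe_int64` is a hypothetical helper name, and it only reformats the number from the error message, it does not query PyTorch or the driver.

```python
import struct


def describe_int64(value: int) -> dict:
    """Reinterpret a signed 64-bit integer under a few common encodings."""
    raw = struct.pack("<q", value)            # 8 bytes, little-endian
    as_u64 = struct.unpack("<Q", raw)[0]      # same bits as unsigned 64-bit
    as_f64 = struct.unpack("<d", raw)[0]      # same bits as a float64
    low, high = struct.unpack("<ii", raw)     # same bits as two signed int32
    return {
        "hex": f"0x{as_u64:016x}",
        "float64": as_f64,
        "int32_pair": (low, high),
    }


if __name__ == "__main__":
    # Value copied from the RuntimeError above.
    print(describe_int64(-79243236477491020))
```

If the hex pattern changes completely from run to run, that would point more toward garbage data than toward a deterministic size computation overflowing.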

In severe cases (e.g., with AMP BF16 enabled), the entire system freezes and requires a hard reboot, which suggests that the GPU driver itself crashes, not just the Python process.

Observations:

Same code runs perfectly on CPU (device='cpu').

CNN-style operations (Conv2d, Linear, BatchNorm) work fine on XPU; only the attention backward pass triggers this.

The forward pass always succeeds; only loss.backward() crashes.

Not always reproducible with tiny models (batch=2, hidden=512), but almost guaranteed with larger sizes (batch=8, hidden=2048).

System freeze (driver crash) happens with AMP BF16 enabled.
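The size dependence noted above could be narrowed down with a small sweep. The following is a sketch, not a verified reproducer: `pick_device` and `try_backward` are hypothetical helper names, the size grid is an assumption based on the two configurations mentioned in the observations, and a segfault or driver hang cannot be caught from Python, so each result is printed immediately.

```python
import torch
import torch.nn as nn


def pick_device() -> str:
    # torch.xpu is present in XPU-enabled builds; fall back to CPU otherwise.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"


def try_backward(batch, seq, hidden, heads, device):
    """Run one forward+backward step and return (ok, detail)."""
    try:
        layer = nn.TransformerEncoderLayer(hidden, heads, batch_first=True).to(device)
        x = torch.randn(batch, seq, hidden, device=device)
        layer(x).sum().backward()
        if device == "xpu":
            torch.xpu.synchronize()  # surface asynchronous errors here
        return True, "ok"
    except (RuntimeError, IndexError) as exc:
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    device = pick_device()
    # Assumed grid: from the "tiny" case up to the failing case in the repro.
    for batch, hidden, heads in [(2, 512, 8), (4, 1024, 16), (8, 2048, 16)]:
        ok, detail = try_backward(batch, 512, hidden, heads, device)
        print(f"{device} batch={batch} hidden={hidden}: {detail}", flush=True)
```

Running the same grid with `device="cpu"` gives a baseline, since the CPU path is reported to work.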

Things I've tried that didn't help:

Replacing nn.MultiheadAttention with F.scaled_dot_product_attention

AMP BF16 (made it worse — system freeze)

Periodic torch.xpu.empty_cache() + gc.collect() (delays but doesn't prevent)

torch.xpu.synchronize() before/after backward

Question:
Is this a known PyTorch XPU backend bug, an Intel oneAPI driver issue, or something wrong with my setup? Any known fixes or workarounds would be greatly appreciated.
