<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Intel Arc GPU in GPU Compute Software</title>
    <link>https://community.intel.com/t5/GPU-Compute-Software/Intel-Arc-GPU/m-p/1749483#M2373</link>
    <description>&lt;UL&gt;&lt;LI&gt;Problem: When running sustained AI inference (e.g., Qwen3-Embedding-4B via OpenVINO) on an Intel Arc Pro 140T, the GPU driver crashes after a few minutes at &amp;gt;90% utilization, resulting in a kernel panic, a segfault, or NaN outputs. Lowering the batch size or the quantization level doesn’t help as long as the load stays high.&lt;/LI&gt;&lt;LI&gt;Hypothesis: The crashes appear to be triggered by sustained power/thermal stress, not by numerical precision. Under continuous heavy load, the GPU/driver becomes unstable. Adding forced cooldown intervals eliminates the crashes, which points to a thermal/power limitation rather than a software bug.&lt;/LI&gt;&lt;LI&gt;Result: With small batches (10 texts), a 5 s cooldown break every 3 batches, and latency-based thermal detection, the system runs without any crash. Processing time rises to ~350 s per 1000 texts (up from ~60 s before), but the run is stable. Without these breaks, the same INT8 model crashes within 1 minute.&lt;/LI&gt;&lt;/UL&gt;</description>
    <pubDate>Fri, 29 May 2026 16:07:03 GMT</pubDate>
    <dc:creator>PlanteAmigor</dc:creator>
    <dc:date>2026-05-29T16:07:03Z</dc:date>
    <item>
      <title>Intel Arc GPU</title>
      <link>https://community.intel.com/t5/GPU-Compute-Software/Intel-Arc-GPU/m-p/1749483#M2373</link>
      <description>&lt;UL&gt;&lt;LI&gt;Problem: When running sustained AI inference (e.g., Qwen3-Embedding-4B via OpenVINO) on an Intel Arc Pro 140T, the GPU driver crashes after a few minutes at &amp;gt;90% utilization, resulting in a kernel panic, a segfault, or NaN outputs. Lowering the batch size or the quantization level doesn’t help as long as the load stays high.&lt;/LI&gt;&lt;LI&gt;Hypothesis: The crashes appear to be triggered by sustained power/thermal stress, not by numerical precision. Under continuous heavy load, the GPU/driver becomes unstable. Adding forced cooldown intervals eliminates the crashes, which points to a thermal/power limitation rather than a software bug.&lt;/LI&gt;&lt;LI&gt;Result: With small batches (10 texts), a 5 s cooldown break every 3 batches, and latency-based thermal detection, the system runs without any crash. Processing time rises to ~350 s per 1000 texts (up from ~60 s before), but the run is stable. Without these breaks, the same INT8 model crashes within 1 minute.&lt;/LI&gt;&lt;/UL&gt;</description>
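      <!--
      The workaround described in the post (batches of 10, a 5 s pause every
      3 batches, plus a latency-based check) could be sketched roughly as
      follows. This is a minimal illustration, not the poster's actual code:
      the names run_with_cooldowns, infer_batch and slowdown_factor are
      hypothetical, and the rule "pause when a batch takes more than 1.5x the
      median of earlier batches" is an assumption, since the post does not say
      how the latency-based detection works. infer_batch stands in for
      whatever call runs the model on one batch.

```python
import time
from statistics import median
from typing import Callable, Sequence


def run_with_cooldowns(
    texts: Sequence[str],
    infer_batch: Callable[[Sequence[str]], list],
    batch_size: int = 10,            # from the post
    batches_per_cooldown: int = 3,   # from the post
    cooldown_s: float = 5.0,         # from the post
    slowdown_factor: float = 1.5,    # assumption, not from the post
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.perf_counter,
) -> list:
    """Run inference in small batches with periodic and latency-triggered pauses."""
    results: list = []
    latencies: list = []
    starts = range(0, len(texts), batch_size)
    for batch_index, start in enumerate(starts, start=1):
        batch = texts[start:start + batch_size]

        t0 = clock()
        results.extend(infer_batch(batch))
        latency = clock() - t0

        # Treat a batch that is much slower than the earlier ones as a sign
        # of throttling (assumed heuristic; needs a few samples first).
        throttled = (
            len(latencies) >= 3
            and latency > slowdown_factor * median(latencies)
        )
        latencies.append(latency)

        # Pause on the fixed schedule, or early if a slowdown was detected.
        if throttled or batch_index % batches_per_cooldown == 0:
            sleep(cooldown_s)
    return results
```

      The sleep and clock parameters exist only so the schedule can be
      exercised without real waiting; in normal use the defaults apply.
      -->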
      <pubDate>Fri, 29 May 2026 16:07:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/GPU-Compute-Software/Intel-Arc-GPU/m-p/1749483#M2373</guid>
      <dc:creator>PlanteAmigor</dc:creator>
      <dc:date>2026-05-29T16:07:03Z</dc:date>
    </item>
  </channel>
</rss>

