<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hard crash using OneAPI on Intel ARC A750 to train Pytorch model in Intel® Optimized AI Frameworks</title>
    <link>https://community.intel.com/t5/Intel-Optimized-AI-Frameworks/Hard-crash-using-OneAPI-on-Intel-ARC-A750-to-train-Pytorch-model/m-p/1584029#M490</link>
    <description>&lt;P&gt;I am experiencing a hard crash while training a model using code from the Intel Extension for PyTorch.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have the following configuration on my PC:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;OS Name Microsoft Windows 11 Pro&lt;/P&gt;&lt;P&gt;Version 10.0.22631 Build 22631&lt;/P&gt;&lt;P&gt;Processor Intel(R) Core(TM) i7-14700&lt;/P&gt;&lt;P&gt;BIOS Version/Date American Megatrends Inc. 1604, 12/15/2023&lt;/P&gt;&lt;P&gt;BaseBoard Manufacturer ASUSTeK COMPUTER INC.&lt;/P&gt;&lt;P&gt;BaseBoard Product ROG STRIX B760-I GAMING WIFI&lt;/P&gt;&lt;P&gt;Installed Physical Memory (RAM) 32.0 GB&lt;/P&gt;&lt;P&gt;Display Adapter Intel(R) Arc(TM) A750 Graphics&lt;/P&gt;&lt;P&gt;Display Driver Version 31.0.101.5333&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I installed the following software:&lt;/P&gt;&lt;P&gt;Intel oneAPI Base Toolkit for Windows&amp;nbsp;&lt;SPAN&gt;2024.1.0&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; (includes Intel® oneAPI Math Kernel Library&amp;nbsp;2024.1.0)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I also installed the Intel Extension for PyTorch from&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;A href="https://developer.intel.com/ipex-whl-stable-xpu" target="_blank" rel="noopener"&gt;https://developer.intel.com/ipex-whl-stable-xpu&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;The installed versions are:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;torch&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;2.1.0a0+cxx11.abi&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;torchvision&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;0.16.0a0+cxx11.abi&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;intel_extension_for_pytorch&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;2.1.10+xpu&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;While training with the attached file (see attachments), my PC crashed every time I attempted the training. This was a hard crash: no Windows blue screen was displayed, the PC shut down, and I could find no log files. CPU and GPU temperatures were high during training, but not abnormally high. My PC is set to limit the CPU temperature to 90 °C, and the GPU has a 180 W power limit and an 85 °C temperature limit set in Intel Arc Control.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I am just learning AI and computer vision, and this work isn't critical. However, I thought I would report the problem, as it may help debug Intel products.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I attached the Python file that reproduces this crash. Rename the .7z file to .py if you attempt to reproduce it. The Python file is from the Intel Git repository for the Intel Extension for PyTorch.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 27 Mar 2024 22:16:48 GMT</pubDate>
    <dc:creator>laduran</dc:creator>
    <dc:date>2024-03-27T22:16:48Z</dc:date>
    <item>
      <title>Hard crash using OneAPI on Intel ARC A750 to train Pytorch model</title>
      <link>https://community.intel.com/t5/Intel-Optimized-AI-Frameworks/Hard-crash-using-OneAPI-on-Intel-ARC-A750-to-train-Pytorch-model/m-p/1584029#M490</link>
      <description>&lt;P&gt;I am experiencing a hard crash while training a model using code from the Intel Extension for PyTorch.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have the following configuration on my PC:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;OS Name Microsoft Windows 11 Pro&lt;/P&gt;&lt;P&gt;Version 10.0.22631 Build 22631&lt;/P&gt;&lt;P&gt;Processor Intel(R) Core(TM) i7-14700&lt;/P&gt;&lt;P&gt;BIOS Version/Date American Megatrends Inc. 1604, 12/15/2023&lt;/P&gt;&lt;P&gt;BaseBoard Manufacturer ASUSTeK COMPUTER INC.&lt;/P&gt;&lt;P&gt;BaseBoard Product ROG STRIX B760-I GAMING WIFI&lt;/P&gt;&lt;P&gt;Installed Physical Memory (RAM) 32.0 GB&lt;/P&gt;&lt;P&gt;Display Adapter Intel(R) Arc(TM) A750 Graphics&lt;/P&gt;&lt;P&gt;Display Driver Version 31.0.101.5333&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I installed the following software:&lt;/P&gt;&lt;P&gt;Intel oneAPI Base Toolkit for Windows&amp;nbsp;&lt;SPAN&gt;2024.1.0&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; (includes Intel® oneAPI Math Kernel Library&amp;nbsp;2024.1.0)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I also installed the Intel Extension for PyTorch from&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;A href="https://developer.intel.com/ipex-whl-stable-xpu" target="_blank" rel="noopener"&gt;https://developer.intel.com/ipex-whl-stable-xpu&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;The installed versions are:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;torch&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;2.1.0a0+cxx11.abi&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;torchvision&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;0.16.0a0+cxx11.abi&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;intel_extension_for_pytorch&lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt;2.1.10+xpu&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;While training with the attached file (see attachments), my PC crashed every time I attempted the training. This was a hard crash: no Windows blue screen was displayed, the PC shut down, and I could find no log files. CPU and GPU temperatures were high during training, but not abnormally high. My PC is set to limit the CPU temperature to 90 °C, and the GPU has a 180 W power limit and an 85 °C temperature limit set in Intel Arc Control.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I am just learning AI and computer vision, and this work isn't critical. However, I thought I would report the problem, as it may help debug Intel products.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I attached the Python file that reproduces this crash. Rename the .7z file to .py if you attempt to reproduce it. The Python file is from the Intel Git repository for the Intel Extension for PyTorch.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 27 Mar 2024 22:16:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Optimized-AI-Frameworks/Hard-crash-using-OneAPI-on-Intel-ARC-A750-to-train-Pytorch-model/m-p/1584029#M490</guid>
      <dc:creator>laduran</dc:creator>
      <dc:date>2024-03-27T22:16:48Z</dc:date>
    </item>
    <item>
      <title>Re: Hard crash using OneAPI on Intel ARC A750 to train Pytorch model</title>
      <link>https://community.intel.com/t5/Intel-Optimized-AI-Frameworks/Hard-crash-using-OneAPI-on-Intel-ARC-A750-to-train-Pytorch-model/m-p/1584364#M491</link>
      <description>&lt;P&gt;I believe the above issue can be ignored.&amp;nbsp;&lt;BR /&gt;&lt;STRONG&gt;I changed the following:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The RAM in my PC was slightly overclocked. I set the RAM back to its default settings and set the thermal limit on the CPU to 90℃.&lt;/P&gt;&lt;P&gt;The Arc GPU in my system was slightly overclocked as well. I set the overclock settings in Arc Control back to their defaults and re-ran the training on ResNet-50, which completed in about 3.5 minutes.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Mar 2024 21:55:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Optimized-AI-Frameworks/Hard-crash-using-OneAPI-on-Intel-ARC-A750-to-train-Pytorch-model/m-p/1584364#M491</guid>
      <dc:creator>laduran</dc:creator>
      <dc:date>2024-03-28T21:55:00Z</dc:date>
    </item>
  </channel>
</rss>

