Intel® ARC™ Graphics

Random Freezes with Arc Series GPUs Thread

Saveno
New Contributor I
15,651 Views

First off, I'm not looking for Support here. This Thread is meant to find a solution for the common Freezing Problem a lot of people are having. The list linked below collects more than 20 posts about the issue, with many people reporting the same problem in those threads' comments. There are many more reports in other places, such as Microsoft and Tech Forums, but I guess that should be enough for now.

Arc Issues - Pastebin.com

 

The Problem

Sometimes, out of nowhere when the Arc is under load, the PC just completely freezes. You sometimes still hear a buzzing sound from your headphones, and a few seconds later your system restarts. The weird thing, however, is that nothing seems to trigger it specifically. For example:

I can play a game like Black Desert on max Settings and nothing happens for a whole day. The next day when I play, the freeze happens 1-2 times, then it's fine for the rest of the day. Same with AI Tools such as Stable Diffusion, which has kind of become my "Crash Benchmark" for the Arc. On some days the Freeze and Restart happens after generating 2 or 3 images, and sometimes I can go out, let SD generate 100+ images, come back a few hours later and the system is still running.
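
Because the freeze is followed by a hard restart, it can be hard to tell afterwards how far a long "Crash Benchmark" run got. One possible approach is a heartbeat log that is forced to disk before and after every step. The sketch below is only an illustration: `run_with_heartbeat`, `last_completed` and the `workload` callable are hypothetical names, and `workload` stands in for whatever generates one image or runs one benchmark pass.

```python
import os
import time


def _write_durable(log, message):
    """Write one timestamped line and force it to disk before continuing."""
    log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}\n")
    log.flush()
    os.fsync(log.fileno())


def run_with_heartbeat(workload, iterations, log_path):
    """Call workload(i) for each iteration, logging 'start' and 'done' markers.

    `workload` is a hypothetical placeholder for one GPU-heavy step.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        for i in range(iterations):
            _write_durable(log, f"start {i}")
            workload(i)
            _write_durable(log, f"done {i}")


def last_completed(log_path):
    """Return the index of the last iteration that logged 'done', or None."""
    last = None
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            parts = line.split()
            # Expected layout: <date> <time> <marker> <index>
            if len(parts) == 4 and parts[2] == "done":
                last = int(parts[3])
    return last
```

After a freeze, a `start N` line without a matching `done N` line shows which iteration was running when the system went down, which makes it easier to compare runs across driver versions.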

This issue seems to happen with any Arc Series Card, which suggests a Driver Problem.

 

The Hardware

Dozens of people have reported the same phenomenon and some posted their hardware, but there is nothing they have in common here either. Some people have Intel CPUs, some AMD ones; some have high-end hardware, others low- to mid-range hardware; and people have different types of RAM sticks, different mainboards, different PSUs, etc.

 

What has been tried?

The Intel Support gives the usual advice: uninstall Drivers via DDU, update and/or reset the BIOS, enable ReBar, reinstall Windows, try a different PSU. The result is always the same: it does not fix the problem. Everyone also reports that this does not happen when they use their old or different GPUs. Old drivers do not help either, and neither does Undervolting or Underclocking. Overheating is not an issue either, since my Arc never goes above about 75° under max load.

 

Error Logs

Due to the instant crash, Windows sadly does not manage to create Minidumps, and the Event Viewer reports nothing but an "unexpected system shutdown (Event-ID 41)". I was lucky once though: my system did not restart instantly and actually created a Minidump. It is not that interesting, however, since it only reports that the GPU Driver has crashed.

SYMBOL_NAME:  igdkmdnd64+157e0
MODULE_NAME: igdkmdnd64
IMAGE_NAME:  igdkmdnd64.sys
STACK_COMMAND:  .cxr; .ecxr ; kb
FAILURE_BUCKET_ID:  0x116_IMAGE_igdkmdnd64.sys
OSPLATFORM_TYPE:  x64
OSNAME:  Windows 10
FAILURE_ID_HASH:  {7eb0cc99-c85b-4092-6430-8f1db059b7c1}
Followup:     MachineOwner
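
For anyone collecting dumps from several machines, the key/value lines above can be pulled into a small summary so reports are easier to compare. The following is a minimal sketch, not an official tool: `parse_analysis` and `summarize` are hypothetical helper names, and the code assumes the bucket ID starts with the bug check code in the form `0x<hex>_`, as it does in this dump (other buckets may not). Bug check 0x116 is VIDEO_TDR_FAILURE, meaning the display driver failed to recover from a timeout.

```python
import re

# Only the bug check seen in the dump above is listed here.
BUGCHECK_NAMES = {0x116: "VIDEO_TDR_FAILURE"}

# Matches lines such as "MODULE_NAME: igdkmdnd64".
FIELD = re.compile(r"^\s*([A-Za-z_]+):\s*(\S.*?)\s*$")


def parse_analysis(text):
    """Collect KEY: value lines from pasted debugger analysis output."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def summarize(fields):
    """Extract the bug check code (if present) and the faulting module."""
    bucket = fields.get("FAILURE_BUCKET_ID", "")
    # Assumption: the bucket begins with "0x<hex>_", as in the dump above.
    match = re.match(r"0x([0-9A-Fa-f]+)_", bucket)
    code = int(match.group(1), 16) if match else None
    return {
        "bugcheck": code,
        "bugcheck_name": BUGCHECK_NAMES.get(code, "unknown"),
        "module": fields.get("MODULE_NAME"),
        "image": fields.get("IMAGE_NAME"),
    }
```

Feeding the text of the dump above to `parse_analysis` and passing the result to `summarize` yields the bug check 0x116 together with the module `igdkmdnd64`, which is the Intel graphics kernel-mode driver named in the dump.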

 

Since the Intel Support is not really helpful in this matter (nothing against you guys, I know you are doing what you can), maybe we can gather Data over the next few months that helps solve this problem via a Driver update.

Also a small tip: if I press Win + Ctrl + Shift + B fast enough to restart the Intel GPU Driver, my System sometimes does not shut down and actually recovers. You have to do that the moment you notice your PC is about to freeze, however.

 

Also made a Post on Reddit about it here since Reddit is more active: Random Freezes with Arc Series GPUs Thread : IntelArc (reddit.com)

49 Replies
Jose_Intel
Employee
4,597 Views

Hello Saveno

 

That information also works, again thank you for your time and feedback.

 

We will continue with our testing.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Andres_Intel
Employee
4,497 Views

Hello Saveno,

 

 

Thank you for your time.


We have been working on the investigation, please follow the steps below and let me know the results:


 

  

Regards,  

 

Andres P. 

Intel Customer Support Technician 


0 Kudos
Saveno
New Contributor I
4,489 Views

Thanks, I've already installed the new Driver and didn't experience a crash today. I will see if it stays this way for a week and then get back to you (or as soon as my system crashes again).

0 Kudos
Andres_Intel
Employee
4,470 Views

Hello Saveno,

 


Thank you for keeping me informed.


I will wait for you to test this driver. If the issue persists, remember to collect the crash dump files so we can analyze them.

  


Regards,


Andres P.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
4,456 Views

My PC Crashed, but it seems like I'm getting actual Bluescreens now since the last driver instead of just a hard reset. I've attached the dump file.

 

/edit

The Bluescreen might be unrelated to the problem. It happens when I close the new game "Gord". The Bluescreen happened twice after I exited the game.

0 Kudos
Andres_Intel
Employee
4,427 Views

Hello Saveno,

 


Thank you for your response and for following all the steps.


I will continue with the investigation. As soon as I have further information I will let you know.

  


Regards,


Andres P.

Intel Customer Support Technician


0 Kudos
Andres_Intel
Employee
4,413 Views

Thank you for your time. 



I have been working on the investigation and now I have a couple of questions for clarification:


  • Is the issue only happening with GORD and not as it was happening before? I did not quite understand if you meant the computer only crashes when closing the game or if it is behaving as before plus it has the issue with the game.
  • Is the issue with Diablo 4 no longer present?


  

Regards,  

 

Andres P. 

Intel Customer Support Technician 


0 Kudos
Saveno
New Contributor I
4,407 Views

Sorry for the confusion. I downloaded Gord yesterday and it gave me Bluescreens twice after closing it, so I don't believe this is related to the problem. Once, for example, I tried to save the game, it gave me some Error, and after closing it I got a Bluescreen. I don't think it is related to the problem reported here at all, so sorry for the misunderstanding.

So far I have not experienced any other System Crashes in other Games or in Stable Diffusion since the Driver Update.

0 Kudos
Saveno
New Contributor I
4,335 Views

Small Update. Still no System crashes since the Driver Update; the Driver itself still seems to be crashing, however. When this happens I also get a weird Error in Edge. After the Driver Crash, when I try to open sites like Youtube, it blocks the page with the Error Status_access_violation.

 

I also got a similar sounding Error in Cyberpunk after it crashed.

Error reason: Unhandled exception
Expression: EXCEPTION_ILLEGAL_INSTRUCTION (0xC000001D)
Message: The thread tried to execute an invalid instruction.

and in Stable Diffusion as well

The GPU will not respond to more commands, most likely because some other application submitted invalid commands.

 

To fix the error in Edge I have to restart my PC. Cyberpunk no longer starts after the driver crash either and needs a PC restart as well.
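
The Cyberpunk code 0xC000001D and the Edge message are both Windows NTSTATUS values, so they can be read with the same bit layout: two severity bits, a customer flag, a reserved bit, a 12-bit facility and a 16-bit code. The sketch below is a small illustration with a hypothetical helper name, `decode_ntstatus`. Note that the post does not give a number for the Edge error; 0xC0000005 is the standard NTSTATUS value for STATUS_ACCESS_VIOLATION and is included on that assumption.

```python
# Names for the two statuses discussed in this thread only.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",     # assumed value for the Edge error
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",  # value reported by Cyberpunk
}

# Index is the two-bit severity field (bits 31-30).
SEVERITY = ("success", "informational", "warning", "error")


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its documented bit fields."""
    return {
        "name": KNOWN_STATUS.get(value, "unknown"),
        "severity": SEVERITY[(value >> 30) & 0x3],
        "customer_defined": bool((value >> 29) & 0x1),
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
    }
```

Both values decode to error severity with facility 0, i.e. generic system-level exceptions raised inside the application process rather than codes defined by the game or the browser.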

0 Kudos
Andres_Intel
Employee
4,323 Views

Hello Saveno,

 

 

Thank you for your answer and clarification; it has been really helpful.


Happy to hear that you are not experiencing any other System Crashes in other Games or in Stable Diffusion since the Driver Update.


Regarding the GORD Bluescreens and Cyberpunk issues, I will continue with the investigation to provide you with the next steps.

 

  

Regards,  

 

Andres P. 

Intel Customer Support Technician


0 Kudos
Andres_Intel
Employee
4,307 Views

Hello Saveno,

 


Thank you for your time.


I have been working on the investigation. Just to clarify, it seems that the original issue of the card randomly freezing is already resolved, correct? If this is the case, please open another thread for the remaining issues, so we can keep things organized and help you with each issue.


Please keep me informed if I am correct.

  


Regards,


Andres P.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
4,301 Views

As I said, the driver still seems to crash, but it no longer seems to crash my PC. I usually still have to restart my PC manually when this happens though, or stuff is buggy. I also had fine days before, so I would rather test a bit longer before I can say for sure that the system crashes have at least stopped.

 

*edit

Just got a Bluescreen and system crash from using Stable Diffusion. A minidump was not created. I will test some other GPU-heavy games like Cyberpunk over the next days and see if stuff crashes there too.

0 Kudos
Andres_Intel
Employee
4,272 Views

Hello Saveno,

 


Thank you for your response and clarification. 


I am sorry to hear that the issue persists. I will wait for you to test it with other games; please let me know which games you have the issue with besides the ones you have mentioned previously.

  


Regards,


Andres P.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
4,263 Views

I tested Final Fantasy 15 today and I got a Bluescreen after a while as well. This time I got a minidump, however, which says FAILURE_BUCKET_ID: MEMORY_CORRUPTION_ONE_BIT again. So it continues to be an overall problem and it's not related to any particular game. As soon as the Arc is under load, it randomly crashes at some point.
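
As background on the bucket name: "one bit" refers to data that differs from its expected value in exactly one bit position. The sketch below only illustrates that concept with XOR; it is not how the debugger classifies dumps, and the helper names and sample byte values are made up for the example.

```python
def differing_bits(expected, observed):
    """Return the bit positions (0 = least significant) where two values differ."""
    diff = expected ^ observed  # XOR leaves a 1 wherever the bits disagree
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


def is_single_bit_flip(expected, observed):
    """True when the two values differ in exactly one bit position."""
    return len(differing_bits(expected, observed)) == 1
```

For instance, `differing_bits(0x48, 0x4C)` reports position 2 only, so that pair counts as a single-bit flip, while 0x48 versus 0x4F differs in three positions and does not.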

0 Kudos
Andres_Intel
Employee
4,248 Views

Hello Saveno,

 


Thank you for completing the steps and keeping me informed.


I will continue with the investigation to provide you with further steps. As soon as I have further details I will let you know.

  


Regards,


Andres P.

Intel Customer Support Technician


0 Kudos
Andres_Intel
Employee
4,208 Views

Hello Saveno,

 

 

Thank you for waiting.



I have been working on the investigation and now I have the following questions:


  • Which device do you use for Stable Diffusion rendering? By default it uses the CPU. Also, do you use OpenVINO for it?

 

  

Regards,  

 

Andres P. 

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
4,183 Views

Well, the GPU of course. And I tried different versions with DirectML, ipex and lately OpenVINO. It happens on all of them. I also got the Error again, this time in World of Warcraft. I never had this before and started getting this Error after the latest driver update. I also already tried to reinstall everything via DDU, but it didn't help.

 

Screenshot 2023-08-27 200457.png

Since this sounds more like a RAM error this time, I did some memory tests, but they didn't return any errors.

0 Kudos
Andres_Intel
Employee
4,164 Views

Hello Saveno,

 

 

Thank you for your response and for letting me know the steps you have taken.


I will continue with the investigation to provide you with the next steps, and as soon as I have further information I will let you know.

 

  

Regards,  

 

Andres P. 

Intel Customer Support Technician 


0 Kudos
Andres_Intel
Employee
4,086 Views

Hello Saveno,

 

 

Thank you for your time.


I have been investigating the error message FAILURE_BUCKET_ID: MEMORY_CORRUPTION_ONE_BIT. Most of the time, updating the BIOS, the SATA drivers and the SSD firmware, and checking the cables for bad connections, makes the problem go away. Switching the data line to a different controller helps to narrow the problem down to specific drivers or to bugs in the SATA controller hardware.

You can also get this error from certain cache bugs in firmware or in hard-coded circuits in hard drives. If you cannot update the firmware or BIOS, you can sometimes turn off lazy writing on the drive to reduce the chances of hitting the error.


To rule out what I have just mentioned, please follow the steps below and let me know the results:


  • Test the system in a minimal state, i.e. only one RAM stick (test both), one SSD (the one with the OS), revert any non-default configuration in the BIOS (except the settings necessary for Arc, such as ReBAR, ASPM, etc.) and run the same test with and without the card installed.
  • If possible, also try to test the card on another PC.
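
The first step above combines two variables (which RAM stick is installed, and whether the Arc card is installed), so it amounts to four separate runs. A minimal sketch for listing them as a checklist follows; `build_test_matrix` and the labels "stick A", "stick B", "installed" and "removed" are hypothetical placeholders, not names from the thread.

```python
from itertools import product


def build_test_matrix(ram_sticks, card_states):
    """Return one test configuration per RAM stick and card state combination."""
    return [
        {"ram": ram, "arc_card": card}
        for ram, card in product(ram_sticks, card_states)
    ]


def format_checklist(matrix):
    """Render the configurations as numbered lines to tick off while testing."""
    return [
        f"{number}. RAM: {run['ram']}, Arc card: {run['arc_card']}"
        for number, run in enumerate(matrix, start=1)
    ]
```

Calling `build_test_matrix(["stick A", "stick B"], ["installed", "removed"])` gives four configurations; noting the result of the same workload for each one shows whether the crash follows a particular RAM stick or only appears with the card installed.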

 

  

Regards,  

 

Andres P. 

Intel Customer Support Technician 


0 Kudos
Saveno
New Contributor I
4,039 Views

Thanks for taking the time to look into the problem. It seems to have solved itself; at least it hasn't happened again since I reported it. About crashes in general, I had a system crash again the other day in World of Warcraft. Besides that, things have been stable, but I also didn't have much time to play last week. Things are for sure way more stable than before.

0 Kudos
Andres_Intel
Employee
4,032 Views

Hello Saveno,

 

 

Happy to hear that the issue is fixed, so we will close this thread. If you need any additional information, please submit a new question as this thread will no longer be monitored.


In case the World of Warcraft issue persists, to keep this thread organized and help you in the best way, please open a new thread as well.

 

  

Best regards, 

 

Andres P.   

Intel Customer Support Technician


0 Kudos