Intel® ARC™ Graphics
Get answers to your questions or issues when gaming on the world’s best discrete video cards with the latest news surrounding Intel® ARC™ Graphics

Random Freezes with Arc Series GPUs Thread

Saveno
New Contributor I
15,629 Views

First off, I'm not looking for support here. This thread is meant to find a solution for the common freezing problem a lot of people are having. Here are more than 20 posts about the issue, with many more people reporting the same problem in those threads' comments. There are many more reports in other places like Microsoft or tech forums, but I guess that should be enough for now.

Arc Issues - Pastebin.com

 

The Problem

Sometimes, out of nowhere when the Arc is under load, the PC just completely freezes. You sometimes still hear a buzzing sound from your headphones, and a few seconds later your system restarts. The weird thing, however, is that nothing seems to trigger it specifically. For example:

I can play a game like Black Desert on max settings and nothing happens for a whole day. The next day when I play, the freeze happens 1-2 times, then it's fine for the rest of the day. Same with AI tools such as Stable Diffusion, which has kind of become my "crash benchmark" for the Arc. On some days the freeze and restart happens after generating 2 or 3 images, and sometimes I can go out, let SD generate 100+ images, come back a few hours later and the system is still running.

This issue seems to happen with any Arc series card, which points to a driver problem.

 

The Hardware

Dozens of people have reported the same phenomenon and some posted their hardware, but there is nothing they have in common there either. Some people have Intel CPUs, some AMD ones; some have high-end hardware, others low- to mid-range hardware; people have different types of RAM sticks, different mainboards, different PSUs, etc.

 

What has been tried?

Intel Support gives the usual advice: uninstall the drivers via DDU, update and/or reset the BIOS, enable ReBAR, reinstall Windows, try a different PSU. The result is always the same: it does not fix the problem. Everyone also reports that this does not happen when they use their old or a different GPU. Old drivers do not help either, and neither does undervolting or underclocking. Overheating is not the issue either, since my Arc never goes above about 75° under max load.

 

Error Logs

Due to the instant crash, Windows sadly does not manage to create minidumps, and the Event Viewer reports nothing but an "unexpected system shutdown (Event-ID 41)". I was lucky once though: my system did not restart instantly and actually created a minidump. It is not that interesting, however. It only reported that the GPU driver had crashed.

SYMBOL_NAME:  igdkmdnd64+157e0

MODULE_NAME: igdkmdnd64

IMAGE_NAME:  igdkmdnd64.sys

STACK_COMMAND:  .cxr; .ecxr ; kb

FAILURE_BUCKET_ID:  0x116_IMAGE_igdkmdnd64.sys

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {7eb0cc99-c85b-4092-6430-8f1db059b7c1}

Followup:     MachineOwner
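
For anyone collecting more dumps like this, here is a minimal Python sketch of pulling the bug check code and the blamed driver image out of such a summary. Bug check 0x116 is documented by Microsoft as VIDEO_TDR_FAILURE, which fits a display driver that stopped responding. The function names and the assumed `CODE_IMAGE_file` shape of `FAILURE_BUCKET_ID` are illustrative only; bucket IDs in other dumps may be formatted differently.

```python
import re

# Bug check 0x116 is VIDEO_TDR_FAILURE. The table is intentionally tiny
# and only illustrative; extend it as needed.
KNOWN_BUGCHECKS = {0x116: "VIDEO_TDR_FAILURE"}


def parse_analysis(text):
    """Turn 'KEY: value' lines from a debugger summary into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Extract bug check code and driver image from FAILURE_BUCKET_ID.

    Assumes the bucket looks like '0x116_IMAGE_igdkmdnd64.sys'.
    Returns None when the bucket does not have that shape.
    """
    bucket = fields.get("FAILURE_BUCKET_ID", "")
    match = re.match(r"(0x[0-9A-Fa-f]+)_IMAGE_(.+)", bucket)
    if not match:
        return None
    code = int(match.group(1), 16)
    return {
        "bugcheck": hex(code),
        "name": KNOWN_BUGCHECKS.get(code, "unknown"),
        "image": match.group(2),
    }


if __name__ == "__main__":
    # Two lines copied from the summary above.
    dump = "IMAGE_NAME:  igdkmdnd64.sys\nFAILURE_BUCKET_ID:  0x116_IMAGE_igdkmdnd64.sys\n"
    print(summarize(parse_analysis(dump)))
```

Run over many dumps, the same two functions would show quickly whether every crash lands in the same bucket or whether other modules appear as well.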

 

Since Intel Support is not really helpful in this matter (nothing against you guys, I know you're doing what you can), maybe we can gather data over the next months or so that helps solve this problem via a driver update.

Also a small tip: if I press Win + Ctrl + Shift + B fast enough to restart the GPU driver, it sometimes happens that my system does not shut down and actually recovers. You have to do that the moment you notice your PC is about to freeze, however.

 

Also made a Post on Reddit about it here since Reddit is more active: Random Freezes with Arc Series GPUs Thread : IntelArc (reddit.com)

49 Replies
Jose_Intel
Employee
8,909 Views

Hello @Saveno

 

Thank you for posting on the Intel® communities.

 

I am sorry to hear that you have problems with Intel® Arc™ Dedicated Graphics Family, and I’ll be more than happy to help you.

 

Thank you for taking the time to share that useful information with us. Please let us check this internally. As soon as we have an update, we will post it here.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Jose_Intel
Employee
8,880 Views

Hello everyone!

 

We already submitted the feedback provided, and we will work on it. If more people can share more information here in this thread, it would be appreciated.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,870 Views

Thank you for acknowledging the issue. There is activity on the Reddit thread I made about it, but nothing so far that helps find the cause. I've gone through pretty much all these posts and, as I said, hardware-wise people have nothing in common. The only thing everyone has in common is that it only crashes under roughly max load, in AAA games for example, but it doesn't seem to be a power issue since it also happens when you undervolt and power-limit the GPU. And since the crashes are random and sometimes don't happen for days, it feels impossible to figure out the issue. Maybe we need some sort of debug tool that logs what the GPU is doing to determine what caused the crash in the end.

I guess I will just let GPU-Z run in the background for a week or so, since it has a "Log to file" option that records the GPU's temperature, MHz, usage, etc.

Maybe these logs will have something in common before the crashes happen.

 

/edit

The GPU-Z log doesn't show anything suspicious either. It creates a log entry every second but doesn't record anything unusual before the crash. Last night I did a "stress test" and generated 100 images via Stable Diffusion over 3 hours with the Arc under max load; nothing happened and everything was stable. Today I started my PC, tried to repeat the "stress test" and got 2 crashes (since then everything has been stable again).
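
One way to go through a week of such logs without reading them by hand is to look for gaps in the timestamps: a log that is written every second and then jumps by minutes most likely stopped because the machine froze or rebooted. The following is a minimal Python sketch under stated assumptions: the log is comma-separated, the first column holds a timestamp in `YYYY-MM-DD HH:MM:SS` form, and cells are padded with spaces. The function name and parameters are made up for illustration, and the column layout should be checked against a real log file first.

```python
import csv
import io
from datetime import datetime, timedelta


def rows_before_gaps(log_text, gap=timedelta(seconds=30), keep=3):
    """Return, for each timestamp gap, the last `keep` rows before it.

    Assumption: column 0 is a timestamp like '2023-07-28 10:31:10'.
    Every returned row is a dict of the stripped header names plus a
    parsed '_time' entry.
    """
    reader = csv.reader(io.StringIO(log_text))
    header = [h.strip() for h in next(reader)]
    rows = []
    for raw in reader:
        cells = [c.strip() for c in raw]
        if len(cells) < len(header) or not cells[0]:
            continue  # blank line, or a line cut short by the crash
        row = dict(zip(header, cells))
        row["_time"] = datetime.strptime(cells[0], "%Y-%m-%d %H:%M:%S")
        rows.append(row)

    incidents = []
    for i in range(1, len(rows)):
        if rows[i]["_time"] - rows[i - 1]["_time"] > gap:
            incidents.append(rows[max(0, i - keep):i])
    return incidents
```

Comparing the rows returned for several crashes side by side (clock, temperature, power, load) is then a quick way to confirm whether anything really is common to the seconds before each freeze.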

 

0 Kudos
Yoel-S
Beginner
8,838 Views
I'm having the same issue on 2 different laptops, mostly when using Photoshop. I'm going crazy. I have tried all the troubleshooting steps. Intel needs to fix this ASAP, please!
0 Kudos
Saveno
New Contributor I
8,817 Views

Just a quick update. After following the advice of someone on Reddit to add a TdrDelay entry to the registry, my screen now only turns black for a few seconds, then the driver seems to recover and the screen turns on again. Before, this always resulted in an instant PC reboot, but the registry trick seems to give the driver enough time to recover from its crash. It could of course just be luck again, though. If anyone wants to try it, just look up a YouTube video about TdrDelay. I have set my delay to 60 seconds.

A PC restart still seems to be required though, for Stable Diffusion at least, as it keeps saying that the GPU has been suspended even after restarting the program. Next time this happens I will check whether games run fine after the driver has recovered from the crash.
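
For reference, `TdrDelay` is a `REG_DWORD` value, in seconds, under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`; Windows uses 2 seconds when the value is absent. The Python sketch below only builds the text of a `.reg` file for a given delay, so it can be inspected before anything touches the registry. The function name is made up for illustration.

```python
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg(seconds):
    """Return the text of a .reg file that sets TdrDelay to `seconds`."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must be a positive 32-bit integer")
    return (
        "Windows Registry Editor Version 5.00\r\n\r\n"
        f"[{TDR_KEY}]\r\n"
        # .reg files store DWORD values as 8 hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}\r\n'
    )


if __name__ == "__main__":
    # 60 seconds, the delay mentioned above.
    print(tdr_delay_reg(60))
```

Saved to a file and imported with administrator rights, this should have the same effect as adding the value by hand in regedit; a reboot is needed before the new timeout applies. Note that a longer timeout only gives the driver more time to recover and does not remove whatever makes it hang.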

As a result, the Event Viewer now also gives different Warnings than before.

Warning 1

Display driver igfxnd stopped responding and has successfully recovered.

System
  Provider [Name]: Display
  EventID: 4101 [Qualifiers: 0]
  Version: 0
  Level: 3
  Task: 0
  Opcode: 0
  Keywords: 0x80000000000000
  TimeCreated [SystemTime]: 2023-07-28T10:31:11.2188246Z
  EventRecordID: 83097
  Correlation
  Execution [ProcessID]: 11024 [ThreadID]: 0
  Channel: System
  Computer: PC
  Security
EventData
  igfxnd

 

Warning 2

The Intel(R) Graphics System Controller Firmware Interface is being reset.

System
  Provider [Name]: GSCx64
  EventID: 1 [Qualifiers: 32775]
  Version: 0
  Level: 3
  Task: 0
  Opcode: 0
  Keywords: 0x80000000000000
  TimeCreated [SystemTime]: 2023-07-28T10:31:12.5891111Z
  EventRecordID: 83098
  Correlation
  Execution [ProcessID]: 4 [ThreadID]: 7984
  Channel: System
  Computer: PC
  Security
EventData
  00000000010000000000000001000780000000000000000000000000000000000000000000000000

 

I don't know if it's useful, I'm just trying to post as much info as possible.
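
The two warnings are timestamped a little over a second apart, which suggests they belong to one incident. When many such events pile up, grouping them by time makes that easier to see. This is a small Python sketch, assuming events have been collected by hand as `(SystemTime, provider, event ID)` tuples; the helper names and the 5-second window are arbitrary choices for illustration.

```python
from datetime import datetime


def parse_system_time(value):
    """Parse a SystemTime such as 2023-07-28T10:31:11.2188246Z.

    The value has 7 fractional digits; datetime keeps 6, so the last
    digit is dropped.
    """
    whole, _, fraction = value.rstrip("Z").partition(".")
    fraction = (fraction + "000000")[:6]
    return datetime.strptime(f"{whole}.{fraction}", "%Y-%m-%dT%H:%M:%S.%f")


def group_incidents(events, window_seconds=5.0):
    """Group events whose timestamps lie within `window_seconds` of the previous one."""
    ordered = sorted(events, key=lambda e: parse_system_time(e[0]))
    groups = []
    for event in ordered:
        when = parse_system_time(event[0])
        if groups:
            previous = parse_system_time(groups[-1][-1][0])
            if (when - previous).total_seconds() <= window_seconds:
                groups[-1].append(event)
                continue
        groups.append([event])
    return groups


if __name__ == "__main__":
    # Timestamps, providers and event IDs taken from the two warnings above.
    events = [
        ("2023-07-28T10:31:12.5891111Z", "GSCx64", 1),
        ("2023-07-28T10:31:11.2188246Z", "Display", 4101),
    ]
    for group in group_incidents(events):
        print([(provider, event_id) for _, provider, event_id in group])
```

With the two events above this yields a single group, with the display driver recovery first and the GSC firmware interface reset about 1.37 seconds later.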

0 Kudos
Jose_Intel
Employee
8,807 Views

Hello everyone!

 

Thank you for that important information. We will continue working on it, and thank you for your patience.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Yoel-S
Beginner
8,798 Views
Have you found the issue? When is a fix planned?
0 Kudos
Saveno
New Contributor I
8,793 Views

It's unlikely that they will find the issue unless they manage to replicate it on one of their systems, which does not seem to be the case considering that people already reported this last year. If all their systems are running flawlessly, it's probably almost impossible to figure out what the problem is. Someone with the issue would probably have to send them their whole system to figure out what's wrong, lol.

 

Also another small update.

Looks like I was just lucky that the driver recovered yesterday. Today the crash restarted my system again. I've noticed that I can trigger the crash by putting the Arc under max load when I start my PC for the first time in the morning. So I started my PC, instantly opened Stable Diffusion, generated an image and my system crashed. It's almost as if the Arc doesn't like being put under load when it's "cool" and has just "woken up" or been idle. Now, after the crash, everything is stable again. Looks like my solution will be to crash my system with Stable Diffusion on purpose after starting my PC so it's stable for the rest of the day, lol.

0 Kudos
Saveno
New Contributor I
8,731 Views

Another small update, even though I'm probably just having a few lucky days, since technically this fix doesn't make much sense to me.

Ever since setting Precision Boost in the BIOS to "Eco", so my CPU no longer uses more than 40 Watt, I haven't had a single crash. As I said before, when I generated something in Stable Diffusion right after starting my PC, it was always a 100% system crash, but for 2 days now I haven't had a single one. I played AAA games all weekend, generated lots of images and nothing happened.

However, I'm sceptical for now, because it doesn't really make sense that this fixes the problem: Stable Diffusion does not put any load on the CPU at all, and the crashes only happen when the Arc is under 100% load. I'm testing it right now, and SD uses 0% of the CPU in Task Manager, but 100% of the GPU. The games I play are also not CPU-heavy at all.

I don't know what happens "behind the scenes", but I assume that even if the CPU is not under any load it works together with the Arc in one way or another, and there is probably more to Eco Mode than just lower power usage. If this is "the fix for me", it sucks that I lose some CPU performance, but since I'm not using any programs or playing any games that are CPU-heavy it should be fine.

I will test it for a few more days and, if the crashes are really gone, I will get in touch with some redditors who reported the same problem in my post to see if it solves it for them as well. I will report back if the crashes happen again, or once I have heard back from other affected people, to see if this is a potential solution.

0 Kudos
Jose_Intel
Employee
8,711 Views

Hello Saveno!

 

Thank you for providing that information.

 

We are more than willing to look for a solution here. So, in order to have more information about your system, please download, install and run Intel® System Support Utility for Windows. Make sure you check “Everything” before you scan, then save the report and attach it to your response.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,673 Views

As I said, it seems that I have found a temporary fix for now. I haven't experienced any crashes in the past days after putting Precision Boost Overdrive into Eco Mode, which limits my CPU to 45 Watt and, I guess, limits other things too. It's a weird solution considering my PC only crashed when the Arc was under heavy load, but if it works, it works. I'm sadly not smart enough to see the connection here, since the crashes also happened in Stable Diffusion, which used 100% of the GPU but at most 1% of the CPU. I guess some things are happening in the background and the Arc puts the CPU under load even if Task Manager does not show it.

It's of course not the solution I was hoping for, as I lose some CPU performance, but I'm not seeing a significant drop in FPS or performance in games, so it's fine for now.

I have attached the report anyway though.

0 Kudos
Jose_Intel
Employee
8,651 Views

Hello Saveno!

 

Thank you for the information.

 

I noticed that you are currently using the Beta version of the graphics driver. Could you please try our latest version 31.0.101.4577? After installing that, please revert the changes in BIOS, just to see if that makes a difference.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,607 Views

I've installed the new Beta drivers for Baldur's Gate from yesterday and reverted the change in the BIOS. I will report back if my system starts crashing again. If it hasn't by Monday, I will let you know too.

/edit

System Crashed again. Going back to Eco Mode.

0 Kudos
Jose_Intel
Employee
8,601 Views

Hello Saveno

 

I am sorry to hear that the issue persisted.

 

Let us check it internally. As soon as we have an update, we will post it here.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,577 Views

Alright, the crashes have started happening again. It worked fine for a week after putting the CPU into Eco Mode; now they are randomly back even though I didn't change anything. Returning the card to have it checked is not an option either, as others with the same issue have already reported that the crashes continued even after they got a replacement. I don't see any option anymore other than selling my Arc, taking the loss and getting another card instead.

0 Kudos
Jose_Intel
Employee
8,513 Views

Hello Saveno

 

Thank you so much for the information and the feedback. We really want to get this fixed. We'll keep the thread open so more customers can chime in.

 

"App freezes" is a broad description: a freeze can be caused by a hardware malfunction (CPU, GPU, RAM, etc.) or by software (OS, drivers, app code), so we'll continue working on the reported issues app by app.

 

For now, let’s focus on Stable Diffusion, which is one of the tools that has been triggering the crashes. Could you share some steps for us to try to replicate the issue? It would be very helpful.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,489 Views

I'm using the DirectML Version from lshqqytiger/stable-diffusion-webui-directml: Stable Diffusion web UI (github.com)

 

It happens with any simple generation, for example 512x512 with Hires fix upscale x2.

Here, one of 2 things can occur.

I just get a black screen for 2-3 seconds, and afterwards Stable Diffusion tells me that the GPU has been suspended and the program does not work any more. It keeps giving me the error even after restarting the program, until I restart my computer. Games and everything else still work fine, however, after the Intel GPU driver has recovered.

or

My system freezes and hard resets.

It often happens 1-2 times after starting the PC for the first time that day, then it's usually fine for the rest of the day. It also happens randomly sometimes, but then too it only happens 1-2 times and is fine again for hours, sometimes even days.
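
Since a hard reset leaves no trace of what was running, one way to turn the pattern above into something that can be counted is a small run journal: write a "start" line to disk before each generation and a "done" line after it, forcing each line out with `fsync` so it survives the reset. The Python sketch below is hypothetical; the function names, the JSON-lines layout and the idea of wrapping each Stable Diffusion run with it are assumptions, not part of any existing tool.

```python
import json
import os
import time


def journal(path, run_id, phase, **details):
    """Append one JSON line and force it to disk before returning."""
    record = {"time": time.time(), "run": run_id, "phase": phase, **details}
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
        handle.flush()
        os.fsync(handle.fileno())  # make sure the line survives a hard reset


def unfinished_runs(path):
    """Return run ids that have a 'start' entry but no 'done' entry."""
    started, finished = [], set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # blank line, or a line cut off by the reset
            if record["phase"] == "start":
                started.append(record["run"])
            elif record["phase"] == "done":
                finished.add(record["run"])
    return [run for run in started if run not in finished]
```

Calling `journal(path, n, "start", size="512x512")` before run `n` and `journal(path, n, "done")` after it would, over a few weeks, show how many crashes really fall on the first runs after a cold boot and how many come later.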

 

But as I said, it happens with any GPU-demanding application, so mostly in AAA games (Cyberpunk, Hogwarts Legacy...) and games at high resolutions like Diablo 4. It even happens in World of Warcraft when everything is maxed out and Render Scaling is set to 200%. I have had multiple crashes in pretty much every game I played over the past months. The crashes are much rarer when you lower your graphics settings or cap the frame rate at 60 FPS.

I doubt it's any of my other hardware: these issues did not occur with my old GPU, it only happens when the Arc is under load, and many people have said the same. I like using Stable Diffusion as an example here because it does not use the CPU at all but uses 100% of the GPU, so it's a good way to show that the Arc is the problem. Some people have even used their warranty already, got a new Arc, and the crashes still happened.

0 Kudos
Yoel-S
Beginner
8,509 Views

Unfortunately, I had to disable the Arc graphics card in Device Manager and go back to driver 30.XXXX in order to stop the crashing and get clear fonts in Photoshop. This is very sad for a brand new, expensive laptop.

0 Kudos
Jose_Intel
Employee
8,476 Views

Hello Saveno

 

We will check the information internally. As soon as we have an update, we will let you know by posting here.

 

Best regards,

Jose B.

Intel Customer Support Technician


0 Kudos
Saveno
New Contributor I
8,362 Views

Thank you for taking the issue seriously and not just saying "make use of your warranty" or something. It's obviously a widespread issue and I see more and more posts about it popping up. I also see a lot of threads where people think that it's game XYZ's fault that their computer froze and crashed, but I'm sure it's fundamentally the same problem. For example:

Issues with Intel ARC 750 screen glitching and freezing after running Baldurs Gate 3 - Intel Community

Atomic Heart Annihilation Instinct crash 0x00000116 ARC 770 8GB - Intel Community

Screen flashes black in Chrome/Edge - Intel Community

I sometimes get those short black screens too, by the way, when I just use a browser and watch a video or something. The screen turns black for a second but then comes back. It's probably connected to the freeze problem in one way or another. It kind of feels like the Arc loses power for a second due to a driver crash, but when it's not under heavy load it manages to recover. If it's under heavy load, however, it doesn't manage to get back from 0 to 100 and crashes the system instead.

0 Kudos