Intel® ARC™ Graphics

Report on experiences with Intel Arc A770 freezes on multi GPU system and possible solutions

JanT
Novice

I bought an Intel Arc A770 16GB Limited Edition for experiments with machine learning and OpenVINO. It was a cheaper unit that someone had used for a while and then returned to the seller. I have had a lot of trouble with this card.

 

First I put it into a system with a Ryzen 2600X on an X470 motherboard that does not support ReBAR. Although ReBAR is recommended, I thought that for machine learning I would not care about slower gaming performance: as long as the neural network fits inside the 16GB of VRAM, it should not pose any problem. And indeed, Stable Diffusion in OpenVINO worked excellently in a Jupyter notebook on Windows. Then I tried IPEX (Intel Extension for PyTorch) in WSL, and it always led to a BSOD. The computer was also running an NVIDIA GTX 1650 Super in the second PCIe slot (bifurcation x8/x8) so that it also supported CUDA.

 

I moved the Intel Arc A770 to another computer with a Ryzen 5600G and a B450 chipset that supports ReBAR. There IPEX did not crash, and Stable Diffusion runs in OpenVINO too. So ReBAR appears to be a MUST for the GPU to work at all, although it should be only an optional feature that speeds things up. I also moved the NVIDIA GTX 1650 Super to that computer, but as B450 does not support PCIe bifurcation, it had to go into a PCIe 2.0 x4 slot. The CPU is an APU with integrated AMD graphics, so the A770 runs only in PCIe 3.0 x8 mode.
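For anyone who wants to check whether IPEX can see the card before running a full workload, here is a minimal smoke-test sketch. The function name `xpu_status` is made up for this example, and it assumes a PyTorch build with the `torch.xpu` backend (either registered by importing `intel_extension_for_pytorch` or shipped natively in newer PyTorch builds). It only queries the device and does not reproduce or fix the BSOD.

```python
def xpu_status() -> str:
    """Report whether PyTorch can see an Intel XPU device, without raising."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    try:
        # Assumption: on IPEX-based setups this import registers the 'xpu' backend.
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        # IPEX missing or failed to load; newer PyTorch builds may still expose torch.xpu.
        pass

    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return "no XPU device visible"
    return f"{xpu.device_count()} XPU device(s), first: {xpu.get_device_name(0)}"


if __name__ == "__main__":
    print(xpu_status())
```

If this reports no XPU device inside WSL, the problem is probably in the driver or WSL setup rather than in the model code.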

 

One of my three monitors has to be connected to the NVIDIA GPU because it needs color correction (it is too reddish), but it is the only one that can be rotated vertically. Unfortunately, the Intel Arc A770 does not support color correction the way NVIDIA does.

 

Anyway, I was using the computer and it started to freeze randomly. It was quite terrible, but what helped A LOT was disabling IOMMU in the BIOS and setting the chipset PCIe devices to version 1.0. Interestingly, with IOMMU enabled and the display connected to the Arc A770, even the BIOS was very slow and laggy.

 

The computer was also freezing randomly in Windows, but this was mostly resolved by updating the A770 drivers. With this setup it mostly worked. I even stress-tested it by running Stable Diffusion on the Intel card (the A770 is good for inference with OpenVINO) while at the same time training a Keras neural network on the NVIDIA GPU and rendering the BMW scene in Blender on the integrated AMD APU.

 

Another set of freezes happened when the computer had not been used for some time: as soon as I started doing something again, it usually froze. This improved after I disabled PCIe power saving in the Windows power management settings. I also disabled ALL Intel services and the Control Center in Windows. This helped quite a lot for some time.

 

But because the old monitor is connected to an older Linux computer via DVI-D, I had to use an HDMI-to-VGA adapter connected to the NVIDIA GPU. If the computer had two NVIDIA GPUs, this would not cause problems, but with the Arc A770 it was causing random blank screens and freezes. Yes, a badly connected monitor on the NVIDIA card in combination with the A770 causes Windows to freeze. I replaced the adapter with an active DisplayPort-to-VGA one and the freezes stopped, but the active adapter failed after several days and I had to RMA it. So I had to go back to the HDMI-to-VGA adapter, and the freezes and BSODs started again.

 

Then I found an article (https://www.pugetsystems.com/support/guides/updated-watchdog-dpc/) saying that I need to enable MSI (Message Signaled Interrupts) mode in the Windows registry. It was not enabled for the NVIDIA card, so I enabled it and checked that the IRQ is negative in the system settings. In the morning I had two BSODs, but since then I have been working for several hours without one. It is very weird that a problematic monitor adapter connected to another GPU can cause BSODs when the computer also contains an A770, and that enabling MSI on ANOTHER device seems to prevent the A770 from crashing. People experiencing freezes should maybe install chipset drivers directly from Intel or AMD, or manually enable MSI on devices like their NIC or sound card...
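To check whether MSI is already enabled for a device without clicking through the registry editor, something like the following read-only sketch may help. It assumes the usual location of the setting, the `MSISupported` DWORD under `HKLM\SYSTEM\CurrentControlSet\Enum\<device instance path>\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties`. The function names `msi_key_path` and `read_msi_supported` are made up for this example, and the device instance path has to be copied from Device Manager (Details tab, "Device instance path").

```python
import sys

ENUM_ROOT = r"SYSTEM\CurrentControlSet\Enum"
MSI_SUBKEY = (
    r"Device Parameters\Interrupt Management"
    r"\MessageSignaledInterruptProperties"
)


def msi_key_path(device_instance_path: str) -> str:
    """Build the registry path (relative to HKEY_LOCAL_MACHINE) holding MSISupported."""
    return "\\".join([ENUM_ROOT, device_instance_path.strip("\\"), MSI_SUBKEY])


def read_msi_supported(device_instance_path: str):
    """Return the MSISupported value (1 means MSI is enabled), or None if it is absent."""
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE, msi_key_path(device_instance_path)
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "MSISupported")
            return value
    except FileNotFoundError:
        # Key or value missing: MSI was never configured for this device.
        return None


if __name__ == "__main__" and sys.platform == "win32" and len(sys.argv) > 1:
    # Usage: python check_msi.py "<device instance path from Device Manager>"
    print(read_msi_supported(sys.argv[1]))
```

The sketch only reads the value. Changing it still requires administrator rights and a reboot, as described in the linked article.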

 

My setup with three GPUs is not standard, but I hope it will continue to be supported and that things will improve over time. Now I am waiting for another freeze. It is weird that since I added the A770 to the computer, there are days when it freezes or BSODs several times a day, and then it works fine for some time, usually after some intervention found on a forum or in an article. But did the intervention actually help every time, or not? Weird stuff...

2 Replies
JanT
Novice

OK, several days without a freeze. It seems that enabling MSI on the NVIDIA card definitely helps. Maybe people experiencing freezes have some other device that needs MSI enabled.

 

More here: https://community.intel.com/t5/Graphics/Multi-GPU-setup-possible-solution-to-freezing/m-p/1509220#M120557

zenon
Novice

Are there any downsides to enabling MSI on the NVIDIA card?
