Intel® NUCs
Support for Intel® NUC products

NUC11PAHi7 graphics crashes and hangs

rkiv
Novice
2,766 Views

My NUC11 has significant problems that show up as crashes and hangs, seemingly related to graphics. The system sometimes boots into the OS (Windows 10 64-bit 21H2) and runs for a while, then the screen goes black or garbled, sometimes with obnoxious audio spewing out as well. Sometimes the screen goes garbled during boot while showing the NUC logo, and it also happens in the BIOS config screen. During troubleshooting I had a continuous ping running from another machine, and when these events happen, the NUC's network dies as well. Interestingly, the fan keeps running and I do see activity lights on the LAN connection.
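For anyone wanting to pin down exactly when the hangs occur from the continuous-ping side, here is a minimal Python sketch. It assumes the ping results have already been collected as (timestamp, replied) pairs; the function name `find_outages` and the 5-second `min_gap` default are made up for illustration and are not part of any Intel or Windows tool.

```python
from datetime import datetime, timedelta


def find_outages(samples, min_gap=timedelta(seconds=5)):
    """Return (start, end) pairs for runs of failed pings lasting at least min_gap.

    samples: iterable of (timestamp, replied) tuples in chronological order.
    min_gap: hypothetical threshold so a single dropped packet is not reported.
    """
    outages = []
    start = None      # timestamp of the first failed ping in the current run
    last_fail = None  # timestamp of the most recent failed ping in the run
    for ts, replied in samples:
        if not replied:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            # The host answered again, so the run of failures has ended.
            if last_fail - start >= min_gap:
                outages.append((start, last_fail))
            start = None
    # A run still open at the end of the log means the host never came back.
    if start is not None and last_fail - start >= min_gap:
        outages.append((start, last_fail))
    return outages
```

The outage start times can then be lined up against Windows event log or hardware monitor timestamps to see what the machine was doing just before it dropped off the network.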

Other circumstantial weirdness: the Realtek driver update had some trouble installing. Via the Intel Support experience, the package said it needed updating, would try to update, and report success; then after reboot the page suggested it wasn't updated. I then tried installing the direct download, and that seemed to work but also took a few reboots.

I'm at a loss here.  I have the Driver & Support Assistant report if that's useful(?)

 

Here are some highlights:

 

BIOS Version: PATGL357.0042.2021.1213.1702
BIOS Date: 12/13/2021
SMBIOS Version: 3.3

graphics: Version 30.0.101.1191, Date 12/3/2021

audio:
intel: Version 10.29.0.6040, Date 7/7/2021
realtek: Version 6.0.9247.1, Date 10/5/2021

 

some screen shots of what happens (other than just going black):

[screenshots: rkiv_1-1645901719746.png, rkiv_2-1645901768330.png]

 

 

31 Replies
rkiv
Novice
2,060 Views

Btw, I should add that I've tried various display connections via HDMI, DisplayPort, USB-C to HDMI connections - all the same.

LeonWaksman
Super User
2,049 Views

I suggest running a memory test using MemTest86.

 

Leon

 

rkiv
Novice
1,981 Views

That was a good thought (and I should have thought of that). But I'm not sure it's the memory itself. I suspect something is going on with the "upper" SODIMM slot, or with the way the system handles both slots fully loaded (2x32GB) with 3200-speed memory, or maybe a thermal issue(?)

Is there some kind of lower-level SMBIOS or SoC logging I could enable to troubleshoot further? My next step, begrudgingly, is to buy a second set of 2x32 memory to try.

Anyway, here are some interesting test results:

notes

  • I don't think I was experiencing any trouble until I upgraded to the new v42 BIOS and latest graphics driver, etc.  At least I was able to go through Windows install, installed various apps and had several days of light, preliminary usage up to that point.
  • The memory is 2x32GB sticks of Crucial DDR4-3200 (CT2K32G4SFD832A).   
  • I also have a NUC10i7FNH with 2x32GB Samsung DDR4-2666 I used during the sleuthing. 
  • I have the memtest logs from all this if those are useful.  No memtest has revealed a single error though.
  • I haven't yet seen it fail with just one of the Crucial sticks installed, though I've only tested that in the lower SODIMM slot. It doesn't seem to want to boot with only one stick in the upper slot. Is that expected?

mem tests

Test 1 - both Crucial sticks in the NUC11 - system hung about 1.6 hours into the first pass, during the hammer test, and just halted. The log shows no memory errors up to that point. The very last line logged during the test was "2022-02-26 14:34:29 - Current CPU temperature: 57C"

Tests 2&3 - Ran Windows memory diag on each stick individually in the "lower/closer-to-the-board" slot (both passed). I ran the Windows diag because I had trouble getting the machine to boot back into the memtest USB: the screen either didn't display the F10 boot menu or hung trying to get into it.

Test 4 - Not a memtest, but took the Samsung memory from the NUC10 in various configs (1x32, 2x32) and machine seemed to operate just fine, didn't experience any failures after a bit of usage.

Test 5 - memtest86 on 2x32 Crucial, but installed in the NUC10 (had to run overnight).  No errors on 4 passes.  Granted, max speed supported for that machine is 2666, so not sure of test validity (?)
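To check the thermal theory against the memtest logs, the temperature lines can be pulled out and inspected. This is a rough Python sketch that assumes every temperature entry has the same shape as the single line quoted in Test 1; the rest of the log format isn't shown here, so the regular expression and the function names are guesses for illustration only.

```python
import re
from datetime import datetime

# Pattern modeled on the one log line quoted above; other lines are ignored.
TEMP_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - Current CPU temperature: (\d+)C$"
)


def parse_temperatures(log_text):
    """Return a list of (timestamp, degrees_c) for each temperature line found."""
    readings = []
    for line in log_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            readings.append((stamp, int(match.group(2))))
    return readings


def last_reading(log_text):
    """Return the final (timestamp, degrees_c) before the log stops, or None."""
    readings = parse_temperatures(log_text)
    return readings[-1] if readings else None
```

If the last reading before a hang is well below the CPU's limit, as the 57C in Test 1 suggests, a thermal cause looks less likely for that particular failure.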

 

 

other random things

  • Running the Samsung 2666 2x32 memory in the NUC11 doesn't seem to manifest the problem - or at least so far.  Swapping the Crucial back in, it ran fine for a while then failed eventually.   Then continued to fail pretty regularly.
  • A few times after failure the machine took forever to post and eventually displayed an error "the system BIOS has detected unsuccessful POST attempt(s). Possible causes include recent changes to BIOS Performance options or recent hardware changes." (press y to enter BIOS setup blah blah)  Pressing Y however wouldn't bring it into the BIOS.
  • I ran the Intel Processor Diagnostic Tool and the thing passed, though I will note I also had HWMonitor running and the CPUs got crazy hot, maxed at 100 C.  I ran the same with the Samsung memory installed and things didn't seem to get as hot.  Could be a red herring, but I've started to wonder if there's some weird thermal issue at play.
  • I had HWiNFO running and logging a few times when the fault happened.  I've attached the log in case that's useful, no idea how to really interpret.
  • side note, probably irrelevant - the Intel NUC Software Studio service logs an "Invalid Query" error into its event log every 2 seconds continuously.
  • side note, in device manager, the Detection Verification object shows a warning that it failed to load - driver is a Microsoft driver  (WUDFRd.sys) dated recently, 2-18-2022.
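On interpreting the hardware monitor log: one simple approach is to scan the CSV export for rows where a temperature column crosses a threshold. The sketch below is Python and is only an illustration. The column name has to be copied from the header of the actual log file, and the names `hottest_rows`, `column` and `threshold` are hypothetical.

```python
import csv
import io


def hottest_rows(csv_text, column, threshold):
    """Return (row_number, value) for rows where `column` is at or above threshold.

    csv_text: contents of a CSV log with a header row.
    column: header name of the sensor to inspect (an assumption; copy it
            from the real file).
    """
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip blank or non-numeric cells
        if value >= threshold:
            hits.append((row_number, value))
    return hits
```

Comparing the rows flagged just before a fault with the rows from a healthy run would show whether the 100 C peaks actually line up with the failures.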

 

FWIW, here are the non-Microsoft drivers loaded/installed:

Driver Name | Service Name | Index | File Type | Description | Version | Digital Signature | Company | Product Name | Created Date
e2f68.sys | e2fexpress | 113 | Network Driver | Intel(R) Ethernet Adapter NDIS driver | 1.0.2.13 | INTELEPGSW2022 | Intel Corporation | Intel(R) Ethernet Adapter | 1/20/2021 0:48
gna.sys | IntelGNA | 103 | Unknown | Intel (R) GNA device driver (10.64.2.19041) | 3.0.0.1400 | Intel Corporation | Intel Corporation | Intel (R) Gaussian & Neural Accelerator | 2/25/2022 16:36
iaLPSS2_GPIO2_TGL.sys | iaLPSS2_GPIO2_TGL | 119 | System Driver | Intel(R) Serial IO GPIO Driver v2 | 30.100.2129.8 | Intel Corporation | Intel Corporation | Intel(R) Serial IO Driver | 7/19/2021 23:06
iaLPSS2_I2C_TGL.sys | iaLPSS2_I2C_TGL | 110 | System Driver | Intel(R) Serial IO I2C Driver v2 | 30.100.2129.8 | Intel Corporation | Intel Corporation | Intel(R) Serial IO Driver | 7/19/2021 23:06
ibtusb.sys | ibtusb | 138 | Unknown | Intel(R) Wireless Bluetooth(R) Filter Driver | 22.110.2.1 | Intel Corporation | Intel Corporation | Intel(R) Wireless Bluetooth(R) | 1/22/2022 13:24
igdkmdn64.sys | igfxn | 102 | Display Driver | Intel Graphics Kernel Mode New Driver | 30.0.101.1191 | Intel Corporation | Intel Corporation | Intel HD Graphics Drivers for Windows(R) | 12/10/2021 11:28
Netwtw10.sys | Netwtw10 | 107 | Network Driver | Intel® Wireless WiFi Link Driver | 22.30.0.11 | Intel Wireless Driver | Intel Corporation | Intel® Wireless WiFi Link Adapter | 1/24/2021 4:57
PerformanceDriver.sys | PerformanceDriver | 124 | System Driver | Intel(R) NUC Performance Driver | 1.0.0.12 | Intel Corporation | Intel Corporation | Intel(R) NUC Performance Driver | 7/29/2021 18:22
TbtBusDrv.sys | nhi | 106 | System Driver | Thunderbolt(TM) Bus Driver | 1.41.1193.0 | Intel Corporation | Intel Corporation | Thunderbolt(TM) Bus Driver | 2/18/2022 16:53
TeeDriverW10x64.sys | MEIx64 | 112 | System Driver | Intel(R) Management Engine Interface | 2131.1.4.0 | Intel Corporation | Intel Corporation | Intel(R) Management Engine Interface | 8/19/2021 5:30
IntcAudioBus.sys | IntcAudioBus | 114 | Sound Driver | Intel® Smart Sound Technology (Intel® SST) Bus | 10.30.0.6040 | Intel Corporation | Intel(R) Corporation | Intel® Smart Sound Technology (Intel® SST) Bus | 2/26/2022 9:08
IntcOED.sys | IntcOED | 134 | Sound Driver | Intel® Smart Sound Technology (Intel® SST) OED | 10.30.0.6040 | Intel Corporation | Intel(R) Corporation | Intel® Smart Sound Technology (Intel® SST) OED | 2/26/2022 9:08
IntcUSB.sys | IntcUSB | 137 | Sound Driver | Intel® Smart Sound Technology (Intel® SST) | 10.30.0.6040 | Intel Corporation | Intel(R) Corporation | Intel® Smart Sound Technology (Intel® SST) | 2/26/2022 9:08
RtsPer.sys | RTSPER | 202 | System Driver | RTS PCIE READER Driver | 10.0.19041.21342 | Realtek Semiconductor Corp. | Realsil Semiconductor Corporation | Windows (R) Win 7 DDK driver | 5/5/2021 20:10
RTKVHD64.sys | IntcAzAudAddService | 135 | Sound Driver | Realtek(r) High Definition Audio Function Driver | 6.0.9247.1 | Realtek Semiconductor Corp. | Realtek Semiconductor Corp. | Realtek(r) High Definition Audio Function Driver | 2/26/2022 9:08
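Since the trouble seemed to start around the time of the BIOS and driver updates, one quick check is to list which of these drivers carry a recent Created Date. Here is a small Python sketch using the names and dates from the list above. What "Created Date" actually records (install time versus build time) is an assumption, and `recent_drivers` is a made-up helper name.

```python
from datetime import datetime

# (driver file, Created Date) pairs copied from the driver list above.
DRIVERS = [
    ("e2f68.sys", "1/20/2021 0:48"),
    ("gna.sys", "2/25/2022 16:36"),
    ("iaLPSS2_GPIO2_TGL.sys", "7/19/2021 23:06"),
    ("iaLPSS2_I2C_TGL.sys", "7/19/2021 23:06"),
    ("ibtusb.sys", "1/22/2022 13:24"),
    ("igdkmdn64.sys", "12/10/2021 11:28"),
    ("Netwtw10.sys", "1/24/2021 4:57"),
    ("PerformanceDriver.sys", "7/29/2021 18:22"),
    ("TbtBusDrv.sys", "2/18/2022 16:53"),
    ("TeeDriverW10x64.sys", "8/19/2021 5:30"),
    ("IntcAudioBus.sys", "2/26/2022 9:08"),
    ("IntcOED.sys", "2/26/2022 9:08"),
    ("IntcUSB.sys", "2/26/2022 9:08"),
    ("RtsPer.sys", "5/5/2021 20:10"),
    ("RTKVHD64.sys", "2/26/2022 9:08"),
]


def recent_drivers(drivers, cutoff):
    """Return driver names created on or after cutoff, oldest first."""
    dated = [
        (datetime.strptime(created, "%m/%d/%Y %H:%M"), name)
        for name, created in drivers
    ]
    # Sort on the timestamp only, so ties keep their original list order.
    dated.sort(key=lambda pair: pair[0])
    return [name for created, name in dated if created >= cutoff]
```

Calling `recent_drivers(DRIVERS, datetime(2022, 2, 1))` narrows the list to the drivers touched in February 2022, which are the ones worth rolling back first if a driver were the suspect.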

 

 

n_scott_pearson
Super User Retired Employee
2,014 Views

I agree with Leon; this looks like a memory failure/compatibility issue.

The weirdness you saw with the RealTek driver is perfectly normal. The installer itself uninstalls the old driver, reboots, installs the new driver and then reboots again. Having the driver show up in IDSA again is simply a bug in IDSA, not an indication of a problem or any weirdness (you can hide/ignore the driver installation notification in IDSA).

...S

rkiv
Novice
1,974 Views

Welp, as annoyed as I am, I think you're both right.  I ran a whole day on singles of those 32G sticks, swapped part way - they both seem fine as single sodimms.   I begrudgingly ordered 2x32 3200 (Samsung) yesterday which showed up today (praise be to Amazon one-day delivery) and have been up and running for about an hour.   Someone knock on wood.

 

What kind of compat issue would be at play here? (Now I'm just curious.) I thought Crucial (Micron) was supposed to be top-end stuff? One thing I should note is that this all seemingly only started after the BIOS update to v42. Is it possible it's not the RAM directly but some compat issue created by that update?

n_scott_pearson
Super User Retired Employee
1,965 Views

Well, it could certainly be an incompatibility between this particular memory and the version of the Memory Reference Code (MRC, which handles memory initialization) included in the BIOS. I almost exclusively use Crucial memory myself, so this concerns me as well. I will check with the NUC development team (which I was a member of before I retired) and see if they have any thoughts on this issue.

...S

LeonWaksman
Super User
1,942 Views

FYI, the Crucial DDR4-3200 (CT2K32G4SFD832A) appears on the compatible memory list for the NUC11PAHi7. So there could be a problem with the specific SODIMM.

 

Leon

n_scott_pearson
Super User Retired Employee
1,928 Views

I agree. I looked at the L/L Specs for the CT2K32G4SFD832A and they would appear to be compliant with the requirements laid out for the PA NUCs. This leaves one (or both) of these SODIMMs as being the culprit for this issue. 

...S

rkiv
Novice
1,914 Views

What could I do to debug further? I am unblocked using the Samsung sticks, but as I mentioned above, both Crucial sticks passed all the memory testing I've been able to run: installed as a pair in a NUC10, and one at a time in the NUC11. The machine halts when I try to memtest them both installed in the NUC11. Or perhaps I just cut bait and send them back to Newegg.

n_scott_pearson
Super User Retired Employee
1,911 Views

I would simply send them back and get them replaced.

...S

Steven_Intel
Moderator
1,815 Views

Hello rkiv,


Thank you for posting on the Intel® communities.


Based on the recommendations, we would like to know if you need further assistance.


I look forward to hearing from you.


Steven G.

Intel Customer Support Technician.


Steven_Intel
Moderator
1,773 Views

Since we have not heard back from you, we will close this thread. If you need any additional information, please submit a new question, as this thread will no longer be monitored.


Regards,


Steven G.

Intel Customer Support Technician.



rkiv
Novice
1,758 Views

I only just yesterday received the replacement RAM from Newegg and will test with the fresh 2x32 Crucial sticks soon.

gcache
Novice
1,755 Views

Which sticks did you purchase?

rkiv
Novice
1,746 Views

The ones that I'm using currently (and working) are Samsung M471A4G43AB1-CWE. The Crucial sticks are just an RMA swap of the ones I had that weren't working, CT2K32G4SFD832A.

gcache
Novice
1,739 Views

Got it, I might have to set up a return/RMA for my Crucial sticks. I am having the exact same issue with the same machine as you and the same exact sticks.

rkiv
Novice
1,735 Views

Let me try to test my new ones today. If they also don't work, then I'd say that's pretty suspect.

rkiv
Novice
1,653 Views

I just tested the new Crucial sticks and they fail as well - machine wouldn't boot and eventually failed with the "detect multiple post attempts" error again.  Either this is extreme bad luck or there is some sort of incompatibility.

 

edit:

@Steven_Intel @n_scott_pearson @LeonWaksman 

drawing your attention to my findings as FYI and because you seemed like experts in-the-know

rkiv
Novice
1,306 Views

FYI, I installed the new BIOS v43 yesterday and swapped in and re-tested the Crucial sticks, and they are working so far. I just went to grab the download link from the NUC11PAH download page and I no longer see v43 listed. I have no idea what that means, but FWIW, it does seem to be a compat issue, probably with the MRC (which was rev'd), and not a problem with the physical memory after all.

 

I do see it listed here, however v42 is advertised as the latest, not v43:

https://www.intel.com/content/www/us/en/download/19694/726361/bios-update-patgl357.html
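When comparing what the download page advertises with what is actually installed, it helps to split the BIOS version string into its parts. The Python sketch below infers the layout from this thread only: the string PATGL357.0042.2021.1213.1702 was reported alongside a date of 12/13/2021 and is referred to as v42. The meaning of the last field is an assumption, so it is kept as a raw string, and the `BiosVersion` and `parse_bios_version` names are invented for illustration.

```python
from typing import NamedTuple


class BiosVersion(NamedTuple):
    board_id: str   # e.g. "PATGL357"
    release: int    # e.g. 42 for ".0042."
    year: int
    month: int
    day: int
    last_field: str  # meaning not confirmed in the thread; kept as-is


def parse_bios_version(text):
    """Split a dotted BIOS version string into its five fields."""
    parts = text.strip().split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(parts)}")
    board_id, release, year, mmdd, last_field = parts
    if len(mmdd) != 4:
        raise ValueError("expected a 4-digit month/day field")
    return BiosVersion(
        board_id, int(release), int(year), int(mmdd[:2]), int(mmdd[2:]), last_field
    )
```

With this, `parse_bios_version("PATGL357.0042.2021.1213.1702").release` gives 42, so a freshly flashed v43 should report 43 in the same position.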

 

gcache
Novice
1,771 Views

I believe I might be having the same issue here on a different NUC with 2x32 Crucial memory. It looks like the resolution is to purchase Samsung sticks to replace the Crucial ones? I can also install Windows to try and run a memtest.
