I'm recreating this thread here.
Basically, a NUC10i7FNH1 (brand new) is shutting off and throwing the over-temperature warning, but it's not actually overheating as far as I can tell, and this happens randomly (often overnight when the system is idle). The fan appears to work fine and does remove heat from the system, temperatures are normal, BIOS is default but with the "Cool" fan profile, Windows 10 is a fresh install, and all drivers are installed and up to date. Actual stress testing with Prime95 shows no issues.
We've currently set up another NUC10i7FNH1 to test with, and I'm logging temperatures and waiting to see if this one will shut off and show the over-temperature warning.
I'm also having a terrible time trying to find the latest BIOS for these NUCs. 0039 was the latest, then 0041 was posted, then it was pulled, then reposted, then 0043 was posted, then 0043 was pulled. 0039 is now the "latest" again, and even the patch notes for 0041 and 0043 have been pulled down. No indication as to what's going on here with these BIOS versions.
We have a new NUC10i7 (FNH1). The user reports that it frequently shuts off, and upon rebooting displays the following message:
WARNING: System has recovered from an over-temperature condition. Please ensure proper airflow before continuing.
Press the Enter key to continue
We have been able to replicate this issue. Sometimes it occurs when the system is under load, and sometimes it occurs when the system is idle. The system is running Windows 10 1909. The system has the latest drivers and the March BIOS (0039).
BIOS defaults have been loaded, and the system cooling policy set to "Cool".
The system has been reformatted and had a clean install of Windows 10.
The system has undergone stress testing with Prime95 loading all cores (with a blended CPU/memory workload) as well as AS-SSD running an extended benchmark on the SSD. During stress testing, we were unable to get the system temperature to rise enough to trigger the thermal shutdown event.
There is an updated BIOS (0041) that was posted about a week ago. (It is dated May, but it was not available online 2 weeks ago and only appeared last week.) We have not tried this BIOS, as the change log does not mention anything related to power or thermals, and multiple users here have reported that it left their NUC unable to boot at all.
We ran temperature logging via HWInfo, and there is nothing unusual around the time the machine last shut off. Throughout several days of logging, we have observed the following:
The system generally remains at the same power usage when idle, and there is a small rise in power usage when the system is in use. Occasionally, one core will turbo briefly, to around 92 degrees, then back off. I believe this is the expected turbo behavior of this CPU. This happens many times in the log without issue.
My only guess is that sometimes the turbo behavior goes too far, or the sensor misreads the data, and we get a spike of 100 degrees, which appears to be the threshold for triggering thermal shutoff. However, I have not been able to log such a spike (I assume that I would be unable to because the system would shut off instantly).
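Since a spike that triggers the shutdown may never make it into the log, one way to narrow this down is to look for gaps in the log timestamps and inspect the last reading before each gap. Below is a minimal sketch of that idea, assuming the log has already been reduced to (timestamp, max core temperature) pairs. The 2-second polling interval, the function name, and the sample values are all made up for illustration and are not taken from the actual HWInfo export.

```python
from datetime import datetime, timedelta

def find_gaps(samples, expected_interval_s=2.0, tolerance=5.0):
    """Return (last_time, last_temp, next_time) for each gap in the log.

    samples: list of (datetime, temp_c) tuples sorted by time.
    A gap is any pause longer than expected_interval_s * tolerance,
    which is how an abrupt shutdown would appear in a periodic log.
    """
    limit = timedelta(seconds=expected_interval_s * tolerance)
    gaps = []
    for (t0, temp0), (t1, _) in zip(samples, samples[1:]):
        if t1 - t0 > limit:
            gaps.append((t0, temp0, t1))
    return gaps

# Hypothetical data: steady idle readings, one hot sample, then a long gap.
start = datetime(2020, 6, 1, 23, 0, 0)
log = [(start + timedelta(seconds=2 * i), 45.0) for i in range(5)]
log.append((start + timedelta(seconds=10), 92.0))  # last sample before the gap
log.append((start + timedelta(hours=9), 38.0))     # first sample after reboot

for last_time, last_temp, resumed in find_gaps(log):
    print(f"log stopped at {last_time} ({last_temp} C), resumed at {resumed}")
```

If the last reading before every gap is unremarkable, that would support the idea that the spike happens faster than the polling interval can capture.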
Other things to note:
When system power usage rises for an extended period (when the user is doing something), we see a corresponding rise in CPU core temps as well as fan exhaust temp. This indicates to me that the cooling system is working.
Average / overall temperatures are fine (40-50 degrees).
SSD temperatures don't get very high either.
The user isn't doing anything intensive. They basically run Chrome, Word, and Outlook.
The system seems to shut off randomly. We've seen it happen while it was under load (processing Windows Updates), and we've seen it happen when the system was idle overnight.
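The observation that core temperatures track power usage can also be checked numerically rather than by eye. Here is a minimal sketch, assuming two equal-length columns (package power and core temperature) have been pulled out of the log. The helper name and the sample values are invented for illustration and are not actual readings from this unit.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical samples taken at the same moments.
power_w = [8.0, 9.0, 15.0, 22.0, 25.0, 12.0]
core_temp_c = [42.0, 43.0, 55.0, 66.0, 70.0, 50.0]

r = pearson(power_w, core_temp_c)
print(f"correlation between power and temperature: {r:.2f}")
```

A value close to 1 would be consistent with a cooling system that responds to load as expected. It says nothing about brief spikes that fall between samples.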
Does anyone have any idea what could be causing this? I'll note again that this has happened on a fresh install of Windows 10 1909 with all updates and drivers.
We have 2 other NUC10i7FNH1 units that we have not yet deployed. We will be testing these extensively.
However, if this is a common occurrence (we've seen similar behavior, and outright bricking, with NUC7s and more so with NUC8s), we'll probably need to shift our 100+ systems away from NUCs and onto something more reliable.
I'll run the Intel SSU when I get a chance, though because everyone is working remotely and the user is using it, it may take a while before I get an opportunity to do so.
This is the only NUC10 unit that has this behavior so far. We bought 3 NUC10i7FNH units and deployed one to replace an older NUC that was failing.
The other 2 NUC10 units have not been deployed yet. I have one set up now for testing.
I can get pictures of the BIOS screen and temperatures the next time I have physical access, but due to the lockdown this may not be for a while. I can tell you that the temperatures are not particularly high.
The configuration for the NUC10 that is overheating is: Default BIOS settings except for the fan profile set to "Cool", 8 GB of RAM (I believe) and Windows 10 freshly installed on an SSD (Samsung 970 Evo I think).
The test NUC has BIOS defaults and a fresh Windows 10 installation on an SSD (probably a Samsung 970 Evo or similar). The test NUC is running BIOS 0041, I believe. The failing one was running 0039 and was updated to 0043, I believe, on Friday. See below for further details on that mess.
On Friday, I had a chance to get on to the failing unit briefly and reverify that all drivers were up to date (using the Intel driver utility) and update the BIOS to 0043, I believe.
Previously, 0041 was listed as the latest version, but I held off on updating to it because I had heard reports here of users having issues with it causing Windows to not boot.
On Friday, I checked and saw 0043 was recently posted, so I installed 0043.
However, I now see that both 0041 and 0043 have been pulled and 0039 is again listed as the latest!
What happened to 0041, which was dated May but released (or re-released) in June?
What happened to 0043, which was dated June and released on 6/15 or shortly after?
Your own link to 0041, which was the latest version at the time of your post, now redirects back to the 0039 version from March. https://downloadcenter.intel.com/download/29631/BIOS-Update-FNCML357-?v=t
This makes it almost impossible to determine what version of the BIOS I should be running, and to properly test. Even the patch notes for 0041 have been pulled from the Intel site. https://downloadmirror.intel.com/29631/eng/FN_0041_ReleaseNotes.pdf
I don't have a link to the 0043 patch notes, but I assume those have been pulled from the site as well.
I did read the patch notes for both 0041 and 0043 and neither made any mention of temperature issues, thermal shutdowns, cooling profiles, turbo profiles, etc.
Is there something wrong with these (0041, 0043) BIOS versions? Why have they been pulled?
In the meantime, I'll continue testing on the test unit I have set up. I'm basically running extended stress tests for multi-core and single-core workloads (using Prime95) and trying to get it to overheat and shut down. I'll also enable temperature logging via HWInfo and leave it idle over several days to see what, if anything, happens.
BIOSes can be pulled for any number of possible reasons: causing bricking, causing abnormal operation, security issues, etc. Intel does not normally detail why a particular BIOS has been pulled.
Whatever the reasons for pulling them down, I have 3 NUC10s running 3 different BIOS versions - 0039 (shipping BIOS), 0041, and 0043. Is my only option to check the download site daily and hope they post a new BIOS eventually and hope it isn't deemed unfit and pulled a few days later?
Are you sure it was 42 and not 41 or 43? I never saw a 42 posted. Do you happen to recall what the release notes for that version were?
It sure would be nice if Intel could be at least a little transparent about what's going on with their new NUC line and the BIOS. We have well over a hundred NUCs deployed across various offices, and we're set to replace many of the older (NUC7i7 and 7i5) units.
Yes, 42; see attachment.
I guess that if there were a real problem with the withdrawn versions, Intel would mention it.
I don't know whether to read too much into it, but it is weird, and in the 5-6 years I have been using NUCs I have never seen anything this strange.
So I would recommend getting in touch with support, since you are a business user, and asking whether they suggest recovering back to the current version.
Similar situation with my NUC10i7FNB. I have not changed a single BIOS setting, and I regularly leave it on when I finish for the day. When I start it the next day, I consistently get the temperature warning on startup after it has shut down overnight. This is a brand new system, 1 month old, and it has been doing this since day one. BIOS updates have not fixed it. I am running the Intel upgrade software to keep updates coming.
See the attached spreadsheet from running Core Temp for the last few days. I left the computer around 8pm. At 23:25, temperatures spiked with core load at 0, and the system appears to have shut down. At 08:38 the next day, I pressed the power button and got the temperature warning (see expanded Core Temp report attached).
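For anyone wanting to scan a similar log for this pattern, here is a minimal sketch that flags samples where the temperature is high while the load is near zero. It assumes the log has been loaded into a list of rows with time, temperature, and load fields. The field names, the 90-degree and 5% thresholds, and the sample values are assumptions for illustration only; they are not the actual Core Temp column names or the readings in the attachment.

```python
def hot_while_idle(rows, temp_limit_c=90.0, load_limit_pct=5.0):
    """Return rows where the core is hot even though it is nearly idle."""
    return [
        row for row in rows
        if row["temp_c"] >= temp_limit_c and row["load_pct"] <= load_limit_pct
    ]

# Hypothetical rows: one hot-under-load sample and one hot-while-idle sample.
rows = [
    {"time": "14:00", "temp_c": 92.0, "load_pct": 80.0},  # hot, but busy
    {"time": "23:15", "temp_c": 41.0, "load_pct": 0.0},
    {"time": "23:20", "temp_c": 42.0, "load_pct": 1.0},
    {"time": "23:25", "temp_c": 97.0, "load_pct": 0.0},   # hot with no load
]

for row in hot_while_idle(rows):
    print(f'{row["time"]}: {row["temp_c"]} C at {row["load_pct"]}% load')
```

Separating hot-under-load readings from hot-while-idle readings helps distinguish normal turbo behavior from a reading that has no workload to explain it.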
First of all, no two people's problems are exactly the same. There are plenty of differences between your two machines and their build (especially software load, order of installation, etc.). Secondly, separate reports, even just two instead of one, will have a higher impact and thus a higher priority. It's a statistics game!
We've been able to physically swap the affected unit out with another unit and do more testing.
I can reliably recreate the shutdown and over-temperature message within 30 seconds. I have a support case open regarding this, and am waiting for Intel to test on another unit.
If I run a CPU stress test that runs across all cores then the system is fine. It heats up, the fan spins up, and it just stays running until I stop the test. If I run a CPU stress test that runs across one physical core, then the system will shut down within 30 seconds and display the over-temperature message.
Another identical system I tested with did not have this issue.
The specific test I'm using is Prime95. I run the Small FFTs test across 12 threads when stress testing all cores. I run the Small FFTs test across 2 threads when stress testing 1 core. My thinking here is that there's an issue related to the single core turbo behavior. When all cores are working, the power budget is spread across the whole CPU and no individual core gets too hot (they seem to peak at around 80 degrees on this particular unit). When just 1 core is working and isn't constrained by a power limitation, that core will spike to almost 100 degrees.
I believe 100 degrees is the thermal cutoff point. My guess is either some temperature sensor is faulty, the BIOS/firmware that controls the turbo behavior is too aggressive, or some mix of the two. I've already verified that the RAM, SSD, and Windows installation are good.
I've tested 2 NUC10i7FNH units, and 1 has this problem, 1 is fine. We have a third unit, but we haven't tested it yet.