
Intel i7-12700K Freezes/BSODs with WHEA_UNCORRECTABLE_ERROR Code

temurson
Novice
5,418 Views

Hello!

I'm facing freezes/BSODs with the WHEA_UNCORRECTABLE_ERROR code. Windows Event Viewer and the crash dumps point to a processor "Cache Hierarchy Error". Any help would be greatly appreciated!

A little background:

I built this system (see specs at the bottom) in May 2022 with an "ASRock Z690 PG Riptide Motherboard" and "G.SKILL 2x16GB DDR4 3200 F4-3200C16D-32GVK" memory. It suffered intermittent MEMORY_MANAGEMENT BSODs and occasional "GPU device not found" errors, both while gaming and during general use. The errors were sporadic, though they seemed to occur after the PC had been in use for a while. I tried to resolve this but failed, and blamed it on a bad socket connection for the RAM/GPU that only manifested after thermal expansion.

Not the wisest solution, but I decided to throw money at the problem and upgrade to DDR5. I got a new motherboard, "Gigabyte Z790 AORUS ELITE AX (rev. 1.0)", and new RAM, "G.SKILL 2x16GB DDR5 6400 D F5-6400J3239G16GX2-TZ5S". I am once again experiencing issues, but different ones: now the system either freezes and needs a hard restart, or sometimes I get a WHEA_UNCORRECTABLE_ERROR BSOD. I conclude that the freezes and the BSODs are the same problem, since in both cases I see a WHEA-Logger event in the Event Viewer with the same message.

I am starting to suspect the CPU to be at fault, hence me posting all of this here.

What I tried so far:

  1. Processor was stress tested using Prime95 for 1 hour with no errors found.
  2. Disabled XMP profile.
  3. RAM was stress tested with XMP enabled using HCI MemTest overnight with no errors found. I was stress testing it while in Windows safe mode to minimize system memory usage, not sure if that matters.
  4. Chipset drivers from the motherboard manufacturer's website were re-installed, and the BIOS was upgraded to the latest version (from F2 to F3, see https://www.gigabyte.com/Motherboard/Z790-AORUS-ELITE-AX-rev-10/support#support-dl-bios).
  5. NVIDIA GPU drivers fully uninstalled (using DDU) and installed again.
  6. GPU re-seated, RAM re-seated, CPU re-seated and thermal paste re-applied.
  7. CPU cooler screws loosened per this video https://www.youtube.com/watch?v=oAau2PNjtM0.
  8. SSD and HDD drives tested using Western Digital's diagnostics software.
  9. Reinstalling Windows? When I changed the motherboard and RAM I wiped my SSD drive and did a clean Windows 10 install.
  10. Other attempts of despair, like:
    1. Running with only one monitor
    2. Uninstalling Discord (why not?)
    3. Disconnecting microphone

I am running out of ideas here, I don't want to build a new system from scratch.

 

Specs (Intel System Support Utility report is attached for more details):

OS Windows 10 Pro, Version 10.0.19045 Build 19045

Processor 12th Gen Intel(R) Core(TM) i7-12700K, 3600 Mhz, 12 Core(s), 20 Logical Processor(s)

Motherboard Gigabyte Z790 AORUS ELITE AX

RAM G.SKILL Trident Z5 Series 32GB (2 x 16GB) DDR5 6400 Desktop Memory Model F5-6400J3239G16GX2-TZ5S

Case Fractal Design Meshify C Black ATX

HDD WD Blue 4TB Desktop Hard Disk Drive - 5400 RPM SATA 6Gb/s 256MB Cache 3.5 Inch - WD40EZAZ - OEM

SSD Western Digital WD BLACK SN850 NVMe M.2 2280 2TB PCI-Express 4.0 x4 3D NAND Internal Solid State Drive (SSD) WDS200T1X0E

PSU Seasonic FOCUS GX-1000, 1000W 80+ Gold

GPU EVGA GeForce RTX 3080 FTW3 ULTRA GAMING Video Card, 10G-P5-3897-KL, 10GB GDDR6X, iCX3 Technology, ARGB LED, Metal Backplate, LHR

 

WHEA-Logger event:

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 76

The details view of this entry contains further information.

There are other events that are visible after a crash, can post them if it is helpful.
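For anyone comparing several of these entries, here is a minimal Python sketch that pulls the "Key: value" fields out of the event message quoted above, so that events from different crashes can be compared field by field. The function name `parse_whea_event` and the regular expression are illustrative assumptions, not part of any Windows or Intel tool, and the message layout is assumed to match the text pasted above.

```python
import re

# Text of the WHEA-Logger event, copied from the post above.
EVENT_TEXT = """\
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 76

The details view of this entry contains further information.
"""

# Matches "Key: value" lines; keys are letters and spaces only (an assumption
# based on the message above, not a documented format).
FIELD_RE = re.compile(r"^(?P<key>[A-Za-z][A-Za-z ]*):\s*(?P<value>.+)$")


def parse_whea_event(text):
    """Return the 'Key: value' fields of one WHEA-Logger event message."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


event = parse_whea_event(EVENT_TEXT)
for key, value in event.items():
    print(f"{key}: {value}")
```

Comparing the "Error Type" and "Processor APIC ID" fields across crashes is one way to check whether the freezes and the BSODs really log the same error.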

 

Example minidump:

https://drive.google.com/file/d/1UPcLNfxC_S63jJ9qAhGSxSkeUGhKj5L4

11 Replies
Alberto_R_Intel
Employee
5,388 Views

temurson, Thank you for posting in the Intel® Communities Support.


For this scenario, it is important to mention that third-party tools might not be that accurate in the results they show when testing an Intel® processor.


To rule out a possible hardware problem with the processor, please install and run the Intel® Processor Diagnostic Tool. It runs an overall test on the unit, and if the unit passes, it means it is working properly:

https://www.intel.com/content/www/us/en/download/15951/intel-processor-diagnostic-tool.html?wapkw=intel%20processor%20diagnostic%20tool


It is also essential to keep in mind that the memory controller is located on the processor, so it is the processor that determines which type of RAM can be used. The Intel® Core™ i7-12700K supports up to DDR5 4800 MT/s or up to DDR4 3200 MT/s:

https://ark.intel.com/content/www/us/en/ark/products/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz.html


Based on the information shown in the SSU document, the memory RAM you are using is:

Configured Clock Speed: "6400 MHz"

Configured Voltage: "1400 millivolts"

Part Number: "F5-6400J3239G16G"


6400 MHz is higher than what the processor supports (DDR5 4800 MT/s or DDR4 3200 MT/s), and that may very well be the reason you are seeing the issue, since you are forcing the processor to run at speeds it does not support. It might work for a while, but sooner or later you might experience system instability and performance degradation. Based on that, we recommend replacing the RAM with a model that is within specifications.
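As a rough illustration of the margin involved, the short Python sketch below compares a configured memory speed against the maximums quoted above. The dictionary `SUPPORTED_MAX_MTS` and the function `overshoot_percent` are names made up for this example; the only inputs are the figures already mentioned in this thread (6400 for the DDR5 kit, 3200 for the earlier DDR4 kit).

```python
# Maximum memory speeds quoted above for the i7-12700K, in MT/s.
SUPPORTED_MAX_MTS = {"DDR5": 4800, "DDR4": 3200}


def overshoot_percent(memory_type, configured_mts):
    """Percent by which the configured speed exceeds the quoted maximum.

    Returns 0.0 when the configured speed is at or below the maximum.
    """
    limit = SUPPORTED_MAX_MTS[memory_type]
    return max(0.0, (configured_mts - limit) / limit * 100)


# DDR5 kit as reported in the SSU document (XMP enabled).
print(f"DDR5 at 6400: {overshoot_percent('DDR5', 6400):.1f}% over")

# The DDR4 kit used in the earlier build, for comparison.
print(f"DDR4 at 3200: {overshoot_percent('DDR4', 3200):.1f}% over")
```

With these numbers, the XMP setting runs the memory about one third faster than the quoted DDR5 maximum, while the earlier DDR4 kit sat exactly at its limit.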


In the following links, you will find "How to Resolve Blue Screen Error with WHEA_UNCORRECTABLE_ERROR in Windows" and "Blue Screen Error (BSOD) While Using Intel® Processors" with additional details on this matter:

https://www.intel.com/content/www/us/en/support/articles/000028099/processors/intel-core-processors.html

https://www.intel.com/content/www/us/en/support/articles/000025090/processors.html


We also recommend contacting Gigabyte Support directly to make sure the latest BIOS version is currently installed on your device, or to get instructions on how to update it:

https://www.gigabyte.com/Support


Any questions, please let me know.


Regards,

Albert R.


Intel Customer Support Technician



temurson
Novice
5,378 Views

Thank you for your insight, Albert! I completely missed that max supported speed on i7 12700K is 4800 MT/s.

I have run the Intel Processor Diagnostic Tool, it did not find any issues with the processor. I have also done everything in the links you posted in one way or another.

I am still getting freezes (just got one when browsing, pretty much idle) with XMP disabled. When XMP is disabled, my RAM speed is 4800 MT/s. I am attaching the SSU report with XMP disabled. Does this not qualify as operating within the supported CPU speeds? I do not have other RAM to test this, and 4800 MT/s is the lowest I see on Newegg for DDR5, so I don't see a point to try and get a different one to test.

I am not sure that this is the processor, but after each freeze the Event Viewer shows WHEA-Logger event with the same Cache Hierarchy Error.

Also, I double checked that I am running the latest BIOS version on my motherboard (https://www.gigabyte.com/Motherboard/Z790-AORUS-ELITE-AX-rev-10/support#support-dl-bios).

Alberto_R_Intel
Employee
5,355 Views

temurson, Thank you very much for sharing those results.

 

We are sorry to hear the problem still persists after trying the suggestions provided previously.

 

Based on the fact that the processor passed the Intel® PDT test, we can basically rule out any possible hardware issue with it.

 

Another troubleshooting step to try is to test the computer with just one memory stick at a time, to rule out a hardware issue with one of the memory sticks or with a memory slot on the board.
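The one-stick-at-a-time test can be laid out as a small matrix of runs. The Python sketch below is only an illustration under stated assumptions: the slot labels are hypothetical (use the names printed on the board), the helper names are made up for this example, and the crash outcomes in `example` are invented purely to show how the results would be read, not data from this system.

```python
from itertools import product

STICKS = ["stick 1", "stick 2"]  # the two modules of the 2 x 16GB kit
SLOTS = ["slot A", "slot B"]     # hypothetical labels, not the board's real names


def single_stick_runs(sticks, slots):
    """Every one-stick-in-one-slot configuration, to be tested one at a time."""
    return list(product(sticks, slots))


def always_crashing(results, position):
    """Sticks (position=0) or slots (position=1) whose every run crashed.

    `results` maps (stick, slot) to True if that run crashed.
    """
    items = {config[position] for config in results}
    return sorted(
        item
        for item in items
        if all(crashed for config, crashed in results.items()
               if config[position] == item)
    )


runs = single_stick_runs(STICKS, SLOTS)
for stick, slot in runs:
    print(f"Test {stick} in {slot}")

# Invented outcome for illustration only: runs with "stick 2" crashed.
example = {config: config[0] == "stick 2" for config in runs}
print("Suspect sticks:", always_crashing(example, 0))
print("Suspect slots:", always_crashing(example, 1))
```

If every run crashes regardless of stick and slot, neither list isolates a single part, which points away from the memory and toward another component.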

 

Just to confirm, do you have the option to test your processor on a different board or test your board with a different processor as described in the link below?

https://www.intel.com/content/www/us/en/support/articles/000057810/processors.html

 

Still, the unit has a 3-year warranty. If you do not have any options for further testing, you can either contact the place of purchase and check their warranty policy (most of the time they have a 30-day policy) or contact Intel® Support through any of the support channels below to claim the warranty on the unit. Keep in mind that if the issue remains once you receive the replacement unit, the problem is related to a different component.

 

Chat support:

http://intelsupportchat.force.com/icslivechat/ics_tech_processor_ww_english_Chat

 

For phone support, depending on your location, you will see the contact information on the links below:

EMEA contact information: https://www.intel.com/content/www/us/en/support/contact-support/emea-contact.html

APAC contact information: https://www.intel.com/content/www/us/en/support/contact-support/apac-contact.html

LAR contact information: https://www.intel.la/content/www/xl/es/support/contact-support/lar-contact.html

North America: Phone Number 1-916-377-7000, Monday – Friday 7:00 AM to 5:00 PM (Pacific Time).

 

Regards,

Albert R.

 

Intel Customer Support Technician

 

temurson
Novice
5,339 Views

Thanks for your response, Albert.

Testing with 1 stick of RAM is the only thing I have not tried yet. I will do that and report back.

As outlined in the original post, this system was initially built with an ASRock Z690 PG Riptide Motherboard and G.SKILL 2x16GB DDR4 3200 F4-3200C16D-32GVK memory. With that configuration and everything else being the same, the system was not stable, experiencing intermittent MEMORY_MANAGEMENT BSODs and occasional crashes in games due to "GPU not detected". So to answer your question on whether I can test this CPU with another board: yes, it has been tested, but it was not stable. The errors seem different, but I'm not sure how different, or whether their root cause is the same as for the WHEA errors I'm experiencing now.

I have made an Intel support request at the same moment as I created this post. The support staff suggested the same steps you were suggesting, and they are now offering to replace the CPU by warranty.

To be frank, I don't know how to proceed. As a customer, getting a new CPU by warranty is obviously beneficial, but I have low hopes that it will actually resolve my issues. What would you recommend? Should I try contacting the motherboard manufacturer? So far, all the manuals and support docs for my motherboard have been rather clear that all my components are supported.

Alberto_R_Intel
Employee
5,314 Views

Hi temurson, thank you very much for sharing those details.


At this point, since, as you mentioned, the processor was already tested on a different board and the problem seems to follow the unit, we think a replacement is the next step, especially if you test the RAM sticks separately and the problem remains.


Replacing the unit will at least tell you for sure whether the problem is the processor or another component, since it is very unlikely to receive two defective units in a row.


Contacting the board manufacturer to report this scenario is always a good idea. They might have additional details on this matter, such as reports of this configuration not working properly, a BIOS update that fixes it, or a different solution. They might even be able to test your board to rule out any problems with it. We always suggest trying that option as well, considering that you have already tried almost all of the troubleshooting steps we recommend for this scenario.


Regards,

Albert R.


Intel Customer Support Technician



temurson
Novice
5,248 Views

Thanks for the advice Albert!

I ended up replacing the CPU through Intel warranty. Surprisingly, the new CPU fixed the issue! I installed the new CPU on Feb 11 in the morning, used my PC normally for 2 days with no crashes. Then I enabled XMP (6400 MT/s) and have not seen any crashes yet, it's been 3 days.

Before replacing the CPU I actually tried running with only 1 stick of RAM with XMP disabled (16 GB, 4800 MT/s). This configuration visibly reduced the number of crashes, but they would still happen from time to time.

I'm rather surprised that the CPU was the issue as my instincts were telling me it's some kind of incompatibility issue with my motherboard/RAM/CPU combo.

Thank you for your advice and help through this!

Alberto_R_Intel
Employee
5,239 Views

temurson, You are very welcome, thank you very much for letting us know those updates.


It is great to hear that the problem was fixed by replacing the processor and that the computer is now working properly. Even though it is very improbable for a processor to be defective, as you can see, it sometimes happens. We are just glad that the issue got resolved.


Any other inquiries, do not hesitate to contact us again.


Regards,

Albert R.


Intel Customer Support Technician


KrissyG
New Contributor II
5,234 Views

oh but there is one part you did not test yet, the power supply.
There is a slight chance, it provides sloppy/unstable power to the components, which i once had with a modular 1200W power supply from OCZ.

After that experience i never used anything but Corsair power supplies.
So if you have a different power supply, maybe something that is close to 60~80% of max TDP of your system, then i would try that, as well undoing all cables and redoing them, maybe there is a wacky connection.

Edit*

did not see that the problem was solved.

temurson
Novice
5,232 Views

Hey KrissyG,

I have tried reseating the power supply cables, and it didn't fix my issues. I have a modular 1000W power supply from Seasonic. I don't have a different power supply to test.

As I said in my previous comment, using a different CPU of the same exact model fixed the issue, so I'm rather confident that the CPU was the culprit.

KrissyG
New Contributor II
5,226 Views

yea i saw the rest of the comments after i posted mine, and that this here got solved,
idk how i did that, but for some reason i did not see the rest of the thread here.

temurson
Novice
5,220 Views

No worries! Thanks for trying to help, most people would just read the post and move on, so I appreciate your insight!
