
Persistent GCACHEL2_ERR_ERR issues under CPU load

stefanpetronijevic
3,146 Views

Hi,

I've been getting WHEA_UNCORRECTABLE_ERROR (124) blue screens for a few months now. They seemed random at first, but the issue keeps recurring and has cost me a lot of lost work, progress, time and nerves, so I decided to troubleshoot it in depth in an attempt to resolve it for good.

Attached are the Intel System Support Utility report and the BSOD log from WinDbg x64.

The system was never overclocked, and it has only ever run at stock voltages and frequencies since day one.

Also, I do not use any PSU cable extenders. The PSU is a Seasonic Focus Gold 750W, purchased new at the same time as the rest of the configuration, so it's relatively new, and I have never had any issues with it: there are no voltage drops under load, and it runs perfectly stable.

During the diagnostics process, I noticed two scenarios that always reproduce the BSOD with the same stack trace in WinDbg (the one attached).

The first is the blend test in Prime95: as soon as the test starts, the system crashes with the WHEA_UNCORRECTABLE_ERROR message.

The second is the Intel Processor Diagnostic Tool: all tests complete successfully until the CPULoad test starts, which causes the same BSOD immediately, every time.

I also tested this with a WinPE live disk and an Ubuntu live distro (Prime95 only) to rule out the problem being OS related.

The temperatures during the tests never exceeded 60 °C, despite the CPU running at the maximum allowed turbo boost frequency, and the voltage was always stable.

I've hit a dead end in troubleshooting, and I'm left with the presumption that the root cause is the CPU cache. This is suggested by the GCACHEL2_ERR_ERR (Proc 0 Bank  error, and by the fact that it's always triggered by the synthetic tests mentioned above, which probably exercise the cache heavily. This is just an educated guess, though.
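For readers trying to interpret that mnemonic: it follows the compound "cache hierarchy error" encoding used in the low 16 bits of the machine-check status register (bit pattern 000F 0001 RRRR TTLL, as described in Intel's Software Developer's Manual). The sketch below decodes such a code into its mnemonic. The raw status value from the attached dump is not quoted in this thread, so the value 0x010A used in the usage note is derived from the mnemonic itself and is an assumption, not something read from the log.

```python
from typing import Optional

# Field tables for the compound "cache hierarchy error" MCA code:
#   bits 15:8 = 000F 0001, bits 7:4 = RRRR, bits 3:2 = TT, bits 1:0 = LL
TT = {0b00: "I", 0b01: "D", 0b10: "G"}  # instruction / data / generic
LL = {0b00: "L0", 0b01: "L1", 0b10: "L2", 0b11: "LG"}  # cache level
RRRR = {
    0b0000: "ERR", 0b0001: "RD", 0b0010: "WR", 0b0011: "DRD",
    0b0100: "DWR", 0b0101: "IRD", 0b0110: "PREFETCH",
    0b0111: "EVICT", 0b1000: "SNOOP",
}


def decode_cache_error(mca_code: int) -> Optional[str]:
    """Return a mnemonic such as 'GCACHEL2_ERR_ERR', or None if the
    code does not use the cache hierarchy encoding."""
    code = mca_code & 0xFFFF
    # Bit 12 (F) is a filtering flag, so mask it out before comparing.
    if (code >> 8) & 0xEF != 0x01:
        return None
    tt = TT.get((code >> 2) & 0b11)
    rrrr = RRRR.get((code >> 4) & 0b1111)
    if tt is None or rrrr is None:
        return None  # reserved field values
    ll = LL[code & 0b11]
    return f"{tt}CACHE{ll}_{rrrr}_ERR"
```

Calling `decode_cache_error(0x010A)` yields `GCACHEL2_ERR_ERR`: a generic (not specifically instruction or data) transaction, at the L2 level, with a generic request type. The mnemonic points at the cache hierarchy but does not by itself say why the error occurred.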

I therefore kindly ask for your opinion: are there any further troubleshooting steps I should take, and how can I resolve this problem?

I've been using Intel CPUs for 20+ years, but the architecture complexity has grown beyond my comprehension over the past few generations, and I am left without a clue what could be wrong with all the stuff being integrated into the CPUs nowadays, so your assistance would be very appreciated.

Best regards,
Stefan

14 Replies
AlHill
Super User
3,140 Views

I would try different memory.

 

Doc (not an Intel employee or contractor)
[Maybe Windows 12 will be better]

stefanpetronijevic
3,139 Views

This was also my initial idea, but a full MemTest64 run finished successfully, with 0 errors.

AlHill
Super User
3,138 Views

MemTest64 is not absolute.  I would still try different memory.

I would also check to make certain your thermal paste is good, processor fan is working, heat sink is clean, and case has adequate ventilation and fans.

And I would check for the latest BIOS.

 

stefanpetronijevic
3,099 Views

Yeah, the problem still persists with different RAM sticks.

The BIOS is already the latest version, F12.

stefanpetronijevic
3,079 Views

Hi again,

I managed to get the system stable, in the sense that there are no more BSODs: I ran a short blend test in Prime95 and also completed the Intel Processor Diagnostic Tool test successfully.

Attached are the test result and the HWMonitor log from the testing.

What did the trick was changing the vcore voltage in the BIOS from the default 1.200V to 1.300V. I probably need to do some fine-tuning, as I guess lower voltages will also work. Everything else is still on "optimized defaults", including the CPU frequency.
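As a rough aid for that fine-tuning, the sketch below lists the intermediate settings between the known-stable and the stock value, together with a first-order estimate of the extra dynamic power at each one. The 0.025 V step is a hypothetical value (the board's actual step size is not stated here), and the estimate uses the usual approximation that dynamic power scales with voltage squared at a fixed frequency, ignoring leakage.

```python
def vcore_candidates(stock=1.200, stable=1.300, step=0.025):
    """List settings from the known-stable value down toward stock.

    The stock value itself is excluded because it is known to crash.
    The step size is an assumption, not taken from the BIOS.
    """
    n = round((stable - stock) / step)
    return [round(stable - i * step, 3) for i in range(n)]


def relative_dynamic_power(v, v_ref=1.200):
    """Approximate dynamic power ratio at equal frequency (P ~ V^2)."""
    return (v / v_ref) ** 2


if __name__ == "__main__":
    for v in vcore_candidates():
        extra = (relative_dynamic_power(v) - 1.0) * 100
        print(f"{v:.3f} V  ->  about {extra:.0f}% more dynamic power than 1.200 V")
```

Each candidate would still need to be set manually in the BIOS and verified with the same Prime95 blend and CPULoad tests that reproduced the crash.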

I'd really appreciate some feedback from Intel staff: why does the increased voltage resolve the problem?

Is it a degraded processor that now requires a higher voltage, and should I RMA it? Or is it the motherboard, although the voltages seem correct both in the BIOS and in HWMonitor during the tests?

Best regards,
Stefan

DeividA_Intel
Employee
3,049 Views

Hello stefanpetronijevic, 

 

 

I would like to know if your issues persist. Please let me know if you need further support.

 

 

Best regards, 

Deivid A.  

Intel Customer Support Technician 

 

stefanpetronijevic
3,044 Views
Hi Deivid,

Thanks for the feedback!
The issue still persists, and the only workaround I was able to find is bumping the vcore voltage from the default 1.2V to 1.3V. Note that all other BIOS settings are on optimized defaults, so no overclock.
I would like to hear your opinion on what has caused this, and why the stock voltage is no longer enough for stable operation.

Best regards,
Stefan
DeividA_Intel
Employee
3,025 Views

Hello stefanpetronijevic, 



Thanks for your response. Before we continue further, I would like to confirm the following:


1. When did you notice the issue for the first time? After a system update?

2. Do you see any LED code from the motherboard when the BSOD appears?

3. Are you using the Intel® Extreme Tuning Utility (Intel® XTU) to change the voltage?

4. Is it possible to get the report from the Intel® Processor Diagnostic Tool when the voltage is at default (1.200V)?

5. Did you use the Intel® Extreme Memory Profile (Intel® XMP) before? If so, which profile/RAM speed did you use?



Best regards, 

Deivid A.  

Intel Customer Support Technician 


stefanpetronijevic
3,015 Views

Hi Deivid,

 

To answer your questions:

 

1. If I recall correctly, the issue first started a few months ago, but it rarely happened, so I did not pay much attention to it at first. As far as I am aware, it does not seem to be related to a system update. I noticed that it started happening more often after I upgraded from Photoshop 2019 to Photoshop 2023, which appears to be noticeably more CPU intensive. The problem became more frequent over time, and recently it has been happening randomly under higher loads, so often that any work was impossible.

2. No LED codes at the time of BSOD, just the regular startup codes, once the system reboots.

3. No, I have changed the voltage from the BIOS, I have never used Intel XTU.

4. The problem with running the Intel Processor Diagnostic Tool at the default 1.200 V is that the BSOD happens every time the CPULoad test starts, so the test never completes and no report is produced. The BSOD during the CPULoad test happens out of the blue (pun intended), and no "Fail" messages are shown before the crash. I was unable to find any relevant text logs with higher verbosity or, more importantly, persistent log entries that survive the crash and would show what went wrong just before the BSOD. If the diagnostic tool does have such logs, please provide me with the path to the log file and tell me whether a debug log level needs to be configured, and I will gladly reproduce the issue at 1.200 V and send you the relevant log output.

5. No, I have not used XMP, even though a profile exists: it only provides lower latencies, and the DRAM frequency remains the same 2666 MHz. I have therefore only used the default profile, for better stability. The frequency, as mentioned, is 2666 MHz in dual channel, so nothing fancy, and it matches the memory specification for this CPU.
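Regarding the missing logs in point 4: the diagnostic tool's own log location is not given in this thread. Separately from that tool, Windows records hardware error events under the Microsoft-Windows-WHEA-Logger provider in the System event log, and those entries may contain details about the machine check. The sketch below builds a `wevtutil` query for the most recent entries. It assumes a Windows system with `wevtutil` on the PATH; the helper names are made up for illustration.

```python
import subprocess

# Event provider that Windows uses for hardware error (WHEA) records.
PROVIDER = "Microsoft-Windows-WHEA-Logger"


def whea_query_command(count=5):
    """Build a wevtutil command that lists the newest WHEA events."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # filter by provider name
        f"/c:{count}",   # maximum number of events
        "/rd:true",      # newest first
        "/f:text",       # human-readable output
    ]


def read_whea_events(count=5):
    """Run the query and return its text output (Windows only)."""
    result = subprocess.run(
        whea_query_command(count),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Running `print(read_whea_events())` after rebooting from a crash should show whatever WHEA entries Windows managed to record, if any.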

 

Let me know if you need additional information from my side.

 

Best regards,
Stefan

DeividA_Intel
Employee
2,998 Views

Hello stefanpetronijevic, 



Thanks for the confirmation. In order to continue further, I would like you to try the following:


1. Verify with the motherboard manufacturer that all key hardware components are fully compatible, to confirm that the RAM, Storage device, and GPU (if applicable) have no issues.

2. Verify if it is possible to do a Windows Restore from a system restore point.

3. If the restore point did not work, please try to reinstall Windows.



Regards,  

Deivid A. 

Intel Customer Support Technician 


Jocelyn_Intel
Employee
2,950 Views

Hello, @stefanpetronijevic.   

 

We are checking this thread and would like to know if you were able to review our previous post. If you need further assistance, please do not hesitate to contact us again.

 

Best regards,  

Jocelyn M.   

Intel Customer Support Technician. 


stefanpetronijevic
2,938 Views

Hi,

 

I did as you requested and here are the results:

 

1. The components are fully compatible; there are no issues here, as expected, since all components are made by reputable brands and the same configuration ran fine from the start, without any problems.

 

RAM - HX426C16FB2K2/16

https://download.gigabyte.com/FileList/Memory/mb_memory_z390-aorus-pro_191108.pdf?v=6af115148535138831d5ac39e7ec0fdd

 

M.2 NVMe - 970 EVO Plus MZ-V7S500BW

https://download.gigabyte.com/FileList/Document/mb_m.2_support_210915.pdf?v=dbfe67225871fda1c2108097fd1525a7

 

The RAM and the NVMe drive were previously tested with Memtest and HDD Sentinel, respectively, without any problems.

 

2. I do not have system restore enabled, so I can't use it.

 

3. I have done a clean install of Windows on a separate drive, as I can't destroy the primary OS I am currently using. I made sure that all the latest drivers and OS updates were installed, reverted the vcore voltage to the default 1.200 V, and managed to reproduce the issue again. With the vcore voltage bumped to 1.300 V, everything works on the clean install too.

 

This was also expected, since the problem was previously confirmed on live distros and PE disk, as initially mentioned, to eliminate the possibility of this problem being OS related.

 

Best regards,
Stefan

DeividA_Intel
Employee
2,926 Views

Hello stefanpetronijevic, 


  

Thank you for the information provided.


  

I will proceed to check the issue internally and post back soon with more details. 


 

Best regards, 

Deivid A.  

Intel Customer Support Technician 


DeividA_Intel
Employee
2,907 Views

Hello stefanpetronijevic, 



Based on the investigation and the troubleshooting that you have performed, I recommend you get in contact with us directly to check the warranty options that we have for you.


Feel free to use any of the following methods to get in contact with us: 


1. Chat support:

2. For phone support, depending on your location, you will see the contact information on the links below:  


 

Please keep in mind that this thread will no longer be monitored by Intel.  


Regards,  

Deivid A.  

Intel Customer Support Technician  

