I've got a "whea uncorrectable error" 3 times in 24 hours. After the system collects info and reboots, the RST shows Volume 1 in rebuild status. Everything I've read says to replace one of the drives. In the RST screen, both drives have green check marks on them. How do I tell which drive went bad?
Is it possible that since both drives have green checks, the controller has gone bad?
Additional info: The first rebuild finished about midnight last night. The PC worked fine until about 3 pm today, when I got the WHEA uncorrectable error again and the system started rebuilding.
Windows 10, 64 bit, IRST version 126.96.36.1999, Asus Maximus VIII Hero Alpha motherboard, 2 WD Black 6TB drives
Thanks in advance for any ideas.
Based on your description, it seems the drives are fine, since you mentioned both show green checks; if one had gone bad, a red "X" should appear on it. In case this is related to the Intel® RST driver, please try updating your current version to the one mentioned below.
Intel® Rapid Storage Technology (Intel® RST) RAID Driver 188.8.131.520;
https://downloadcenter.intel.com/download/26361/Intel-Rapid-Storage-Technology-Intel-RST-RAID-Driver...
Please let me know your results.
Thank you for the link. The link had a newer version than what was available on the Asus website.
I've encountered the "whea uncorrectable error" two additional times but each time, the RAID volume was not affected. Not sure if it is coincidental but this is a huge improvement. I think this problem is highly likely to be caused by a hardware failure of some sort.
wdonn, you're welcome.
Yes, as you mentioned, it could be hardware related. Have you tried replacing the drivers?
Thank you for the response and suggestion.
I've looked at the 10 Windows mini-dumps and they all start with "WHEA_UNCORRECTABLE_ERROR (124) A fatal hardware error has occurred. Parameter 1 identifies the type of error". The error is always Machine Check Exception (which I think means the CPU is forcing a dump). There isn't any consistency in the process running at the time of the dump.
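For anyone reading the same dumps, here is a minimal Python sketch of how the bug check 0x124 parameters can be interpreted. The table of parameter 1 values follows Microsoft's bug check reference as I understand it. For a machine check exception, parameters 3 and 4 hold the high and low 32 bits of the MCi_STATUS register of the bank that reported the error. The function names and the sample values in the usage note are hypothetical, not taken from the dumps in this thread.

```python
# Bug check 0x124 (WHEA_UNCORRECTABLE_ERROR), parameter 1: the error source.
# Values as listed in Microsoft's bug check reference (Itanium-only codes omitted).
WHEA_ERROR_SOURCES = {
    0x0: "Machine check exception",
    0x1: "Corrected machine check exception",
    0x2: "Corrected platform error",
    0x3: "Nonmaskable interrupt (NMI)",
    0x4: "Uncorrectable PCI Express error",
    0x5: "Generic hardware error",
    0x6: "Initialization error",
    0x7: "BOOT error",
    0x8: "Scalable Coherent Interface (SCI) generic error",
}


def describe_error_source(param1):
    """Return a readable name for bug check parameter 1."""
    return WHEA_ERROR_SOURCES.get(param1, f"Unrecognized error source 0x{param1:X}")


def mci_status(param3, param4):
    """Join parameter 3 (high 32 bits) and parameter 4 (low 32 bits)."""
    return ((param3 & 0xFFFFFFFF) << 32) | (param4 & 0xFFFFFFFF)


def decode_status_flags(status):
    """Pick out a few architectural MCi_STATUS fields."""
    return {
        "valid": bool((status >> 63) & 1),            # VAL: the register holds an error
        "overflow": bool((status >> 62) & 1),         # OVER: an earlier error was overwritten
        "uncorrected": bool((status >> 61) & 1),      # UC: hardware could not correct it
        "context_corrupt": bool((status >> 57) & 1),  # PCC: processor context corrupt
        "mca_error_code": status & 0xFFFF,            # bits 15:0, model-independent code
    }
```

With hypothetical values `param3 = 0xBE000000` and `param4 = 0x00800400`, `decode_status_flags(mci_status(param3, param4))` reports a valid, uncorrected error with MCA error code `0x400`. The error code itself still has to be looked up in the CPU vendor's documentation.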
As best I can tell, my drivers are up-to-date and Device Manager does not indicate any issues with the drivers.
I'm going to run some hardware stress tests (mem, CPU, HDD) to see if that results in a blue screen. There is no consistency in the uptimes shown in the mini-dumps. CPU temp is never higher than 33 C.
I ran 12 hours of memory tests using memtest and there were no errors. HD Tune Pro identified communication issues with my SSD suggesting it might be a bad cable. As best I can tell, HD Tune will not give a health check on a RAID drive.
Thank you for the reply.
I have not replaced the cable. I've read several forums that describe people who have replaced the cable and still get that status from HD Tune. I've also run Samsung Magician. It gives the same CRC error data but indicates that the status is "OK".
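One possible reason the status can persist after a cable swap: assuming the CRC figure these tools show is the SMART "UDMA CRC Error Count" attribute (an assumption, since the thread does not name the attribute), its raw value is a lifetime counter that never resets. A nonzero total then only shows that errors happened at some point. A rough Python sketch of comparing two readings taken some days apart (the function names are hypothetical):

```python
def new_crc_errors(earlier_raw, later_raw):
    """Number of CRC errors logged between two raw readings of the counter."""
    if later_raw < earlier_raw:
        # A lifetime counter should never go down; the readings are probably swapped.
        raise ValueError("later reading is smaller than earlier reading")
    return later_raw - earlier_raw


def link_still_faulty(earlier_raw, later_raw):
    """True if the counter grew, i.e. errors are still occurring on the link."""
    return new_crc_errors(earlier_raw, later_raw) > 0
```

If the count stays flat after a cable or port change, the old total is just history. If it keeps climbing, the fault is still present somewhere on the path between the drive and the controller.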
I've had a couple of additional BSODs. Again the good news is that since upgrading to the latest version of Intel RST, I've not lost a drive after a BSOD.
I thought I would try to bring this thread to closure. After many tests, chats and emails, it was determined that my motherboard had problems. I replaced it 5 days ago and so far everything is working properly. Thank you for the timely responses and support.