Hey folks -
I posted this on Reddit to r/techsupport already and the preliminary consensus is a faulty CPU (i7 9700k). Wanted to post here to see if you guys had any thoughts before I went and got a new one. Have links to all the dump files below. Thanks a bunch!
My Reddit post:
I'll preface by mentioning I don't know what I am doing and I hope the fix is something stupid I missed. I have been troubleshooting this for a couple of days now and am still going to try a couple more things, but I wanted to post to see if anyone had any other ideas. I appreciate all the help!
Summary of Events:
Built a brand new gaming computer with the following parts: https://pcpartpicker.com/list/wLqJyk
No issues with the build that I know of (regret getting that case - wiring was tough). It turned on fine, booted Windows 10 Pro from a USB drive, and everything was dandy. I installed all my typical software (Steam, Firefox, Discord, etc.) and played League of Legends, Modern Warfare (on max settings), and Overwatch with no problems. The next day I started getting BSOD messages during normal web browsing, which I started to dig into. I ended up doing a fresh install of Windows, properly updating the motherboard BIOS and drivers through MSI Dragon Center (and enabling XMP to get 3200 on the RAM - I didn't do this the first time), updating the Windows drivers, and doing some ~1 hour "stability" checks with Intel Extreme Tuning Utility - I also ran PCMark and 3DMark. Everything ran great, and CPU max temp was ~60 C. Fast forward to later that night: I got a BSOD while playing Rise of Nations on Steam (ha), and then I started getting them more frequently, right after logging on to Windows. Each BSOD message was different, which Google mostly said could point to a RAM issue. No overclocking has been attempted. Windows Safe Mode works fine - I'm running all the memory checks through it and writing this post from it.
I ran Windows Memory Diagnostic and it found no errors. I then ran MemTest86 overnight with 0 errors found (it took 5 hours and 23 minutes for 4 passes - seems a bit fast given I have 32 GB of RAM). A couple of Google posts I read said that even if all of this passed, it didn't mean my RAM was fine, so I went ahead and bought the same RAM at Best Buy, which I plan to pick up later today to swap in - I can cancel the purchase if that's not the right fix.
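For reference, here is the arithmetic on that MemTest86 run as a minimal Python sketch. It only uses the figures quoted above (5 hours 23 minutes, 4 passes, 32 GB); the variable names are mine, and it says nothing about whether that pace is normal for this hardware.

```python
# Figures taken from the MemTest86 run described above.
total_minutes = 5 * 60 + 23   # 5 hours 23 minutes
passes = 4
ram_gb = 32

# Average time for one full pass over all of the RAM.
minutes_per_pass = total_minutes / passes

# Average time spent per GB within a single pass.
minutes_per_gb_per_pass = minutes_per_pass / ram_gb

print(f"{minutes_per_pass:.2f} min per pass")
print(f"{minutes_per_gb_per_pass:.2f} min per GB per pass")
```

So each pass averaged a bit under an hour and a half.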
I then ran chkdsk /r on both the EVO 970 and the EVO 860, which came back with no issues. The EVO 860 chkdsk took a good 30 minutes, while the EVO 970 chkdsk took about 7 minutes. My next step is to do another fresh Windows install with the EVO 860 disconnected. I read online that Windows installs sometimes get messed up with multiple drives installed - this seemed silly to me until I saw that when I go into my motherboard BIOS, it tells me that my Windows boot is on the EVO 860, although I specified it to be on the EVO 970, which is also what the Disk Management tab shows.
I took a stab at interpreting the minidump files myself, but I was way out of my comfort zone and the Google searches were becoming too technical (which triggered this post). You can find the 5 minidump files here: https://www.dropbox.com/s/j3bzyze6umhibtk/MINIDUMP%20REDDIT.zip?dl=0
I installed WinDbg on my laptop to try reading these. The very first one showed a WIN8_DRIVER_FAULT with Corsair.Service.CpuIdRemote64.exe, which I am pretty sure is part of the Corsair iCUE program. I had a mini eureka moment until none of the other minidump files showed that as the issue (another pointed to DCv2.e, which I was not able to identify). Maybe I shouldn't install iCUE when I do my fresh Windows install?
Anyway, I'll keep troubleshooting with the stuff I mentioned above. I'll post an update once done. I really appreciate any help you guys can send my way - if you want me to run a specific test or need the results from any of the tests I've already run, I can upload.
Cheers! -G alan8
Update 1: Another Reddit poster had similar issues to mine, and they were fixed by running with just one RAM stick. I tried this with each stick individually and still got BSOD errors a minute or two after logging in. Got WHEA_UNCORRECTABLE_ERROR, SYSTEM_SERVICE_EXCEPTION, another WHEA, and MACHINE_CHECK_EXCEPTION. Going to try that fresh install now.
miniDUMP files for this update: https://www.dropbox.com/s/ksox3pm44ksrl7o/miniDump%20RAM%20Tests.zip?dl=0
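For anyone skimming, the four stop codes from this update can be tallied with a minimal Python sketch like the one below. The hex values are the standard Windows bug check codes for these names; the `HARDWARE_LEANING` grouping is only my own rule of thumb (an assumption, not something WinDbg reports), and it cannot tell CPU from motherboard.

```python
from collections import Counter

# Standard Windows bug check codes for the stop errors seen in this update.
BUGCHECK_CODES = {
    "WHEA_UNCORRECTABLE_ERROR": 0x124,
    "SYSTEM_SERVICE_EXCEPTION": 0x3B,
    "MACHINE_CHECK_EXCEPTION": 0x9C,
}

# Assumption: treat WHEA and machine-check stops as hardware-leaning.
HARDWARE_LEANING = {"WHEA_UNCORRECTABLE_ERROR", "MACHINE_CHECK_EXCEPTION"}

# The four crashes from the single-stick RAM tests, in the order listed.
crashes = [
    "WHEA_UNCORRECTABLE_ERROR",
    "SYSTEM_SERVICE_EXCEPTION",
    "WHEA_UNCORRECTABLE_ERROR",
    "MACHINE_CHECK_EXCEPTION",
]

def summarize(names):
    """Count each stop code and how many fall in the hardware-leaning group."""
    counts = Counter(names)
    hardware = sum(n for name, n in counts.items() if name in HARDWARE_LEANING)
    return counts, hardware

counts, hardware = summarize(crashes)
for name, n in counts.most_common():
    print(f"{name} (0x{BUGCHECK_CODES[name]:X}): {n}")
print(f"{hardware} of {len(crashes)} crashes are hardware-leaning")
```

By this rough grouping, three of the four crashes fall on the hardware side, and they kept happening with either stick installed on its own.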
I disconnected the EVO 860 and did a fresh install via USB. Everything was going swell until I was selecting my keyboard configuration for Windows and it froze - I waited 10 minutes and no dice. I forced a shutdown with the power button and tried again. I got further, but during the part where I choose whether to share information with Windows, my screen went black and my fans increased rpm. The only thing on my screen was the mouse cursor, so I waited it out, and after 30 seconds everything popped back up and the fans returned to normal rpm. I completed the installation, got MSI Dragon Center, and installed all the drivers. Once they were installed, I got a WHEA_UNCORRECTABLE_ERROR BSOD. The first thing I did when I logged on to Windows was install Intel Extreme Tuning Utility to monitor package temperature. My MSI MEG Z390 ACE also has an LED temperature monitor. Neither ever read above about 45 C - the only time I've seen them get higher was during the stress test, and that was maybe 62 C tops.
With all the RAM testing I've done and this last fresh install, I agree with you that the CPU is suspect. How can I determine if the CPU is the problem or the Motherboard? Other than just throwing a new CPU on there?
Latest dumpfile in case it's useful: https://www.dropbox.com/s/nfgtieva9pnnoqe/120719-9578-01.dmp?dl=0
Other than replacing the CPU, I am out of ideas. I am going to pull it and go get a new one. Not sure how else to confirm CPU vs Motherboard at this point.
Hey Al - Thanks for taking a look.
What makes you think that? That's what I thought too initially. I backed off from thinking that after I ran Windows Memory Diagnostic (no errors) and MemTest86 (no errors), and tried booting up with each RAM stick (1x16) individually.
The memory diagnostic is often incorrect. And these sporadic crashes are typically the result of bad or incorrectly configured memory.
Easy enough to check. Use memory recommended by the motherboard manufacturer as a test.