
Hardware failure

PeymanRad
Novice
Hi, dear members of the Intel community,

About a month ago I bought an ROG Strix Z490-E Gaming motherboard, a 10700K, and Vengeance LPX (2x8GB) DDR4 3200MHz memory. The graphics card is a GTX 970 G1 Gaming and the PSU is an ROG Strix 850W G+.

After building the new PC, I installed Windows 10 without any problem. I ran an AIDA64 stress test for about an hour and everything ran smoothly. The CPU temperature hovered around 80 to 85 at maximum, so I figured everything was working properly. Because I didn't know much about my motherboard drivers, I went to the website, downloaded every driver that was available for my motherboard, and tried installing them. Among these drivers I installed the AHCI to Intel RST Premium driver. After restarting Windows there was nothing but a black screen. I couldn't even boot into the BIOS, and the VGA LED on the motherboard (the white LED) was always on. So I figured the problem might be something with the graphics card. I moved the HDMI cable from the back of my graphics card to the back of my motherboard, and then I could load into the BIOS and Windows perfectly fine.

Because I knew the RST driver did this (somehow my GTX 970 didn't work on RST Premium), I went to my BIOS settings and changed it back to AHCI. After this I reinstalled Windows to make sure no related problems would surface. After installing Windows I ran AIDA64 for another hour and also used Intel XTU to check that all my components were working properly, and they were. (Question: why couldn't I set it to RST without a problem?)
Then yesterday, while I was doing absolutely nothing, with just a picture open in Windows Photos, I got a BSOD. Before I could check the code, the screen disappeared and the computer restarted.
After I got the blue screen, I ran AIDA64 another time, and this time after a few minutes I got a "hardware failure detected" message and the test stopped. I ran the test with Stress CPU, Stress FPU, Stress cache, and Stress system memory (just these).
I tested another time and again I got the error, and my PC restarted as if I had restarted it myself. So I figured it might be because of the RAM overclocking via the XMP1 configuration. I restored everything to default in my BIOS and started the test another time. This time I got the AIDA64 'hardware problem detected' message again after 7 hours. I also tested my memory with MemTest86 on XMP1, XMP2, and stock with no errors.
Today I tested with AIDA64 again, this time with Stress system memory unchecked (only the CPU), and after 10 hours I got "hardware failure detected" again. (I am using stock BIOS settings with just XMP2 on.) I also have Intel XTU running all the time to check if there is any throttling, and whenever I'm running AIDA64, Intel XTU shows power limit throttling. So I figured that might be because of the motherboard BIOS (ver 707) default settings. Now I have no idea what to do, how to test the system further, or whether there is a real hardware problem. I hope you guys can help me, and I am looking forward to your answers. Prime95 also reports a hardware failure after a few hours. I will post the result below:

Self-test 512K passed!
Self-test 512K passed!
[Tue Dec 1 22:31:46 2020]
Self-test 512K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.4990234375, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Self-test 560K passed!
Self-test 560K passed!
Self-test 560K passed!
[Tue Dec 1 22:36:56 2020]
Self-test 560K passed!
FATAL ERROR: Rounding was 0.498046875, expected less than 0.4
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Hardware failure detected, consult stress.txt file.
Self-test 560K passed!
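
For anyone who wants to summarize a log like the one above instead of reading it line by line, here is a minimal Python sketch. It assumes the log lines look exactly like the excerpt in this post; the function name `summarize_prime95_log` and the `sample` text are made up for illustration and are not part of Prime95.

```python
import re
from collections import Counter

# Patterns assume the exact line formats shown in the excerpt above.
PASS_RE = re.compile(r"^Self-test (\d+K) passed!$")
ERROR_RE = re.compile(
    r"^FATAL ERROR: Rounding was ([0-9.]+), expected less than ([0-9.]+)$"
)


def summarize_prime95_log(text):
    """Count passed self-tests per FFT size and collect rounding errors.

    Returns (passes, rounding_errors), where passes is a Counter keyed by
    FFT size and rounding_errors is a list of (observed, limit) floats.
    """
    passes = Counter()
    rounding_errors = []
    for raw in text.splitlines():
        line = raw.strip()
        m = PASS_RE.match(line)
        if m:
            passes[m.group(1)] += 1
            continue
        m = ERROR_RE.match(line)
        if m:
            rounding_errors.append((float(m.group(1)), float(m.group(2))))
    return passes, rounding_errors


# Hypothetical sample in the same format as the log in this post.
sample = """Self-test 512K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.4990234375, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Self-test 560K passed!
Self-test 560K passed!
"""

passes, errors = summarize_prime95_log(sample)
print(dict(passes))
print(len(errors), "rounding errors")
```

Any nonzero count of rounding errors means the run was not clean, regardless of how many self-tests passed around them.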


Thanks in advance.
n_scott_pearson
Super User

What a mess (and incredibly hard to read too). I think you figured this out on your own but I will say it anyway: Unless you are going to use RAID or Optane, do not install Intel RST. There is no upside to doing so and the downside can be problematic. 

Moving on, I want to make sure I understand this correctly. You set the BIOS back to AHCI and reinstalled Windows, right? If so, Intel RST is not in the picture, which is good (one variable eliminated).

So, you are having memory errors. You want to know why it is happening now when everything seemed OK initially. Well, consider the time between then and now as the burn-in for the memory. It is during burn-in that failures in the memory can occur. [Aside: in this context, when I say 'memory', I am including the processor caches, the processor memory controllers, the motherboard support for the memory buses and the memory DIMMs themselves in the picture.]

First thing you should do is download Intel's processor diagnostic test (from here: https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool) and run it. Only continue to memory testing if this test passes completely.

For memory testing, I usually use MemTest86 (or MemTest86+ if you still have a Legacy Boot capability), which can thoroughly test the memory and identify any bad spots.

So, baseline your testing by disabling XMP completely and testing the memory thoroughly. Only if it passes completely should you enable XMP.

Test at each of the XMP levels offered, working your way up to the maximum speed.
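
The order of testing described above (baseline with XMP disabled first, then one profile at a time, stopping at the first failure) can be sketched in Python roughly as follows. This is only an illustration of the procedure: `first_failing_profile` and `run_memory_test` are hypothetical names, the profile labels are placeholders, and in practice each "test" is you setting the profile in the BIOS, running MemTest86, and noting whether it passed.

```python
def first_failing_profile(profiles, run_memory_test):
    """Test profiles in order and return the first one that fails, or None.

    `profiles` must already be ordered from most conservative to fastest
    (that ordering is the caller's assumption). `run_memory_test` is a
    hypothetical callback returning True when the memory test passed
    with that profile active.
    """
    for profile in profiles:
        if not run_memory_test(profile):
            # Stop here: there is no point testing faster profiles
            # until this one is stable.
            return profile
    return None


# Made-up recorded results, purely to show how the function is used.
recorded = {"XMP off": True, "XMP1": True, "XMP2": False}
order = ["XMP off", "XMP1", "XMP2"]
print(first_failing_profile(order, recorded.__getitem__))
```

If the baseline itself fails, the function returns it immediately, which matches the advice to enable XMP only after the stock configuration passes completely.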

Hope this helps,

...S

PeymanRad
Novice

Thank you so much for replying

Actually, I tested the memory with MemTest86 yesterday in the same order you mentioned.
I first tested with the default settings and then with XMP2 on, both without any errors. I also used AIDA64 to test just the CPU (with the Stress system memory option unchecked) and I got a "hardware failure detected" error after 10 hours. Having done this, I figured that the problem is not my memory sticks.
I also installed the Intel Processor Diagnostic Tool (4.1.4.36) and the test started by itself. However, I'm not sure whether it's working correctly or whether I should change any settings, because it's only testing CPU1. All the test modules got a green pass.

waiting on further instructions

Maria_R_Intel
Moderator

Hello PeymanRad,


Thank you for posting on the Intel Community.


As the community member commented, unless you are going to use RAID or Optane, there is no need to install Intel RST.


The errors are from the memory and it seems (with the IPDT results) that your processor is working properly since it passed the test.


To better assist you, please provide us with the below information:


  • Besides the errors on the tests, are you experiencing issues related to performance or a new Blue Screen Error?
  • Provide the Intel® System Support Utility (Intel® SSU) report.

 


Best regards,

Maria R.

Intel Customer Support Technician


PeymanRad
Novice

Hello again
Thank you for following up,

In terms of temperatures, the CPU does not go above 83°C on stock settings, and usually hovers around 78 to 80°C.

I have done everything you can possibly think of: updated the BIOS, cleared the CMOS, changed memory slots, and reseated the CPU.

I had a blue screen the other day with the "memory management" code on totally STOCK SETTINGS. So immediately I figured it might be the BIOS memory settings.
Before that I had errors in Prime95 and Linpack Xtreme. Also, whenever I tested with OCCT, I got above 12,000 errors within just 15 minutes!! (Again, on just stock settings.)

I also tested with the Windows Memory Diagnostic tool and it detected a hardware problem too. I also tested OCCT with XMP1 and XMP2, both with lots of errors.

So I figured that if I put in the memory settings and voltages manually, the problem might go away.
So I set the DRAM voltage to 1.4 and VCCIO and VCCSA to 1.2, and I entered 16-18-36 for the RAM timings.

After this the problems in OCCT and RealBench immediately disappeared, and the Windows Memory Diagnostic tool didn't show any more problems either.

And now the strange thing is that when I clear the CMOS and put everything on stock again, it shows no more errors!!! It is as if all the stock-setting errors vanished.

Now I have no idea what the problem was or how it disappeared, and I'm afraid that it might resurface.

On stock settings my DRAM voltage is 1.200 V, VCCIO is 0.976 V, CPU System Agent voltage is 1.056 V, and the timings are 15 15 36.

This has been bothering me for quite some time now 

I would appreciate any help or suggestion.

I will share the SSU results below.

 

Maria_R_Intel
Moderator

Hello PeymanRad,


Thank you for the information.


Based on the information provided, namely that you have been changing the frequency and voltages, it is possible that one of the components (memory or processor) has been affected.


Intel* XMP is considered overclocking. Altering clock frequency or voltage may damage or reduce the useful life of the processor and other system components and may reduce system stability and performance. Product warranties may not apply if the processor is operated beyond its specifications. In this case, according to the SSU report, the memory is running at 3200MHz, while the maximum speed as per Intel specifications is 2933MHz.


We understand that after the Clear CMOS process, with everything reverted to stock settings, the issue stopped. That is the Intel* recommendation: use the system at its default settings. The previous changes mentioned above may already have affected the PC's performance.


Please keep monitoring your system using the stock settings, and let us know if the behavior comes back.


Best regards,

Maria R.

Intel Customer Support Technician


PeymanRad
Novice
Thanks.
I will continue monitoring and get back to you.
Is it OK to use XMP1 at 2933MHz?
Maria_R_Intel
Moderator

Hello PeymanRad,


Yes, the processor would then be working within the specifications.


If you want to check the specs, you can use this link https://ark.intel.com/content/www/us/en/ark/products/199335/intel-core-i7-10700k-processor-16m-cache-up-to-5-10-ghz.html


Best regards,

Maria R.

Intel Customer Support Technician


PeymanRad
Novice
Thanks a lot for your support.
I will do so and get back to you if any problems come up.

Best regards
PeymanRad
Novice
Hello again,

I have one 850 EVO SSD, a 970 EVO M.2 SSD, and two HDDs.

I want to start using Intel RST.

Should I simply turn on RST in the BIOS and install Windows again?

n_scott_pearson
Super User

To accomplish what? RAID0/1 with the two HDDs is all you could do.

...S

PeymanRad
Novice
Thanks for following up.

What if I use only one HDD with those two SSDs?

And do I need to install the RST Premium driver after installing Windows?

n_scott_pearson
Super User

Here's what you can do:

  • You can set up a RAID0 or RAID1 array using two M.2 NVMe SSDs, provided that (a) the PCIe lanes in their M.2 connectors come from the PCH (chipset) and (b) redirection is enabled for both drives in the BIOS configuration.
  • You can set up one or more RAID arrays involving appropriate number(s) of SATA drives (HDDs and/or SSDs).

Here's what you cannot do:

  • You cannot set up a RAID array involving both SATA-based and M.2 NVMe-based drives.
  • You cannot set up a RAID array involving an M.2 NVMe drive if the PCIe lanes in its M.2 connector do not come from the PCH (chipset).   [Aside: this can only be done on platforms that support Virtual RAID On CPU (VROC).]
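
The rules in the two lists above can be written down as a small check. The following Python sketch only encodes what is stated in this post, plus the fact that RAID0/RAID1 needs at least two drives. The `Drive` class, its field names, and `raid_problems` are hypothetical names for illustration and have nothing to do with the actual Intel RST software. Whether a given M.2 slot gets its lanes from the PCH depends on the board and must be checked in its manual, so the defaults below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    interface: str                   # "sata" or "nvme"
    lanes_from_pch: bool = True      # assumption; only meaningful for NVMe
    remapping_enabled: bool = False  # BIOS "redirection" setting, NVMe only


def raid_problems(drives):
    """Return the reasons these drives cannot form one RAID array.

    An empty list means none of the rules listed above is violated.
    """
    problems = []
    if len(drives) < 2:
        problems.append("RAID0/RAID1 needs at least two drives")
    if len({d.interface for d in drives}) > 1:
        problems.append("cannot mix SATA and NVMe drives in one array")
    for d in drives:
        if d.interface != "nvme":
            continue
        if not d.lanes_from_pch:
            problems.append(f"{d.name}: M.2 lanes do not come from the PCH")
        elif not d.remapping_enabled:
            problems.append(f"{d.name}: redirection not enabled in the BIOS")
    return problems


# The drives mentioned in this thread.
hdd1 = Drive("HDD 1", "sata")
hdd2 = Drive("HDD 2", "sata")
evo850 = Drive("850 EVO", "sata")
evo970 = Drive("970 EVO", "nvme")

print(raid_problems([hdd1, hdd2]))      # two SATA drives
print(raid_problems([evo850, evo970]))  # SATA SSD plus NVMe SSD
```

With these inputs the two HDDs produce an empty list, while pairing the 850 EVO with the 970 EVO is rejected for mixing SATA and NVMe.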

Clear as mud? I have likely created more questions than I answered.

...S

 
