So, I have been circling the drain with Dell for several weeks now, as my servers are randomly rebooting. I have narrowed it down to my servers with Broadwell processors. This was also happening before the current 2.7 BIOS: I was on 2.4.3, then updated to Dell BIOS 2.6.
Was the reboot issue known prior to the security issue? Is anyone else seeing this? Any insight would be highly appreciated.
We received your inquiry and I understand that you are looking for information on the security issue. Please accept our apologies for any inconvenience this may be causing. We will be more than happy to look for a solution.
In order to start looking into this, could you provide us with the model of the processor and the model of the Dell server?
To be clear, this is not the security issue. We were experiencing this prior to the CVE. I have a Dell case open, and am trying all possible avenues. The processor is Broadwell.
- CPU0000: Internal error has occurred
- PWR2262: The Intel Management Engine has reported an internal system error
- RAC0703: Requested system hardreset
- SYS1003: System CPU Resetting
- SYS1001: System is turning off
- SYS1000: System is turning on
- SYS1003: System CPU Resetting
Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Thank you for the information provided.
Based on the information you have provided, it seems that we are facing a processor error. For troubleshooting purposes:
- Does the Dell system work with two processors? If so, have you tried booting the system with one processor at a time to identify which of the processors is presenting the problem?
- Did the processor come preinstalled when you got the DELL™ system?
This is very interesting. We have a Dell PowerEdge R630 Server with Intel Xeon E5-2640 v4 CPUs and we are getting exactly the same error after updating from BIOS 2.3.4 to 2.6.0:
Sometimes (it can be after 2-3 weeks of uptime) we get an Intel Management Engine error followed by a hard reset:
2018-01-20T07:53:32+0100 SYS1003 System CPU Resetting.
2018-01-20T07:53:30+0100 SYS1000 System is turning on.
2018-01-20T07:53:22+0100 SYS1003 System CPU Resetting.
2018-01-20T07:53:22+0100 SYS1001 System is turning off.
2018-01-20T07:53:06+0100 RAC0703 Requested system hardreset.
...
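For anyone who wants to check how often this signature shows up across many nodes, here is a minimal Python sketch. It assumes the lifecycle log has been exported as plain text with one entry per line in the form "timestamp message-ID message"; that layout is an assumption, not a documented iDRAC export format. The function names, the sample constant, and the 60-second window are hypothetical choices for illustration. The message IDs and timestamps in the sample are the ones quoted in this post.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout: "<ISO timestamp> <message ID> <message text>".
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4})\s+([A-Z]+\d+)\s+(.*)$"
)

# Entries quoted in this thread (newest first, as in the log view).
SAMPLE = """\
2018-01-20T07:53:32+0100 SYS1003 System CPU Resetting.
2018-01-20T07:53:30+0100 SYS1000 System is turning on.
2018-01-20T07:53:22+0100 SYS1003 System CPU Resetting.
2018-01-20T07:53:22+0100 SYS1001 System is turning off.
2018-01-20T07:53:06+0100 RAC0703 Requested system hardreset.
"""


def parse_entries(text):
    """Return (timestamp, message_id, message) tuples, oldest first."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S%z")
        entries.append((stamp, match.group(2), match.group(3)))
    return sorted(entries, key=lambda entry: entry[0])


def find_hard_resets(entries, window=timedelta(seconds=60)):
    """Return timestamps of RAC0703 entries that are followed, within
    `window`, by both a power-off (SYS1001) and a power-on (SYS1000)."""
    resets = []
    for stamp, msg_id, _ in entries:
        if msg_id != "RAC0703":
            continue
        followers = {
            other_id
            for other_stamp, other_id, _ in entries
            if stamp <= other_stamp <= stamp + window
        }
        if {"SYS1001", "SYS1000"} <= followers:
            resets.append(stamp)
    return resets


if __name__ == "__main__":
    for stamp in find_hard_resets(parse_entries(SAMPLE)):
        print("hard reset requested at", stamp.isoformat())
```

Running the same function over the exported logs of each node would give a per-node count of these resets, which may help when comparing BIOS versions.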
We have a systemic issue: we have experienced this over 100 times. Dell is engaged, but we have not made much progress. Dell told us to move from 2.4.3 to 2.6.0, but the condition persisted. We are now at the point where we see this 3-6 times daily, across multiple nodes, rarely on the same node.
Processor: Intel(R) Xeon(R) CPU E5-2680 v4, on both the 2.4.3 and 2.6.0 BIOS
Thank you for your reply and the information provided.
It's very unlikely that this is a hardware issue, since the situation started after the BIOS update. You can present the information to Dell to see if they are able to replicate your situation, for insights and possible troubleshooting on this particular case.
We're also experiencing the same problem, on a Dell PowerEdge R530, with 3 spontaneous reboots so far in a 6 month period.
Like others with the same problem we're using E5 v4 (Broadwell) CPUs, in our case two E5-2603 v4.
Our iDRAC lifecycle logs also show the same sequence of log messages:
2018-02-09T14:09:45+0000 LOG007 The previous log entry was repeated 1 times.
2018-02-09T14:01:59+0000 SYS1003 System CPU Resetting.
2018-02-09T14:01:57+0000 SYS1000 System is turning on.
2018-02-09T14:01:49+0000 SYS1003 System CPU Resetting.
2018-02-09T14:01:49+0000 SYS1001 System is turning off.
...
Thank you for joining the community.
As we shared in the previous post, everything points to a BIOS update from DELL™. We recommend that you also present this situation to them so they can try to replicate it. In addition, please check your private inbox.
Hello from Italy.
We have the same issue on 2 Dell PowerEdge R730 servers.
BIOS Version: 2.8.0
Firmware Version: 22.214.171.124
In the last 2 months, one of our servers has spontaneously rebooted 3 times, completely at random, and the other one once, without any correlation.
Dell support told us to try disabling the front power button in the BIOS settings, but today, after 20 days of quiet, the first host rebooted on its own again.
Here is a screenshot:
Tomorrow Dell support will check the logs, but at this point I have a bad feeling about this.
I will keep you updated.
Was there ever a solution to this problem? I'm currently in the middle of CLIENT hell because I purchased and configured 3 new INTEL brand servers using Intel Silver 4114 scalable processors and I'm plagued by random reboots on ALL THREE systems. Intel support doesn't seem to have a clue; although they are responsive to the issue, they're just stabbing in the dark.
This seems like a huge flaw in the CPUs themselves or a firmware/microcode issue.
If anyone who got a resolution to this sees this post, can you please let me know what your results were?
In my case I was able to solve the issue by exchanging our CPUs for an older generation. Fortunately, we were able to do so because the motherboard supported this.
Can you confirm that you're experiencing the same error -- do you see the same error message in your system log?
Which OS are you running?
I found this too:
Thanks for the replies. Unfortunately, swapping CPUs for an older generation is not an option on this platform. Also, I already have the server set to performance mode in the BIOS, without any positive change in the problem. The Intel Scalable CPUs are of the Skylake family, yet they still have the same problem as the Broadwell series.
The error is exactly the same: random reboots due to PECI over DMI issues. All three servers are running Windows Server 2016 and are Hyper-V hosts.