Hi All,
Does anybody have any information about this error?
We have 3x Brand new S2600WTTR Intel Servers and all 3 are producing this error every few weeks:
SPS FW Health reports SPS Health event type FW status. PECI over DMI interface error. Recovery via CPU Host reset or platform reset. DMI timeout of PECI request.
This is causing a hard reset on each of the systems.
If anyone has any more info on what would be causing this it would be greatly appreciated
Hello Demandred,
It seems that the error comes from the Management Engine.
I was able to find more information about this in our System Event Log Troubleshooting Guide: http://www.intel.com/content/dam/support/us/en/documents/motherboards/server/sb/s2600v3_systemeventlog_troubleshootingguide_r1_0.pdf
Due to the nature of the issue and the number of systems impacted, I would recommend contacting our support group directly, as they may request an entire log file to review this issue.
You may visit the following link for our support options:
http://www.intel.com/content/www/us/en/support/contact-support.html
Best regards,
Dave A.
Did you ever resolve this issue? We're getting the exact same symptoms on a Dell PowerEdge R530 with 2 Xeon E5-2603 v4 CPUs.
Hello ColinTHart,
As Dave A. stated, due to the nature of the issue and the number of systems impacted, I would recommend contacting our support group directly, as they may request an entire log file to review this issue.
You can visit the following link for our support options:
http://www.intel.com/content/www/us/en/support/contact-support.html
Best regards,
Caesar B.
I submitted a request to Intel but got the brush-off: since it's a Dell system, I was told to contact Dell. To reiterate, we are getting the exact same symptoms on a Dell machine, so I'd really like to know of any remedial action taken. You can even send it to me in a private message if you don't want to share this information publicly.
According to the Intel CPU errata, there are known issues with the CPUs we are using, which Intel has marked as "No fix", so I'm primarily wondering whether a CPU exchange is our best (only?) recourse.
Thanks,
Colin
I'm hitting the same error:
527 | 07/05/2019 21:01:47 (UTC) | SPS FW Health | OEM Reserved |
PECI over DMI interface error. This is a notification that PECI over DMI interface failure was detected and it is not functional any more. - DMI timeout of PECI request - Asserted
After this event, the system resets.
MB:
Vendor: Intel Corporation
Version: SE5C610.86B.01.01.0018.072020161249
Release Date: 07/20/2016
Product Name: S2600WTTR
Version: G92187-366
CPUs: 2 x Intel(R) Xeon(R) CPU E5-2650 v4@ 2.20GHz (CPU family: 6, Model: 79)
Last normal reboot was: 07/04/2019 11:33:03
Between 07/04/2019 11:33:03 and 07/05/2019 21:01:47 there was no load on the system.
All messages since 07/04/2019 11:33:03 (newest first):
536 07/05/2019 21:05:07 BIOS Evt Sensor System Event reports OEM System Boot Event - Asserted
535 07/05/2019 21:03:42 Physical Scrty Physical Security (Chassis Intrusion) reports LAN Leash has been lost - Deasserted
534 07/05/2019 21:03:34 Physical Scrty Physical Security (Chassis Intrusion) reports LAN Leash has been lost - Asserted
533 07/05/2019 21:03:17 IERR Processor reports it has been asserted - Deasserted
532 07/05/2019 21:03:17 Pwr Unit Status Power Unit reports the power unit is powered off or being powered down - Deasserted
531 07/05/2019 21:03:12 Pwr Unit Status Power Unit reports the power unit is powered off or being powered down - Asserted
530 07/05/2019 21:02:08 IERR Processor reports it has been asserted - Asserted
529 07/05/2019 21:02:06 BMC FW Health Management Subsystem Health 'DIMM Thrm Mrgn 2' sensor has failed and may not be providing a valid reading - Asserted
528 07/05/2019 21:02:06 BMC FW Health Management Subsystem Health 'DIMM Thrm Mrgn 1' sensor has failed and may not be providing a valid reading - Asserted
527 07/05/2019 21:01:47 SPS FW Health OEM Reserved PECI over DMI interface error. This is a notification that PECI over DMI interface failure was detected and it is not functional any more. - DMI timeout of PECI request - Asserted
526 07/04/2019 11:34:01 Physical Scrty Physical Security (Chassis Intrusion) reports LAN Leash has been lost - Deasserted
525 07/04/2019 11:33:48 Physical Scrty Physical Security (Chassis Intrusion) reports LAN Leash has been lost - Asserted
524 07/04/2019 11:33:03 BIOS Evt Sensor System Event reports OEM System Boot Event - Asserted
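For anyone comparing logs across machines, here is a minimal Python sketch that measures how long a system ran between its last boot event and the PECI over DMI error. It rests on assumptions: the entries follow the whitespace-separated "record ID, date, time, message" layout pasted above, the dates are MM/DD/YYYY, and the function names (`parse_sel`, `uptime_before_peci_error`) are made up for this illustration. Three entries from the listing above are embedded as sample input.

```python
import re
from datetime import datetime

# Three entries copied from the listing above. The layout (ID, date, time,
# message) and the MM/DD/YYYY date order are assumptions about this export.
SEL_TEXT = """\
530 07/05/2019 21:02:08 IERR Processor reports it has been asserted - Asserted
527 07/05/2019 21:01:47 SPS FW Health OEM Reserved PECI over DMI interface error. This is a notification that PECI over DMI interface failure was detected and it is not functional any more. - DMI timeout of PECI request - Asserted
524 07/04/2019 11:33:03 BIOS Evt Sensor System Event reports OEM System Boot Event - Asserted
"""

ENTRY_RE = re.compile(r"^(\d+)\s+(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(.*)$")


def parse_sel(text):
    """Parse pasted log lines into (record_id, timestamp, message), oldest first."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            timestamp = datetime.strptime(match.group(2), "%m/%d/%Y %H:%M:%S")
            entries.append((int(match.group(1)), timestamp, match.group(3)))
    return sorted(entries)  # sorts by record ID, i.e. oldest entry first


def uptime_before_peci_error(entries):
    """Time between the latest boot event and the first PECI over DMI error after it."""
    last_boot = None
    for _, timestamp, message in entries:
        if "System Boot Event" in message:
            last_boot = timestamp
        elif "PECI over DMI" in message and last_boot is not None:
            return timestamp - last_boot
    return None  # no boot event followed by a PECI over DMI error


entries = parse_sel(SEL_TEXT)
gap = uptime_before_peci_error(entries)
print(gap)  # 1 day, 9:28:44
```

Running the same check on logs from several affected servers would show whether the error tends to appear after a similar uptime, which may be useful detail to include in a support request.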
We exchanged our Broadwell 2603v4 CPUs for slightly older Haswell 2620v3 versions and since then haven't had any more spontaneous reboots.
The problem is that your memory slots are populated incorrectly.
I've confirmed this is definitely the cause after spending most of the day re-arranging my DIMMs.
I've tried combinations of 2x, 4x, 5x, and 6x DIMMs in a whole pile of different DIMM sockets - most of those combinations cause the error, but if you use specific slots, the error never happens.
If you're using only 6 DIMMs, populate A1, A2, B1 and G1, G2, H1.
Unfortunately, the manual is not very helpful - besides saying "furthest away from processor" it doesn't actually give any of the ordering rules, which obviously are important!
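As a quick way to compare a server against the six-DIMM layout reported above, the following Python sketch lists which slots differ. The layout is one user's finding on their board, not an official population rule, and the function name `compare_population` and the sample slot list are hypothetical.

```python
# Six-DIMM layout reported as error-free in the post above.
# This is one user's observation, not an official population rule.
REPORTED_GOOD_6_DIMM = frozenset({"A1", "A2", "B1", "G1", "G2", "H1"})


def compare_population(populated_slots):
    """Compare a list of populated slot names against the reported layout."""
    populated = {slot.strip().upper() for slot in populated_slots}
    return {
        "matches": populated == REPORTED_GOOD_6_DIMM,
        "missing": sorted(REPORTED_GOOD_6_DIMM - populated),
        "unexpected": sorted(populated - REPORTED_GOOD_6_DIMM),
    }


# Hypothetical example: six DIMMs spread one per channel.
result = compare_population(["A1", "B1", "C1", "E1", "G1", "H1"])
print(result)
```

For the hypothetical input above, the result flags A2 and G2 as missing and C1 and E1 as unexpected.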