Hello,
I need some advice on a problem with an RS3DC080. It returns this message:
LOCALIZED MESSAGE = Controller ID: 0 Single-bit ECC error;
critical threshold exceeded: ECAR = 701625440 ,
ELOG = 8396800 ,
( Src: Data Bits lane bitmap=0080, bank bitmap=00, elog 802000)
It works together with a Supermicro backplane BPN-SAS-825TQ (which is on the THOL list) with 0F23021/HGST drives (HUS726060ALE614, 6 TB).
The firmware on the RAID card is the most recent (Flash package = 24.21.0-0091).
Is it right to start looking for the problem in the connections (cables), or do you have experience with the drives, backplane, or controller causing such errors?
How likely is it that the controller itself is faulty?
The warranty claim will go through our local distributor, but I want to find out whether this is a warranty matter rather than the system integrator's fault.
Patrol read is OK for all drives.
There is an identical server, built at the same time with the same hardware, so this does not look like an incompatibility issue.
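The numeric fields in the message above can be cross-checked with a few lines of Python. This is only a sketch: the regular expression is written against the exact message quoted in this post, the names `parse_ecc_message` and `set_bits` are made up for illustration, and nothing here interprets what the ECAR and ELOG registers mean, since their layout is not described in this thread. It only converts the values and lists which bits are set in the lane bitmap.

```python
import re

# The message as quoted in the post, joined onto one line.
MESSAGE = (
    "Controller ID: 0 Single-bit ECC error; critical threshold exceeded: "
    "ECAR = 701625440 , ELOG = 8396800 , "
    "( Src: Data Bits lane bitmap=0080, bank bitmap=00, elog 802000)"
)

# Assumption: this pattern is derived from the single message above;
# other firmware or tools may format the text differently.
PATTERN = re.compile(
    r"ECAR\s*[=:]\s*(?P<ecar>\d+).*?"
    r"ELOG\s*[=:]\s*(?P<elog>\d+).*?"
    r"lane bitmap=(?P<lane>[0-9A-Fa-f]+),\s*"
    r"bank bitmap=(?P<bank>[0-9A-Fa-f]+),\s*"
    r"elog\s+(?P<elog_hex>[0-9A-Fa-f]+)"
)


def parse_ecc_message(text):
    """Return the numeric fields of the message as integers, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "ecar": int(m["ecar"]),              # printed in decimal
        "elog": int(m["elog"]),              # printed in decimal
        "lane_bitmap": int(m["lane"], 16),   # printed in hex
        "bank_bitmap": int(m["bank"], 16),   # printed in hex
        "elog_hex": int(m["elog_hex"], 16),  # printed in hex
    }


def set_bits(value):
    """Return the indexes of the bits that are set in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


fields = parse_ecc_message(MESSAGE)
print("ECAR:", hex(fields["ecar"]))
print("ELOG decimal equals hex elog:", fields["elog"] == fields["elog_hex"])
print("lane bits set:", set_bits(fields["lane_bitmap"]))
print("bank bits set:", set_bits(fields["bank_bitmap"]))
```

For this message, the decimal ELOG value 8396800 is 0x802000, the same number as the `elog 802000` field, and lane bitmap 0x0080 has only bit 7 set. How bit 7 maps to a physical memory lane is not something this sketch can tell you.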
Hello mpetr33,
Thank you for joining the community.
Could you tell us where you are seeing these errors? Is this an Intel server board or a Supermicro one?
I suggest running RAID Web Console 3 and/or the StorCLI tool to get a full, readable log that we can check.
As for whether the cables could be causing this issue: it is plausible, but not common.
Regards
Jose A.
Intel Customer Support Technician
A Contingent Worker at Intel
I am seeing them in RWC2. Good point about updating to RWC3, though; I will do that and report back here with the results.
This is an Intel server board S1200SPSR with the RS3DC080 in a PCIe x8 slot.
I installed RWC3 and am including a log from it, taken after a server restart. The entry that interests me is this one:
{
"eventId" : 202,
"sequenceNumber" : 8619,
"time" : "2019-10-9T14:34:11",
"description" : "Controller ID: 0 Single-bit ECC error; critical threshold exceeded: ECAR: 7.01625e+008 ELOG: 8.3968e+006 (Src: Data Bits lane bitmap=0080, bank bitmap=00, elog 802000)"
}
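To see how often this event repeats, an exported log can be filtered with a short script. The sketch below assumes the export is a JSON array of objects shaped like the entry above (the real RWC3 export may wrap them differently), and `ecc_events` is a hypothetical helper name. It matches on the description text rather than on `eventId`, because this thread does not say whether ID 202 is used only for ECC errors.

```python
import json
from collections import Counter

# Assumption: the export is a JSON array; the single entry is the one quoted above.
SAMPLE = """
[
  {
    "eventId" : 202,
    "sequenceNumber" : 8619,
    "time" : "2019-10-9T14:34:11",
    "description" : "Controller ID: 0 Single-bit ECC error; critical threshold exceeded: ECAR: 7.01625e+008 ELOG: 8.3968e+006 (Src: Data Bits lane bitmap=0080, bank bitmap=00, elog 802000)"
  }
]
"""


def ecc_events(events):
    """Keep only the events whose description mentions an ECC error."""
    return [e for e in events if "ECC error" in e.get("description", "")]


events = json.loads(SAMPLE)
hits = ecc_events(events)

# Count ECC events per calendar day, using the date part of the timestamp.
per_day = Counter(e["time"].split("T")[0] for e in hits)

for day, count in sorted(per_day.items()):
    print(day, count)
```

Note that this log prints ECAR and ELOG in floating-point notation (`7.01625e+008`, `8.3968e+006`), so the exact register values are better taken from the original message, which shows them as integers.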
Hello mpetr33,
Thanks for the updates. These ECC errors repeat a couple of times. It looks like they may originate in the RAID controller's own memory, which is used for cache. If that memory is faulty, the controller may eventually fail. The cables you suspect are unlikely to be the cause of these errors, though.
I suggest waiting for the warranty replacement to arrive and then rerunning diagnostics to confirm that the ECC errors are gone.
Regards
Jose A.
Intel Customer Support Technician
A Contingent Worker at Intel
Hello mpetr33,
Do you have any further details, updates, questions, or comments regarding this issue?
This thread will be marked as resolved automatically in the next 72 hours if no activity is received.
Regards
Jose A.
Intel Customer Support Technician
A Contingent Worker at Intel
Hello mpetr33,
We will proceed to mark this thread as resolved. If you have further issues or questions just go ahead and create a new topic.
Regards
Jose A.
Intel Customer Support Technician
A Contingent Worker at Intel
Just for information.
On site I changed the cables; the error came back after a few hours.
Then I replaced the RAID controller with a new one and imported the foreign configuration. It has been working fine for a few weeks.