
RAID still reporting predictive failure after drive replacement

JJose13

I have a machine running CentOS 5.3 with a 6-disk RAID 5 array. According to RAID Web Console 2, the RAID card appears to be an SRCSASBB81.

About a week ago, I started to receive these predictive failure warnings (once per day).

Controller ID: 0 PD Predictive failure: --:--:4

Generated on:Mon Sep 16 08:29:57 2013

SYSTEM DETAILS---

IP Address: REDACTED

OS Name: Linux

OS Version: 2.6

Driver Name: megaraid_sas

Driver Version: 00.00.04.01-RH1

IMAGE DETAILS---

BIOS Version: 1.12.122-0393

Firmware Package Version: 8.0.1-0029

Firmware Version: NT16

So I started the Intel RAID Web Console, looked at all the drives, and saw that drive 4 had a "pred fail count" of 1. All the other disks had 0 in that field. I figured that's what the "--:--:4" in the warning was referring to.

I backed up everything on the RAID and identified the physical location of all drives. Then, using the RAID web console, I took drive 4 offline (putting the RAID into a degraded state). The light on the physical drive in the expected location turned orange, as expected. I removed the disk and replaced it with a new one. The RAID rebuilt and came back to optimal with the new disk. All went as planned. Yay!
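
To make the reading of the warning line explicit, here is a minimal Python sketch that pulls the controller ID and the device field out of the alert text. It assumes, as I guessed above, that the last colon-separated field ("4") is the drive number; I have not confirmed that against any documentation. The names `ALERT_RE` and `parse_alert` are made up for illustration.

```python
import re

# Matches the alert line from the daily warning email, e.g.
# "Controller ID: 0 PD Predictive failure: --:--:4"
ALERT_RE = re.compile(
    r"Controller ID:\s*(?P<controller>\d+)\s+"
    r"PD Predictive failure:\s*(?P<device>\S+)"
)


def parse_alert(line):
    """Return controller ID, raw device field, and its last numeric part.

    ASSUMPTION: the final colon-separated field identifies the drive.
    Returns None if the line does not look like a predictive failure alert.
    """
    match = ALERT_RE.search(line)
    if match is None:
        return None
    device = match.group("device")
    last = device.split(":")[-1]
    return {
        "controller": int(match.group("controller")),
        "device": device,
        "last_field": int(last) if last.isdigit() else None,
    }


if __name__ == "__main__":
    print(parse_alert("Controller ID: 0 PD Predictive failure: --:--:4"))
```

Running this against each morning's email would at least show whether the device field in the warning changed after the drive swap.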

However, every morning at 7:30 AM, I still get this same predictive failure warning. The "pred fail count" on the new disk (like all the others) is now 0. Everything looks fine. Is there some file where I have to manually reset some failure count? I can't see anything in the UI that indicates there is something else I need to do.
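
One way to cross-check the counts outside the UI would be to parse a text listing of the physical drives. The sketch below assumes the LSI MegaCli utility is installed and that the output of its `-PDList -aALL` command contains "Slot Number:" and "Predictive Failure Count:" lines for each drive; I have not verified this on this particular card, and the sample text is invented for illustration, not real output from my machine. The function name `pred_fail_counts` is hypothetical.

```python
def pred_fail_counts(pdlist_text):
    """Map slot number -> predictive failure count from a drive listing.

    ASSUMPTION: each drive section has a "Slot Number: N" line followed
    later by a "Predictive Failure Count: M" line.
    """
    counts = {}
    slot = None
    for raw in pdlist_text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Slot Number" and value.isdigit():
            slot = int(value)
        elif (key == "Predictive Failure Count"
              and slot is not None and value.isdigit()):
            counts[slot] = int(value)
    return counts


# Invented sample in the assumed format (not captured from a real system).
SAMPLE = """\
Slot Number: 3
Predictive Failure Count: 0

Slot Number: 4
Predictive Failure Count: 1
"""

if __name__ == "__main__":
    for slot, count in sorted(pred_fail_counts(SAMPLE).items()):
        flag = "  <-- check this drive" if count > 0 else ""
        print(f"slot {slot}: pred fail count {count}{flag}")
```

If every slot reports 0 here as well, the daily warning is presumably coming from somewhere other than the current drive counters.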

Please help me understand what's going on and what further steps I should take.

Thanks.

-J
