My question is how the heck do I determine which drive is about to fail, without having to pull the system offline and run SMART checks on each drive.
Here's the event:
Controller ID: 0 PD Predictive failure:
OS is Server 2008 R2
Thanks in advance for the assistance.
A predictive failure means that the drive is telling the controller, through a S.M.A.R.T. notification over SAS, that it is having internal errors.
It's difficult to say which drive is having issues. I assume from the tags that you're using the SRCSASRB controller, correct? Is the controller connected to an expander? If so, what expander are you using?
The error should point to the first drive in the backplane that is connected by a wide port. You only get a wide port with an expander. With the SRCSASRB you're probably using one of the older expander backplanes. If that is the case, it's likely a predictive failure on the drive in the bottom slot.
Thanks for the assistance, John.
You are correct, the controller is the SRCSASRB connected to the AXX4DRV3GEXP expander (perhaps this is info I should have posted originally). I was finally able to locate the drive via the RAID Web Console 2. When viewing physical drives, there is a "Pred Fail Count" indicator on the right side, which showed a total count of 2 incidents. And yes, you were correct again: it was the bottom drive in the expander.
So I took the drive offline (after a backup) and swapped it out with a replacement. The rebuild is running and we should be back in business within the next few hours.
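For anyone who would rather script the "Pred Fail Count" check than click through the console, here is a minimal Python sketch. It assumes you can get a text dump of the physical drives from an LSI MegaCli-style utility, and that each drive block in that dump starts with an "Enclosure Device ID" line and contains "Slot Number" and "Predictive Failure Count" lines. Nothing in this thread confirms that such a tool works with the SRCSASRB, so treat the field names and the function name as assumptions and adjust them to whatever your tool actually prints.

```python
from typing import Dict, List


def drives_with_predictive_failures(report: str) -> List[Dict[str, object]]:
    """Return the drives whose predictive failure count is above zero.

    Assumption: `report` is a "Key: Value" text dump in which every
    physical drive block starts with an "Enclosure Device ID" line.
    """
    drives: List[Dict[str, object]] = []
    current = None
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: Value" line
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # A new drive block begins here (assumed field name).
            current = {"enclosure": value}
            drives.append(current)
        elif current is not None and key == "Slot Number":
            current["slot"] = value
        elif current is not None and key == "Predictive Failure Count":
            current["pred_fail_count"] = int(value)
    return [d for d in drives if d.get("pred_fail_count", 0) > 0]
```

Feed it the saved output of the drive listing and it returns one entry per suspect drive, with the enclosure and slot, so nothing has to go offline for the check.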
Looking back on the event error log, it does indicate the location of the drive.
In "Int.Ports 4-7:2:0", "Ports 4-7" indicates that the expander is connected to port 2 of the controller, the "2" in "2:0" indicates that it is the second expander in the server, and the "0" indicates that the drive is drive 0 in this expander.
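If you want to pull that location out of the event log automatically, the reading above can be sketched in Python. This is only an illustration of the interpretation given in this thread (port range, then expander number, then drive number); the `DriveLocation` type, its field names, and `parse_location` are made up for the example, and other controllers or firmware versions may format the string differently.

```python
import re
from typing import NamedTuple


class DriveLocation(NamedTuple):
    first_port: int  # first controller port in the range, e.g. 4
    last_port: int   # last controller port in the range, e.g. 7
    expander: int    # expander number as read in this thread, e.g. 2
    drive: int       # drive number within that expander, e.g. 0


# Matches strings such as "Int.Ports 4-7:2:0" (format assumed from this thread).
_LOCATION_RE = re.compile(r"Ports\s+(\d+)-(\d+):(\d+):(\d+)")


def parse_location(text: str) -> DriveLocation:
    """Extract the port range, expander and drive number from a log line."""
    match = _LOCATION_RE.search(text)
    if match is None:
        raise ValueError(f"no drive location found in: {text!r}")
    return DriveLocation(*(int(group) for group in match.groups()))


location = parse_location("Int.Ports 4-7:2:0")
print(location.expander, location.drive)
```

A drive number of 0 matches what was found here: the bottom drive in the expander.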