Currently, I take an NVMe device offline if I get EIO or EROFS during I/O.
I'm not sure this is a good solution, for two reasons:
1. Lack of prediction: I have to wait for a real error. The thread at https://community.intel.com/t5/Solid-State-Drives/Estimated-life-remaining-When-to-replace/m-p/1213729#M26080 gives a way to estimate the remaining life.
2. Possibly too strict: EIO or EROFS may be caused by the filesystem/VFS layer rather than by the device itself.
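For concreteness, the policy described above might look roughly like the following Python sketch. The helper names (`is_suspect_io_error`, `write_block`) and the `on_suspect` callback are hypothetical, introduced only for illustration; how the device is actually taken offline is left to the callback.

```python
import errno
import os

# Errors that the current policy treats as a sign of a broken device.
SUSPECT_ERRNOS = {errno.EIO, errno.EROFS}


def is_suspect_io_error(exc):
    """Return True if `exc` is an OSError carrying EIO or EROFS."""
    return isinstance(exc, OSError) and exc.errno in SUSPECT_ERRNOS


def write_block(path, data, on_suspect):
    """Write `data` to `path` and flush it to the device.

    `on_suspect(path, exc)` is a hypothetical callback invoked when the
    write fails with EIO or EROFS; the error is re-raised either way.
    """
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the page cache
    except OSError as exc:
        if is_suspect_io_error(exc):
            on_suspect(path, exc)
        raise
```

Note that the errno alone does not say which layer produced the error, which is the concern raised in point 2.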
I have three questions:
1. What is the right way to detect a broken disk in a data center?
2. How should I deal with user-facing errors (e.g. EIO/EROFS or other errors returned when doing I/O to the disk driver)?
3. What should I do in the background to detect a broken disk? How often should I do it, and will it impact performance?
Thank you!
Answers:
1. The best way to detect failures predictively is to query the drive's S.M.A.R.T. data. For Intel SSDs on Windows, you can use the Intel Memory & Storage Tool, which provides a health estimate for each drive based on its S.M.A.R.T. data.
2. Um, panic?
3. Mostly answered in #1. There are third-party products that provide daemons/services that monitor the S.M.A.R.T. data in the background and provide predictions to sustainers.
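As a rough illustration of what querying S.M.A.R.T. data can look like on Linux, here is a minimal Python sketch that reads the NVMe health log through smartmontools' `smartctl --json` output and flags a few common warning signs. It is not the Intel tool mentioned above. The JSON key names are assumed to match what recent smartctl versions emit for NVMe devices and should be checked against your version; `WEAR_LIMIT_PERCENT` and the function names are hypothetical.

```python
import json
import subprocess

# Hypothetical wear threshold; tune it for your own fleet.
WEAR_LIMIT_PERCENT = 90


def read_nvme_health(device):
    """Return the NVMe SMART/health log for `device` as a dict.

    Usually needs root. smartctl's exit status is a bitmask and can be
    nonzero even when the JSON is valid, so it is not checked here.
    """
    proc = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    report = json.loads(proc.stdout)
    # Key name assumed from smartctl's NVMe JSON output.
    return report.get("nvme_smart_health_information_log", {})


def assess_health(log):
    """Return a list of warnings; an empty list means no red flags."""
    warnings = []
    if log.get("critical_warning", 0) != 0:
        warnings.append("controller reports a critical warning")
    if log.get("percentage_used", 0) >= WEAR_LIMIT_PERCENT:
        warnings.append("rated endurance nearly consumed")
    if log.get("available_spare", 100) <= log.get("available_spare_threshold", 0):
        warnings.append("spare capacity at or below threshold")
    if log.get("media_errors", 0) > 0:
        warnings.append("media errors recorded")
    return warnings
```

A periodic job could call `assess_health(read_nvme_health("/dev/nvme0"))` and alert on any non-empty result; the device path here is only an example.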
Hope this helps,
...S
I would add that S.M.A.R.T. is not an infallible technology. There are most certainly ways that a drive could fail suddenly, without warning or time for a prediction. It is thus important to maintain a backup and save file changes on a regular basis.
...S