Intel® Optane™ Solid State Drives
Support for Issues Related to Solid State Drives based on Intel® Optane™ technology, Intel® MAS and Firmware Update Tool

When to replace a new SSD?

templex
Novice
824 Views

Currently, I take an NVMe device offline if I get EIO or EROFS during I/O.

I'm not sure this is a good solution, for two reasons:

1. No prediction: I have to wait for a real error. https://community.intel.com/t5/Solid-State-Drives/Estimated-life-remaining-When-to-replace/m-p/1213729#M26080 describes a way to estimate the remaining life.

2. Possibly too strict: EIO or EROFS may be caused by the filesystem/VFS rather than by the device itself.
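
To make the policy concrete, here is a minimal Python sketch of what "offline on EIO or EROFS" amounts to. The names `OFFLINE_ERRNOS`, `should_offline`, and `write_block` are hypothetical and only for illustration; this is not my production code, just the shape of the check.

```python
import errno
import os

# Assumption: these two errno values are the only triggers for offlining.
OFFLINE_ERRNOS = {errno.EIO, errno.EROFS}


def should_offline(exc: OSError) -> bool:
    """Return True if this I/O error would offline the device under the policy."""
    return exc.errno in OFFLINE_ERRNOS


def write_block(path: str, data: bytes) -> bool:
    """Append data and sync it; return False when the policy says to offline."""
    try:
        with open(path, "ab") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the write down to the device
        return True
    except OSError as exc:
        if should_offline(exc):
            return False  # caller would take the device offline here
        raise  # any other error is not covered by the policy
```

Note that the check only sees the errno, so it cannot tell a device fault from a filesystem/VFS problem, which is exactly concern 2 above.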

 

Actually, I have three questions:

 

1. What's the right way to detect a broken disk in a data center?

 

2. How should I deal with user-facing errors (e.g. EIO/EROFS or other errors when trying to do I/O to the disk driver)?

 

3. What should I do in the background to detect a broken disk? How often should I do this? Will it impact performance?

 

Thank you!

n_scott_pearson
Super User
821 Views

Answers,

  1. The best way to detect a failing drive predictively is to query the drive's S.M.A.R.T. data. For Intel SSDs on Windows, you can use the Intel Memory and Storage Tool (Intel MAS), which provides a health estimate for each drive based on its S.M.A.R.T. data.
  2. Um, panic?
  3. Mostly answered in #1. There are third-party products that provide daemons/services that monitor the S.M.A.R.T. data in the background and provide predictions to sustainers.
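
As a rough illustration of what such a background check could look like on Linux, here is a minimal Python sketch. It assumes smartmontools 7.0 or later is installed (for `smartctl -j` JSON output) and that the process may query the device; the function names and the 90% wear limit are hypothetical choices, not vendor recommendations.

```python
import json
import subprocess


def read_nvme_health(device: str) -> dict:
    """Return the NVMe SMART/Health log for a device via `smartctl -j -a`."""
    # smartctl uses a bitmask exit status, so a non-zero code is not fatal here.
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout).get("nvme_smart_health_information_log", {})


def health_warnings(log: dict, max_percentage_used: int = 90) -> list:
    """Return human-readable reasons to consider replacing the drive."""
    warnings = []
    if log.get("critical_warning", 0) != 0:
        warnings.append("critical warning bits set")
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    # Conservative choice: flag spare capacity at or below the drive's threshold.
    if spare is not None and threshold is not None and spare <= threshold:
        warnings.append("available spare at or below threshold")
    # max_percentage_used is an assumed limit; tune it for your fleet.
    if log.get("percentage_used", 0) >= max_percentage_used:
        warnings.append("rated endurance nearly consumed")
    if log.get("media_errors", 0) > 0:
        warnings.append("media errors reported")
    return warnings
```

A daemon or cron job could call `health_warnings(read_nvme_health("/dev/nvme0"))` periodically and alert when the list is non-empty; the device path here is only an example.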

Hope this helps,

...S

n_scott_pearson
Super User
788 Views

I would add that S.M.A.R.T. is not an infallible technology. There are most certainly ways that a drive could fail suddenly, without warning or time for a prediction. It is thus important to maintain a backup and save file changes on a regular basis.

...S
