SLu16
Beginner
4,134 Views

Intel RST RAID verification and repair

This Intel web page describes how Intel RST verifies and repairs RAID volumes:

http://www.intel.com/support/chipsets/imsm/sb/CS-023081.htm

It says that for RAID 1 and 10, RST does the following:

"Data on the mirror is compared to data on the source. If the data on the mirror does not match the data on the source, the data on the mirror is overwritten with the data on the source."
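To make the quoted RAID 1 behaviour concrete, here is a minimal Python sketch of a verify-and-repair pass as the Intel page words it. It is an illustration only, not Intel's implementation: the function name `verify_and_repair_mirror`, the 4-byte block size, and the in-memory `bytearray` "drives" are all assumptions made for the example.

```python
def verify_and_repair_mirror(source, mirror, block_size=4):
    """Compare the mirror to the source block by block.

    On a mismatch the mirror block is overwritten with the source block,
    as the quoted text describes. Returns the indices of repaired blocks.
    """
    repaired = []
    for offset in range(0, len(source), block_size):
        src_block = source[offset:offset + block_size]
        if mirror[offset:offset + block_size] != src_block:
            repaired.append(offset // block_size)
            # The comparison only says "different", not "which side is right".
            mirror[offset:offset + block_size] = src_block
    return repaired


good = bytearray(b"AAAABBBBCCCC")   # what both drives should contain
source = bytearray(good)
mirror = bytearray(good)
source[5] ^= 0xFF                   # hypothetical corruption on the SOURCE drive

repaired = verify_and_repair_mirror(source, mirror)
print(repaired)          # block index 1 was "repaired"
print(mirror == source)  # the drives agree again
print(mirror == good)    # but the good copy on the mirror is gone
```

With only two copies and no checksum, a plain comparison cannot say which copy is the bad one, so a rule of "source wins" propagates the corruption in this scenario.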

It says that for RAID 5, RST does the following:

"Parity is recalculated and compared to the stored parity for that stripe. If the newly calculated parity does not match the stored parity, the stored parity is overwritten with the newly calculated parity."

My questions are:

1) For RAID 1 and RAID 10, what if it's the source that contains corrupted data - will it really copy that corrupted data onto the mirror, causing the correct data to be lost?

2) For RAID 5, what if corrupted data was used to recalculate the parity - will it really overwrite the old correct parity with the new incorrect parity?
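Question 2 can be illustrated with a small XOR-parity sketch in Python. It follows the wording of the Intel page (recalculate parity, overwrite stored parity on mismatch) and is not taken from RST itself; the helper names `xor_blocks` and `verify_and_repair_stripe` and the two-byte blocks are assumptions for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def verify_and_repair_stripe(data_blocks, stored_parity):
    """Recalculate parity; on mismatch return the new parity to be stored."""
    recalculated = xor_blocks(data_blocks)
    if recalculated != stored_parity:
        return recalculated, True    # stored parity gets overwritten
    return stored_parity, False


original = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
good_parity = xor_blocks(original)

# Hypothetical silent corruption of one byte in data block 1.
corrupted = [b"\x01\x02", b"\x10\x21", b"\x0f\xf0"]
new_parity, mismatch = verify_and_repair_stripe(corrupted, good_parity)

# Rebuilding block 1 from the other blocks plus parity:
recovered_old = xor_blocks([corrupted[0], corrupted[2], good_parity])
recovered_new = xor_blocks([corrupted[0], corrupted[2], new_parity])
print(recovered_old == original[1])   # old parity could still recover the good block
print(recovered_new == corrupted[1])  # new parity only reproduces the corrupted block
```

In this sketch the old parity still encoded the correct block, so overwriting it removes the last information that could have recovered the original data. Parity alone reports that a stripe is inconsistent; it does not identify which block is wrong.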

20 Replies
Kevin_M_Intel
Employee
499 Views

Hello slintel,

Let me help you.

For RAID 1 and 10, what you are saying is correct. If you rebuild onto a new disk, all information on the existing RAID volume disk is copied to the new disk, including all files. If the operating system on the existing volume disk is corrupted, that corrupted information is mirrored to the new disk as well.

For RAID 5, the volume will copy all files that are missing to complete the parity.

Kevin M

SLu16
Beginner
499 Views

For RAID 1 and 10, I understand if I am putting in a new disk, then it will have to copy all the data (even corrupted data) from the existing disk to the new disk. But what if I have 2 drives in the array and just want to do a verify? If the source drive has corrupted data, why can't the Intel RST detect that and copy the data from the mirror drive to the source drive, instead of from the source drive to the mirror drive?

Kevin_M_Intel
Employee
499 Views

This is because Intel® Rapid Storage Technology rebuilds the array and its structure from the existing one (depending on the RAID type). So if a new drive is added to the volume, the information is copied to it from the other drive (the existing drive in the RAID).

Kevin M

SLu16
Beginner
499 Views

What if no new drive is added. There are only 2 existing drives, and you do a verify and Intel RST finds a data mismatch. How does Intel RST know which drive has the correct data?

Kevin_M_Intel
Employee
499 Views

Based on internal algorithms in the software, Intel® Rapid Storage Technology knows where the data is, but there is no record or visibility that lets users see where the files are located.

Kevin M

SLu16
Beginner
499 Views

So will the Intel RST ever overwrite good data on one drive with corrupted data from another drive? The web page indicates that it might, which would be very bad.

Kevin_M_Intel
Employee
499 Views

It might, because it copies the existing data from the other disk, so if that data is corrupted, the corrupted information will be copied as well.

Kevin M

SLu16
Beginner
499 Views

So then the Repair feature might actually destroy data instead of fixing it. That's really bad. I will look into other RAID solutions instead of Intel's.

GPerk1
Beginner
499 Views

Does Intel care to respond, or should everyone seeing this thread accept slintel's conclusion as accurate? I find it very difficult to believe Intel would simply surrender.

Miguel_C_Intel1
Employee
499 Views

The Intel® RST repair feature first erases the information on the faulty hard disk drive, then rebuilds the RAID configuration using all the hard disk drives.

Mike C

MStra16
Beginner
499 Views

But when the copies of data on two RAID 1 disks are different, how can the software tell which is the correct data and which is the corrupted data?

My situation is this...

I've just added 2 HDDs to my PC, set them up as RAID 1, and copied 2TB of data to them. I did a verify (but not repair) on the new RAID array, and nine hours later IRST reported that the array had three errors. Then I did a verify and repair on the array, and nine hours later IRST again reported that the array had three errors.

I was expecting the second V&R process to produce some sort of message/report telling me what happened. Does the Verify & Repair process create a log somewhere that I can check to see if it actually did a repair? Or do I have to do another verify and hope that nine hours later I'll be told that there are no errors? Is it possible that the errors are on part of the array that doesn't actually contain data? (The array is larger than 2TB.) If so, is IRST smart enough to mark those HDD sectors as bad and prevent the system from trying to use them to store my data?

MStra16
Beginner
499 Views

Further to my previous posting, I've done another verify (but not repair) overnight. And this morning I find that the system has found seven errors.

Either my brand new disks have problems, or Intel's 'fix' process is making matters worse rather than better!

Miguel_C_Intel1
Employee
499 Views

Hi MikeS1531,

I will need more details about the issue and your system before I can offer troubleshooting steps.

Please download and run Intel® System Support Utility and send me the results. Also, take a picture of the errors and paste them to your post.

Download Intel® System Support Utility: https://downloadcenter.intel.com/download/25293/Intel-System-Support-Utility

Regards,

 

Mike C

MStra14
Beginner
499 Views

Mike C,

Thanks for getting involved. I have downloaded and run the SSU, but it wasn't clear to me whether I was supposed to 'Submit' the info as the SSU requested, or respond via the forum, or attach the results in reply to the email I received from you. I tried the latter, but my email (it went to webadmin@intel.com) was rejected. Should I have sent my response to a different email address?

 

 

Below is a screenshot of the display from the RST showing it had found 7 errors. I can't send an image of the RST page that showed I had 3 errors before I tried the Verify and Fix -- I didn't try to capture that because I didn't know I had a problem at that time -- but I expect it looked the same except for having a 3 where the 7 is below.

 

 

The SSU produced an XML file that I have saved and attached to this posting -- it contains far too much info to try to add screenshots of it here.

I think I managed to submit the above screenshot and the XML file via the Customer Service website, and that has created Service Request 00507991 (https://customercare.intel.com/ICS_CommunityCaseDetailsPage?lang=en-US&id=500U000000Tc3Cb) if that's of any help to you.

 

 

The issue remains as described further above -- I had 3 verification errors in my RAID array before I asked the RST software to fix them and 7 errors afterwards. Why might that have happened? Do I have a problem with my array? Is it likely to be unreliable?

 

 

I look forward to receiving your thoughts.
Miguel_C_Intel1
Employee
499 Views

Hi MikeS1531,

Please send me the system report of the RST and SSU.

System Reports for Intel® Rapid Storage Technology

 

http://www.intel.com/content/www/us/en/support/boards-and-kits/000006351.html

Intel® System Support Utility

 

https://downloadcenter.intel.com/download/25293/Intel-System-Support-Utility

Note:

How to attach a file

Click reply

On upper right corner click Use advanced editor

Lower right section click Attach

Regards,

 

Mike C
MStra14
Beginner
499 Views

Mike C,

I have now created and attached the RST report you requested. Please note that I asked RST to do another Verify (but not fix) about an hour ago and that was still in progress at the time the RST report was generated. I see that it was 8% complete and had found two errors already. I can send another report after the Verify is complete. (That'll be at least 10 hours from now.) Hopefully it won't show more than the seven errors RST found during the last Verify.

 

I attached the SSU report in response to your last request, but I've attached it again to this message. I've also attached the msinfo32 report that your colleague Fred asked me to supply.

 

 

If there's any more info I can supply that might help your investigation, please ask.
MStra14
Beginner
499 Views

Mike C,

Further to my message above, the verify finished, and I attach the post-verification RST System Report.

In fact, the verify showed only two verification errors this time. While I'm grateful for that, I can't say that I understand why there are fewer errors now (2) than there were on the previous verification (7), when I've not asked the RST program to fix any errors.

I suppose having two errors in nearly 2TB of data may be 'normal', or as good as it gets, but the main issue that remains -- especially now that the re-verify didn't produce the same result as the previous one -- is whether I can count on my RAID array to be reliable.

I should add that my main data storage volume is an external RAID array, and the array on my PC generally is used only for backup. I don't believe I've changed any of the data on the array I'm having problems with within the past week, so that shouldn't be the reason for the two verifications to have different results.

Any thoughts you have would be most welcome.

Miguel_C_Intel1
Employee
499 Views

Hi MikeS1531,

Thank you for your update. Fred is going to review the results of the tests and answer your issues through the existing case number.

Regards,

 

Mike C
KD4
Beginner
499 Views

Can someone give me an update on this issue as I would like to know what went wrong?

If not, could you possibly tell me the following:

1. Why can these verification errors occur when the hardware seems to be working perfectly?

2. Any implementation details on mirrored RAID 1. The reason I ask is that I want to know what guarantees there are that both drives ALWAYS have the EXACT IDENTICAL data written to them.

Is there also something that guarantees BOTH drives ALWAYS get the same write, or are there situations where only one drive gets the write (apart from a drive failing, that is)? For example, could too many write requests from various threads and applications overload the Intel RST software, so that one of the drives misses a write request?
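The scenario asked about here can be modelled in a few lines of Python. This is a purely generic illustration of how a mirror would look to a verify pass if one member ever missed a write; it says nothing about whether or when RST can behave this way. The names `mirrored_write` and `count_mismatches` and the `reached` parameter are hypothetical.

```python
def mirrored_write(drives, block, data, reached):
    """Write `data` to `block` on every drive whose index is in `reached`.

    `reached` is a hypothetical knob that models a write landing on only
    some members (for example, an interruption between the two writes).
    """
    for index in reached:
        drives[index][block] = data


def count_mismatches(drive_a, drive_b):
    """Count blocks that differ, which is all a plain verify can report."""
    return sum(1 for a, b in zip(drive_a, drive_b) if a != b)


drive0 = ["old"] * 8
drive1 = ["old"] * 8

mirrored_write([drive0, drive1], 2, "new", reached=(0, 1))  # normal write
mirrored_write([drive0, drive1], 5, "new", reached=(0,))    # write misses drive 1

errors = count_mismatches(drive0, drive1)
print(errors)  # one verification error, with no faulty hardware in the model
```

In this model a single missed write shows up as exactly one verification error even though both simulated drives are healthy.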