    IT Survey: Preemptive Drive Replacement in RAID Arrays

    IT Discussion
    storage raid winchester drive survey
      scottalanmiller

      I'm not asking anyone to say why it was done or who did it. I just want to know, because sometimes we miss a common practice simply because we never ask or talk to the right people: how many people know of any environments where perfectly healthy RAID array drives are intentionally removed and replaced with new drives as part of routine maintenance to "keep the array healthy"?

      This process is new to me. I had never heard of it until today, so I wanted to get a quick survey and see if anyone else is aware of this practice. Is it common?

        gjacobse

        I can't say that I have heard of this being done to 'keep an array healthy'.

        I have wondered if it was possible to do something SIMILAR as a method of making backups....

          gjacobse

          Could it be that they keep a 'bank' of drives, and cycling through them keeps them at a 'balanced' run time, similar to rotating tires (tyres) on a car?

            scottalanmiller @gjacobse

            @gjacobse said:

            Could it be that they keep a 'bank' of drives, and cycling through them keeps them at a 'balanced' run time, similar to rotating tires (tyres) on a car?

            I would not think so; the stated purpose was specifically to unbalance them.

              JaredBusch

              I have never heard of it, nor would I do it.

              It is a waste of money. From this side of RAID stuff, the entire point of the RAID array is to be able to operate degraded with a drive down.

                Deleted74295 Banned

                Can we do a wider poll? Right now this is just limited to the MangoLassi membership.

                But a single poll website, advertised on different communities might give a different insight in terms of metrics.

                The only time I have replaced drives is when I start hearing less-than-healthy noises from them on client machines, never on a live RAID array.

                  DustinB3403

                  I've heard of doing it every 2 - 3 years, but not as a part of routine maintenance.

                  What is the routine maintenance schedule where you heard this?

                    coliver

                    I've never heard of this until today. So no, we don't do it here, nor anywhere else I've worked.

                      DustinB3403

                      To follow up, I've never performed it either. But I have heard people say that they replace their drives to avoid the urgent rush of a RAID array being degraded because of a failed drive.

                        coliver @DustinB3403

                        @DustinB3403 said:

                        To follow up, I've never performed it either. But I have heard people say that they replace their drives to avoid the urgent rush of a RAID array being degraded because of a failed drive.

                        Wouldn't it be just as good to have a cold spare on a shelf waiting for a failure?

                          dafyre

                          I've never done pre-emptive replacement on drives that are showing no errors.

                            nadnerB @DustinB3403

                            @DustinB3403 said:

                            To follow up, I've never performed it either. But I have heard people say that they replace their drives to avoid the urgent rush of a RAID array being degraded because of a failed drive.

                            Eh? The array is in a degraded state while it's being rebuilt. What's the difference? Are they not running backups?

                              Dashrender

                              I've definitely never heard of this before, and without some solid evidence that the expense is worthwhile, that is, that a data loss risk that is already super low somehow becomes even lower, I wouldn't consider it.

                                DustinB3403

                                I don't understand the rationale either.

                                  DustinB3403 @coliver

                                  @coliver said:

                                  @DustinB3403 said:

                                  To follow up, I've never performed it either. But I have heard people say that they replace their drives to avoid the urgent rush of a RAID array being degraded because of a failed drive.

                                  Wouldn't it be just as good to have a cold spare on a shelf waiting for a failure?

                                  I absolutely agree that having a spare drive on the shelf is more effective than replacing a drive that hasn't failed.

                                  Some people simply don't want to understand what has to be performed to rebuild the array when you replace drives just to replace them.

                                    Drew

                                    I'm guessing this isn't exactly what you're referring to but I thought I'd add my experience anyway. I guess it depends on what you mean by "perfectly healthy". One manufacturer might consider a drive perfectly healthy while another might not.

                                    Certain arrays will look at bad blocks to decide whether to preemptively stop using a drive and switch to a hot spare once the number of bad blocks has reached a certain percentage, and then the vendor will send you a replacement drive. The number of bad blocks that separates a perfectly healthy drive from one with an impending failure varies.

                                    I've contacted a vendor before and sent diagnostic logs on arrays that were going to fall off support to analyze drives that hadn't necessarily crossed that line but might raise a few flags to see if I could get some drives replaced.

                                    As for replacing drives that show no signs at all of failing, just because they have reached a certain age: I've never done this.
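
The bad-block behaviour described in this post can be sketched roughly as below. This is only an illustration: the `Drive` class, the `drives_to_retire` helper, the 1% threshold, and the sample block counts are all hypothetical, since each vendor picks its own cutoff and none is named in the thread.

```python
from dataclasses import dataclass

# Hypothetical threshold: real arrays choose their own, and it varies by vendor.
BAD_BLOCK_THRESHOLD = 0.01  # fraction of blocks marked bad


@dataclass
class Drive:
    name: str
    total_blocks: int
    bad_blocks: int

    @property
    def bad_block_ratio(self) -> float:
        return self.bad_blocks / self.total_blocks


def drives_to_retire(drives, threshold=BAD_BLOCK_THRESHOLD):
    """Return the drives whose bad-block ratio has reached the threshold."""
    return [d for d in drives if d.bad_block_ratio >= threshold]


# Made-up sample data for illustration only.
array = [
    Drive("disk0", total_blocks=1_000_000, bad_blocks=12),
    Drive("disk1", total_blocks=1_000_000, bad_blocks=15_000),
    Drive("disk2", total_blocks=1_000_000, bad_blocks=0),
]

for drive in drives_to_retire(array):
    print(f"{drive.name}: fail over to hot spare and request a replacement")
```

With these assumed numbers only `disk1` crosses the line. A drive like `disk0`, with a few bad blocks but still under the threshold, is the "might raise a few flags" case where one vendor would call it healthy and another might not.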

                                      Dashrender

                                      I mentioned this to an associate of mine and he came up with a possible situation where this could matter, but we both agreed it was pretty unlikely.

                                      His reasoning was that the labor pool for emergency repair might be too small to handle all the emergencies that are happening. Of course there are tons of mitigations for this, but I thought the general idea had merit.

                                        MattSpeller @Dashrender

                                        @Dashrender Also maintenance on sites that are exceptionally expensive to access (think weather station in Greenland or something)

                                          coliver @MattSpeller

                                          @MattSpeller said:

                                          @Dashrender Also maintenance on sites that are exceptionally expensive to access (think weather station in Greenland or something)

                                          That still doesn't make sense because of the failure curve of hard drives. We have no idea if the new drive will die immediately or soon after installation. They would then need a second maintenance event to replace the failed drive. Now, this may happen either way, but it makes more sense to wait until the drive actually fails than to preemptively replace it, especially if you can get months to years out of the drive you would have replaced.
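
The argument above can be put into rough numbers. The sketch below is a back-of-the-envelope comparison only: both probabilities are made-up placeholders rather than measured failure rates, and the function names and the `swap_needs_own_trip` flag are hypothetical.

```python
# All probabilities below are made-up placeholders, not measured failure rates.
P_NEW_DRIVE_FAILS = 0.03  # assumed chance a fresh drive dies early in the period
P_OLD_DRIVE_FAILS = 0.05  # assumed chance the existing healthy drive dies in the same period


def expected_visits_preemptive(p_new_fails: float, swap_needs_own_trip: bool) -> float:
    """Expected site visits if the healthy drive is swapped out in advance.

    The swap itself costs one visit unless it rides along with a trip that
    was happening anyway; an early failure of the new drive costs another.
    """
    return (1.0 if swap_needs_own_trip else 0.0) + p_new_fails


def expected_visits_wait(p_old_fails: float) -> float:
    """Expected site visits if the drive is only replaced once it fails."""
    return p_old_fails


print(f"swap on a dedicated trip: {expected_visits_preemptive(P_NEW_DRIVE_FAILS, True):.2f}")
print(f"swap during another trip: {expected_visits_preemptive(P_NEW_DRIVE_FAILS, False):.2f}")
print(f"wait for a failure:       {expected_visits_wait(P_OLD_DRIVE_FAILS):.2f}")
```

Under this simple model, a swap that needs its own trip always costs more visits than waiting. A swap that rides along with another trip only comes out ahead if the new drive is genuinely less likely to fail than the old one over the period, which is exactly what the failure curve makes uncertain.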

                                            MattSpeller @coliver

                                            @coliver It makes more sense in that scenario than it does in any other I can think of!

                                            I can think of much better ways to set up a remote station like that. I'm just trying to see if there's a scenario where his advice is actually... good.
