    RAID 10, 20 Disks, How Many Hot Spares

    IT Discussion
    Tags: raid 10, raid, hot spares, storage, sw cp
    40 Posts 5 Posters 6.5k Views
    • scottalanmiller @MattSpeller

      @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

      @scottalanmiller Argue a use case then, don't dance around making me chase you.

      Don't need a use case, risk aversion is the key. IOPS and capacity are spec'd properly, no more needed. Risk of the array is a concern. Hot spares would lower the risk, enlarging the array would increase the risk. This isn't complex. There is a goal: reducing risk. Your proposal is to undermine the goal for what reason? What makes you believe that risk protection is always bad and that higher risk is always good? Where would you stop with that logic? Always buy the biggest, fastest drives in the biggest possible arrays?

        • scottalanmiller @MattSpeller

        @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

        My use case is on prem easy access. Define yours and maybe we can agree on something.

        1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."
        2. Just because on prem is easy doesn't make off hours easy.
        3. Just because on prem is easy doesn't mean that wasting money on cold spares makes sense when hot spares are more reliable and less effort.
        4. Just because on prem is easy doesn't mean that we should increase risk for no known reason when the goal was to reduce risk.
          • MattSpeller @scottalanmiller

          @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

          @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

          @scottalanmiller Argue a use case then, don't dance around making me chase you.

          Don't need a use case, risk aversion is the key.

          Bullshit, this is how you gish gallop all over anyone who disagrees with you - moving the goal posts. It's irritating as fuck tbh.

          IOPS and capacity are spec'd properly, no more needed.

          Again, this is crap - I know you can do better.

          Give a real world example.

            • scottalanmiller @MattSpeller

            @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

            @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

            @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

            @scottalanmiller Argue a use case then, don't dance around making me chase you.

            Don't need a use case, risk aversion is the key.

            Bullshit, this is how you gish gallop all over anyone who disagrees with you - moving the goal posts. It's irritating as fuck tbh.

            Sorry, but that's exactly what didn't happen. The goal never moved, at all. The goal was to reduce risk; you have a personal agenda that risk should never be reduced, only increased, and you are saying anything, including now making a personal attack, to support it. But you are not at all looking at the needs of the OP, just interjecting some personal goal that doesn't align.

            No moving goal posts, none. You made up a new goal that didn't exist.

              • scottalanmiller @MattSpeller

              @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

              IOPS and capacity are spec'd properly, no more needed.

              Again, this is crap - I know you can do better.

              Give a real world example.

              Not crap at all, it's how we do IT. You believe that "more is always better", no matter what. But only in IOPS and capacity, not in protection? By that logic, RAID 0 is always the best choice, right?

                • MattSpeller @scottalanmiller

                @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                My use case is on prem easy access. Define yours and maybe we can agree on something.

                1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."

                No one said it wasn't

                2. Just because on prem is easy doesn't make off hours easy.

                It does in my case

                3. Just because on prem is easy doesn't mean that wasting money on cold spares makes sense when hot spares are more reliable and less effort.

                Sure it does, in some circumstances - this is why you should define a use case so we can have a real discussion

                4. Just because on prem is easy doesn't mean that we should increase risk for no known reason when the goal was to reduce risk.

                Sure it does. This is not a black and white case, there are shades of grey.

                  • scottalanmiller

                  Why do you need a scenario? We are past scenarios, we know the goal within the context which is to reduce risk. It's that simple. You are trying to make it complex so that you can take an arbitrary scenario and hope to shoot it down when we don't know the exact scenario, only the goal of risk reduction. Why are you so opposed to someone having a properly designed array for speed and capacity and considering lowering their risks? What's actually going on?

                    • scottalanmiller @MattSpeller

                    @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                    @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                    @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                    My use case is on prem easy access. Define yours and maybe we can agree on something.

                    1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."

                    No one said it wasn't

                    So because you inject your own details and no one specifically disputes them, they become true?

                      • scottalanmiller @MattSpeller

                      @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                      2. Just because on prem is easy doesn't make off hours easy.

                      It does in my case

                      And your case is not in question, so this is a red herring.

                        • MattSpeller @scottalanmiller

                        @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                        @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                        @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                        @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                        My use case is on prem easy access. Define yours and maybe we can agree on something.

                        1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."

                        No one said it wasn't

                        So because you inject your own details and no one specifically disputes them, they become true?

                        That seems to be what you do 😛

                          • scottalanmiller @MattSpeller

                          @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                          3. Just because on prem is easy doesn't mean that wasting money on cold spares makes sense when hot spares are more reliable and less effort.

                          Sure it does, in some circumstances - this is why you should define a use case so we can have a real discussion

                          Nope, cold spares don't work that way. If you have that magic use case, you can provide it. I know of no case where cold spares are better than hot ones except when the array is full for other reasons (not the case here - so we have your example case right now) or where you need to share them between many arrays (no reason to inject that odd assumption here.)

                          There is zero need for a use case, we know the factors already. That you CAN come up with a use case where these things are not true based on changing the fundamental goals is totally non-applicable to the situation.

                            • scottalanmiller @MattSpeller

                            @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                            @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                            @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                            @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                            @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                            My use case is on prem easy access. Define yours and maybe we can agree on something.

                            1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."

                            No one said it wasn't

                            So because you inject your own details and no one specifically disputes them, they become true?

                            That seems to be what you do 😛

                            Okay, what detail did I interject? I'm working from the OP and nothing else. What have I added?

                              • scottalanmiller @MattSpeller

                              @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                              4. Just because on prem is easy doesn't mean that we should increase risk for no known reason when the goal was to reduce risk.

                              Sure it does. This is not a black and white case, there are shades of grey.

                              Whoa, you just said "sure it does," meaning it's black and white and is always one thing. Then you say that there are shades of grey. Which is it? It can't be both. I made the case that it wasn't black and white, you disagreed and then said I was right.

                                • scottalanmiller

                                The OP is asking about one thing... how many hot spares to add to data protection in an array of this size. That's it. There are zero questions about needing more capacity or performance. None, zero. There is no info on where the array is hosted, none. The question is about one thing... risk. Risk and only risk. How much risk reduction is generally recommended.

                                Obviously the OP didn't provide enough info for anything but general cases and general guidelines. But what we know from the asking of the question is that their concern is "how much do they need to lower their risk." That's the only thing that they are asking. They aren't asking how to "best use additional drives", if they needed more drives we can assume that they would have a larger array than they do and would be asking about how many hot spares on a larger array.

                                We don't know if hot spares make sense, we don't have enough details. We only know that they rarely make sense in a 20 disk RAID 10. We do know that hot spares are always better than cold spares if the slots are empty otherwise, unless the cold spares need to be shared to other chassis to save money. But that's it. And since the question is about a single array, not a group of arrays, we have to ignore the use case where cold spares are a consideration. We also know that there are at least two open slots or else the question could not be asked at all.

                                So given what we know about the question, we know that the possible answers are no hot spares, one or more hot spares, and that is all. If we start suggesting things like "buy drives but instead of using them as hot spares, make your array bigger" we change everything. Not only do we make wild, unfounded assumptions about their risk profile which we are not in a position to make whatsoever, but we also go a massive step farther and start to make assumptions about their best use case of money.

                                So now, not only do we suggest that they increase risk rather than lower it like they were trying to do (based on what, I keep asking; we know nothing that gives us this leniency), but we then also take the money that they might have invested in risk protection and suggest not that they use it "where the business can most use it" but that the only possible use for that money is to invest it in disks? We know nothing about the cost of those disks, the utility of those disks, the finances of the company, where that money could be spent, or the valuation of different investment strategies.

                                In no way could we make that recommendation without knowing a lot more. What we can tell the OP, and indeed the only thing that we can tell them, is how hot spares behave, what their investment percentage is, how often or rarely they are applicable in this type of array, and what factors may make them more or less valuable.
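
As a rough illustration of the "investment percentage" mentioned above, here is a minimal Python sketch. It assumes every drive costs the same and counts spares as a fraction of the 20 drives already in the array; the function name `spare_overhead` is made up for this example and nothing here comes from the OP.

```python
def spare_overhead(array_disks: int, hot_spares: int) -> float:
    """Hot spares as a fraction of the drives already in the array.

    Assumes every drive costs the same, so the drive-count ratio is
    also the extra spend on drives.
    """
    if array_disks <= 0 or hot_spares < 0:
        raise ValueError("array_disks must be positive and hot_spares non-negative")
    return hot_spares / array_disks


# The thread's array: 20 disks in RAID 10.
for spares in (0, 1, 2):
    print(f"{spares} hot spare(s) on 20 disks: {spare_overhead(20, spares):.0%} extra drives")
```

Whether that extra spend is worth it still depends on the risk profile, which is exactly the part nobody in the thread knows.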

                                  • scottalanmiller

                                  So you want a scenario? This is contrived and not mine to make but here we go...

                                  • SMBs should basically always have their servers in colocation facilities. What SMB has the facilities to host their own properly? Datacenters charge for manual labor and don't always provide easy access for vendors. Having a hot spare in the datacenter can be instant recovery happening instead of waiting hours or days for the vendor to get in with spare parts (it means you can get NBD deals instead of 4 hour ones to save money) adding tons of protection for very little money. This grows significantly if you don't have a vendor doing the swaps but plan to do it yourself. NTG's travel time to our old datacenter was four hours, for example.
                                  • Even in a datacenter, cold spares can take a long time to get put into place if the DC is busy, especially if things happen off hours. And there is risk that the wrong drive will be replaced, the server can't be found or whatever. Pay for a Tier IV and that stuff mostly goes away, but SMBs often are in lower tier DCs or do on premises and take risks that people will be less trained and make more mistakes.
                                  • IT Pros often don't understand RAID and will power down a machine when the RAID needs a drive replaced. A lot of people tackle this in the real world when they aren't the sole IT guy and are forced to make systems that are as self healing as possible because they don't always know who is going to be doing the work, especially years in the future when the systems will be most likely to fail. It's an investment in better processes. So even simple on premises systems have reasons why it can make sense.
                                  • Many SMBs don't have full time IT staff, that alone explains everything.
                                  • Many SMBs don't have on premises IT staff, again, totally explains it.
                                  • Many SMBs have fewer IT staff than they have physical locations.
                                  • MSPs often are not given blanket access to customer facilities and need to provide rapid protection faster than a customer may reliably be able to provide physical access.
                                  • Systems in remote locations do not always have reliable supply chains, especially outside of the US. Whether you are on an island in Lake Superior, in Matagalpa Nicaragua, on a cruise ship, in a research station on a mountain or in a state that gets way too much snow, hurricanes or flooding... having hot spares that can take care of things when staff and/or supply chains cannot get drives swapped promptly can be absolutely critical.
                                  • Many SMBs run without IT staff and need systems to be as self healing as possible.
                                    • MattSpeller @scottalanmiller

                                    @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    @scottalanmiller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                                    My use case is on prem easy access. Define yours and maybe we can agree on something.

                                    1. No one even suggested that on prem was going on; that's a totally false assumption. So you can't make up a use case and then use it to argue that "it's always this way."

                                    No one said it wasn't

                                    So because you inject your own details and no one specifically disputes them, they become true?

                                    That seems to be what you do 😛

                                    Okay, what detail did I interject? I'm working from the OP and nothing else. What have I added?

                                    We all come at this with different perspectives. You looked at this and assumed it was in a colo. I assumed it was on prem. We don't even know enough to speculate (but we do anyways because it's a fun thought experiment). We don't even know what it's hosting, what level of risk is acceptable to the business, etc.

                                    Given what we do know:

                                    "there is a single RAID array of 20 spinning disks in RAID 10 and the person asking wants to know how many hot spares would be recommended."

                                    If it were in a colo I'd put spares in it. If it were on prem I'd not waste a slot on hot spares unless there was a really insanely risk averse business case.

                                      • scottalanmiller @MattSpeller

                                      @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                                      We all come at this with different perspectives. You looked at this and assumed it was in a colo.

                                      No, I did not and do not. I only assume that the question is about what is asked - the risk offset from adding more hot spares. Colo was only mentioned because you told me that I had to provide a scenario in which the OP was acceptable.

                                      I assumed and still do that colo is one of the options, but I have no idea what they are doing, only that they have an array and are now looking at risk offset values.

                                        • scottalanmiller @MattSpeller

                                        @MattSpeller said in RAID 10, 20 Disks, How Many Hot Spares:

                                        "there is a single RAID array of 20 spinning disks in RAID 10 and the person asking wants to know how many hot spares would be recommended."

                                        If it were in a colo I'd put spares in it. If it were on prem I'd not waste a slot on hot spares unless there was a really insanely risk averse business case.

                                        Even in that case, I would rarely put hot spares in it in a colo. We have servers and have had servers in colos for years, both SMB and Wall St. enterprise and in both cases - no hot spares.

                                        Reason? Our risk aversion did not dictate that it was necessary and our colocation facilities could handle relatively rapid swaps of spare equipment. Colo makes hot spares somewhat more reasonable, but it is still a risk aversion and access use case primarily. Even there, I think it's rarely a good financial decision for most workloads.

                                        We use Colocation America right now, so our swaps would be about six hours. Four to five hours for the vendor to get the drive there, about an hour for them to coordinate, get the tech to the server, do the swap, etc. Well worth not wasting the money on the extra drives to sit around doing nothing for us.
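
To put the roughly six hour swap window in perspective, here is a hedged back-of-the-envelope sketch in Python. In RAID 10 the array is only lost if the failed drive's mirror partner also dies before the rebuild finishes, so the sketch compares that exposure with and without a hot spare. The 2% annualized failure rate and the 8 hour rebuild time are assumptions picked purely for illustration, not figures from this thread, and the model assumes a constant, independent failure rate, which ignores correlated failures such as drives from the same batch.

```python
import math

HOURS_PER_YEAR = 24 * 365


def partner_failure_probability(afr: float, exposed_hours: float) -> float:
    """Chance that the surviving mirror partner fails during the exposed window.

    Assumes a constant hourly failure rate derived from the annualized
    failure rate (afr), with failures independent of each other.
    """
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-rate_per_hour * exposed_hours)


AFR = 0.02             # ASSUMPTION: 2% annualized failure rate per drive
REBUILD_HOURS = 8.0    # ASSUMPTION: time to resilver one mirror
SWAP_WAIT_HOURS = 6.0  # from the post: roughly six hours to get a drive swapped

# Hot spare: rebuild starts immediately. No spare: rebuild starts after the swap.
with_hot_spare = partner_failure_probability(AFR, REBUILD_HOURS)
without_spare = partner_failure_probability(AFR, SWAP_WAIT_HOURS + REBUILD_HOURS)

print(f"partner fails during rebuild (hot spare): {with_hot_spare:.6%}")
print(f"partner fails during wait + rebuild:      {without_spare:.6%}")
```

Under these assumptions both probabilities are small in absolute terms, which is consistent with the point that a six hour swap can be well worth skipping the extra drives. Different failure rates, rebuild times or access delays would change the picture.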

                                          • scottalanmiller

                                          I totally agree with @MattSpeller in that most companies would be better served by more IOPS and more capacity than they have and that hot spares are relatively useless for them. That part I am totally in agreement with.

                                            • dafyre

                                            Why would you not put a hot spare in a RAID 10? -- Especially if you are trying to mitigate some risk of a drive failing.
