
    RAID5 SSD Performance Expectations

    IT Discussion
• zachary715 @scottalanmiller

      @scottalanmiller said in RAID5 SSD Performance Expectations:

Nothing. Your random writes are super high, way higher than those disks could possibly do. 10K spinners might push 200 IOPS. So 8 of them, in theory, might do 1,600. But you got 70,000. So you know what you are measuring is the performance of the RAID card's RAM chips, not the drives at all.

Got ya. I may just have to evacuate this server for the time being and do some testing with various RAID levels and configs to see how they compare. I just would have expected a more noticeable performance difference than what I'm seeing. I've seen it in VMs all along, where I didn't think they were as zippy as they should be, but they were quick enough for what we were doing, so I didn't really dig in. But now I'm curious.
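
For anyone following along, here is the back-of-the-envelope arithmetic Scott describes above as a tiny sketch (the per-spindle IOPS figure and drive count are just the ballpark assumptions from his post):

```
# Rough sanity check: what could 8 x 10K RPM spinners plausibly deliver,
# versus what the benchmark reported?
per_drive_iops = 200            # ballpark random IOPS for one 10K spinner
drive_count = 8
array_ceiling = per_drive_iops * drive_count   # ~1,600 IOPS, ignoring RAID penalties
measured = 70_000               # what CrystalDiskMark reported

print(f"Theoretical spindle ceiling: {array_ceiling} IOPS")
print(f"Measured:                    {measured} IOPS")
print(f"Measured is ~{measured / array_ceiling:.0f}x the ceiling, so the test is "
      "hitting the controller's cache, not the disks")
```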

• jmoore @zachary715

        @zachary715 said in RAID5 SSD Performance Expectations:

        @scottalanmiller said in RAID5 SSD Performance Expectations:

        @zachary715 said in RAID5 SSD Performance Expectations:

        For my use case, I'm referring to MB/s as I'm looking at it from a backup and vMotion standpoint which is why I'm measuring it that way.

        That's fine, just be aware that SSDs, while fine at MB/s, aren't all that impressive. It's IOPS, not MB/s, that they are good at.

What's a good way to measure IOPS capability on a server like this? I can find online calculators and plug in my drive numbers, but I mean actually measuring it on the system to see what it can push. I'd be curious to know what that number is, even just to see whether it meets expectations or is low as well.

        EDIT: I see CrystalDiskMark has the ability to measure the IOPS. Will run again to see how it looks.

I feel like I have used a PowerShell module to measure IOPS in the past. Can't remember right now though. Will investigate more when I get home.
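
If it helps to see what such a test actually does, here is a minimal sketch of a random-read IOPS measurement (purely illustrative and not the PowerShell module mentioned above; the file path, block size, and duration are placeholder assumptions, and real tools like DiskSpd, fio, or CrystalDiskMark handle queue depth, threads, and caching far more carefully):

```
import os, random, time

# Minimal single-threaded random-read IOPS sketch.
# Point TEST_FILE at an existing large file (ideally bigger than RAM,
# otherwise you are mostly measuring the OS page cache).
TEST_FILE = "testfile.bin"   # placeholder path
BLOCK = 4096                 # 4 KiB, the usual block size for IOPS figures
DURATION = 10                # seconds to run

size = os.path.getsize(TEST_FILE)
ops = 0
deadline = time.time() + DURATION
with open(TEST_FILE, "rb", buffering=0) as f:
    while time.time() < deadline:
        f.seek(random.randrange(0, size - BLOCK))
        f.read(BLOCK)
        ops += 1

print(f"~{ops / DURATION:.0f} random 4K read IOPS (single thread)")
```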

• scottalanmiller @jmoore

          @jmoore said in RAID5 SSD Performance Expectations:

I feel like I have used a PowerShell module to measure IOPS in the past. Can't remember right now though. Will investigate more when I get home.

It's still going to test the whole stack, rather than the drives or array alone.

• taurex

            Citing @StorageNinja: "...Do not use CrystalDiskMark for testing a hypervisor or server workload. It was designed for sanity tests on desktop systems. Virtual environments involve multiple disks on different controllers, on different VMs and parallel IO can either yield higher results or far worse (IO blender effect). Test something realistic with what you are running.
            VMware HCIbench isn't bad for spinning up a bunch of workers and running different profiles or DiskSPD and Intel IOmeter. If you are going to run SQL, HammerDB might be worth running (or SLOB if you will be running Oracle). Given people using CrystalDiskMark and stuff tend to either test unrealistically small or large working sets (and therefore test Cache, or what the storage layer looks like with a full file system)..."

• 1337

What you want to do is test by copying big files (20GB+, for instance), on the hypervisor directly if possible. That takes the cache out of the equation. Even consider booting a live Linux USB stick and testing there.

Your SSD array should have about a 50% higher transfer rate in MB/s compared to the HDD array, all else being equal.

Maybe you have a network issue, for instance drivers. Or VMware is using compression on vMotion and is starved for CPU on one server but not the others. Or, or, or...

You have to troubleshoot systematically so you can eliminate things.
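
A minimal sketch of that kind of big-file copy test, timing the copy and reporting MB/s (the paths are placeholders; use a source file larger than RAM and run it as close to the storage as possible so caches don't dominate):

```
import os, shutil, time

SRC = "/path/to/bigfile.bin"           # placeholder: existing 20 GB+ file
DST = "/target/datastore/bigfile.bin"  # placeholder: destination on the array under test

size_mb = os.path.getsize(SRC) / (1024 * 1024)
start = time.time()
shutil.copyfile(SRC, DST)
elapsed = time.time() - start

print(f"Copied {size_mb:.0f} MB in {elapsed:.1f} s -> {size_mb / elapsed:.0f} MB/s")
```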

• brandon220

                I was looking at some specs on one of my machines and decided to look at the difference for a SSD and spinner. Pretty interesting... The IOPS difference is more than I would have guessed.
[attached screenshot: bench.PNG, SSD vs. HDD benchmark comparison]

• 1337 @brandon220

                  @brandon220 said in RAID5 SSD Performance Expectations:

                  I was looking at some specs on one of my machines and decided to look at the difference for a SSD and spinner. Pretty interesting... The IOPS difference is more than I would have guessed.

Yes, and if you'd put an enterprise NVMe SSD in the mix it would have been crazy. Expect 2,000-3,000 MB/s and 200,000-600,000 IOPS, for a single drive.

• scottalanmiller @1337

@Pete-S said in RAID5 SSD Performance Expectations:

Yes, and if you'd put an enterprise NVMe SSD in the mix it would have been crazy. Expect 2,000-3,000 MB/s and 200,000-600,000 IOPS, for a single drive.

                    Yeah, even my desktop drive circa 2014 was getting 50K IOPS.

• 1337 @scottalanmiller

@scottalanmiller said in RAID5 SSD Performance Expectations:

Yeah, even my desktop drive circa 2014 was getting 50K IOPS.

Enterprise SATA SSDs are in the 70-90K IOPS range today, and I suspect it's the SATA interface holding them back.

Intel was pretty clear already a few years ago that they consider SATA and SAS SSDs to be legacy products. It's NVMe, in its different shapes and forms, that is the current technology of choice.

• scottalanmiller @1337

                        @Pete-S said in RAID5 SSD Performance Expectations:

                        Intel was pretty clear already a few years ago that they consider SATA and SAS SSDs to be legacy products.

                        Yeah, the interface and protocols are totally designed around the needs of spinning platters.

• 1337 @scottalanmiller

@scottalanmiller said in RAID5 SSD Performance Expectations:

Yeah, the interface and protocols are totally designed around the needs of spinning platters.

Yeah, and the latest figures on actual reliability in the field put enterprise SSDs way, way ahead of spinners.

I think we are getting close to the point where RAID doesn't make sense anymore: when a single drive has superior speed, superior reliability, and enough capacity compared to what would traditionally have called for a RAID array of HDDs.

A drive failure will become as odd a failure as a RAID controller, a motherboard, or a CPU failing. You'd just replace it and restore the entire thing from backup.

• jmoore @1337

                            @Pete-S said in RAID5 SSD Performance Expectations:

I think we are getting close to the point where RAID doesn't make sense anymore: when a single drive has superior speed, superior reliability, and enough capacity compared to what would traditionally have called for a RAID array of HDDs.

We have been there. I've got a RAID 0 in my work desktop for the last two years and it really makes no difference. These are SATA drives, not even NVMe.

• scottalanmiller @1337

                              @Pete-S said in RAID5 SSD Performance Expectations:

A drive failure will become as odd a failure as a RAID controller, a motherboard, or a CPU failing. You'd just replace it and restore the entire thing from backup.

I think drives already fail less than RAID controllers. From working in giant environments, the thing that fails more than mobos or CPUs is RAM. That's the worst one, as it does the most damage and is hard to mitigate.

The difference, though, is that mobos, controllers, and PSUs are stateless to the system, but drives are stateful. So their failure has a different type of impact, regardless of frequency.

• 1337 @scottalanmiller

@scottalanmiller said in RAID5 SSD Performance Expectations:

I think drives already fail less than RAID controllers. From working in giant environments, the thing that fails more than mobos or CPUs is RAM. That's the worst one, as it does the most damage and is hard to mitigate.

The difference, though, is that mobos, controllers, and PSUs are stateless to the system, but drives are stateful. So their failure has a different type of impact, regardless of frequency.

Well, the statefulness of the drives is not something we can count on fully, hence the saying "RAID is not backup".

What I'm proposing is that when it becomes very unlikely that a drive will fail, we could rethink our strategy and go for single drives instead of RAID arrays. In the very unlikely event that a failure did occur, we'd restore from backup, which we are prepared to do anyway.

With HDDs the failure rate is too high, but with enterprise SSDs it's starting to get into the "will not fail" category.

• 1337 @1337

@Pete-S said in RAID5 SSD Performance Expectations:

With HDDs the failure rate is too high, but with enterprise SSDs it's starting to get into the "will not fail" category.

As an example, assume we have 4 servers, each with a RAID 10 array of 4 x 2TB HDDs. The annual failure rate of HDDs is a few percent, say 3% for argument's sake. With 16 drives in total, that works out to roughly half a drive failure expected per year, or about a 40% chance that at least one drive fails in any given year. So over the lifespan of the servers it's very likely that we will see one or more drive failures.

Now assume the same 4 servers with a single enterprise 4TB NVMe drive in each. The annual failure rate is 0.4% (an actual published figure from a few years back). With 4 drives in total, every year there is less than a 2% chance that any drive will fail. So over the lifespan of the servers it's very unlikely that we will ever see a drive failure at all. And if it does happen anyway, we restore from backup instead of rebuilding an array.
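
A quick sketch of the failure-probability arithmetic above (the AFR figures are the assumptions from this post, and independent failures are assumed):

```
# Chance that at least one of n drives fails in a year, given a per-drive
# annual failure rate (AFR) and assuming independent failures.
def p_any_failure(afr, drives):
    return 1 - (1 - afr) ** drives

# 4 servers x 4 HDDs in RAID 10, ~3% AFR per drive
print(f"16 HDDs @ 3.0% AFR: {p_any_failure(0.03, 16):.1%} chance of a failure per year")

# 4 servers x 1 enterprise NVMe drive, ~0.4% AFR per drive
print(f" 4 NVMe @ 0.4% AFR: {p_any_failure(0.004, 4):.1%} chance of a failure per year")
```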

• biggen

@Pete-S said in RAID5 SSD Performance Expectations:

As an example, assume we have 4 servers, each with a RAID 10 array of 4 x 2TB HDDs. The annual failure rate of HDDs is a few percent, say 3% for argument's sake. With 16 drives in total, that works out to roughly half a drive failure expected per year, or about a 40% chance that at least one drive fails in any given year. So over the lifespan of the servers it's very likely that we will see one or more drive failures.

Now assume the same 4 servers with a single enterprise 4TB NVMe drive in each. The annual failure rate is 0.4% (an actual published figure from a few years back). With 4 drives in total, every year there is less than a 2% chance that any drive will fail. So over the lifespan of the servers it's very unlikely that we will ever see a drive failure at all. And if it does happen anyway, we restore from backup instead of rebuilding an array.

                                    As long as you can justify the downtime in the event that a single drive failure takes an entire server down (albeit with a low statistical chance).

If that isn't a concern, there's no use running RAID anyway.

• 1337 @biggen

@biggen said in RAID5 SSD Performance Expectations:

As long as you can justify the downtime in the event that a single drive failure takes an entire server down (albeit with a low statistical chance).

If that isn't a concern, there's no use running RAID anyway.

                                      That makes sense. But regardless of RAID or not, there are always things that can take the entire server down, for instance a motherboard failure. So that is something that is always there.

I think you can take probability x downtime to get the expected downtime, and multiply that by the cost per hour if you want to put it in $$$.

So if something is 2% likely to happen in a year and causes 10 hours of downtime, you get 0.2 hours (12 minutes) of expected downtime per year. If that downtime costs $10K per hour, then it's $2K.

If that downtime is unacceptable, you need more servers or more reliable servers. 12 minutes of downtime per year is about 99.998% availability; 10 hours of downtime per year is about 99.9%.
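
The same expected-downtime math as a small sketch (the 2% probability, 10-hour outage, and $10K/hour figures are the illustrative assumptions from the post above):

```
HOURS_PER_YEAR = 8760

def expected_downtime(p_per_year, hours_down):
    # Expected hours of downtime per year for one failure scenario
    return p_per_year * hours_down

def availability(downtime_hours_per_year):
    return 1 - downtime_hours_per_year / HOURS_PER_YEAR

dt = expected_downtime(0.02, 10)   # 2% yearly chance of a 10-hour outage
print(f"Expected downtime: {dt:.1f} h/year ({dt * 60:.0f} minutes)")
print(f"Expected cost at $10K/hour: ${dt * 10_000:,.0f}")
print(f"Availability with 12 min/year down: {availability(0.2):.4%}")
print(f"Availability with 10 h/year down:   {availability(10):.4%}")
```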

• scottalanmiller @biggen

                                        @biggen said in RAID5 SSD Performance Expectations:

                                        As long as you can justify the downtime in the event that a single drive failure takes an entire server down (albeit with a low statistical chance).

In business it is rare, but possible, that it is the downtime that matters. Usually it's the data loss. If losing a few hours of data would cripple you to the tune of millions of dollars, for example, then you do things to protect against the data loss "since backup", that is, the data created after the last backup ran.

• scottalanmiller @1337

                                          @Pete-S said in RAID5 SSD Performance Expectations:

                                          That makes sense. But regardless of RAID or not, there are always things that can take the entire server down, for instance a motherboard failure. So that is something that is always there.

Hence my point about controller failure rates. In our giant environment on Wall St., RAID controller failures were the top cause of downtime, then RAM, then mobos. PSUs and drives failed more often, but were hot swap and almost never turned into downtime.

• zachary715

Quick update: I changed the RAID cache policy on Server 2 (the one with the SSDs) from Write Through to Write Back, and from No Read Ahead to Read Ahead. This appears to have made a drastic improvement, as 55GB Windows VM live vMotions to Server 2 now complete in about 1 1/2 minutes vs 4 minutes previously, and the network monitor is showing performance on par with what I was seeing on Server 3. Now on to getting all 3 servers in direct connect mode for vMotion and backups over 10Gb/s. Thanks.
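
For context on what those vMotion times work out to (the 55 GB size and the before/after times are the figures from this post; this assumes the whole 55 GB actually crosses the wire, which vMotion compression and dirty-page retransfers can change somewhat):

```
def transfer_rate(size_gb, minutes):
    # Returns (MB/s, approximate Gbit/s) for moving size_gb in the given time
    mb_per_s = size_gb * 1024 / (minutes * 60)
    return mb_per_s, mb_per_s * 8 / 1000

for label, minutes in [("before (Write Through, ~4 min)", 4.0),
                       ("after  (Write Back, ~1.5 min)", 1.5)]:
    mbps, gbps = transfer_rate(55, minutes)
    print(f"{label}: ~{mbps:.0f} MB/s (~{gbps:.1f} Gbit/s)")
```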
