Open Storage Solutions at SpiceWorld 2010
- 
 Another one from Viddler that needed to be rescued. This is one of my favourites: my presentation on understanding storage fundamentals at SpiceWorld 2010. This is the famous "cardboard boxes and duct tape" talk that got a lot of attention. 
- 
 Say, that's a really nice unavailable video you have there. 
- 
 I cannot view it. 
- 
 Video loads for me. I'm still waiting for my session to be posted. Might reach out to see if I can get a copy. 
- 
 @art_of_shred said: "Say, that's a really nice unavailable video you have there." It's there. Google was still processing. 
- 
 @ajstringham said: "Video loads for me. I'm still waiting for my session to be posted. Might reach out to see if I can get a copy." They lost most of them. None of NTG's made it, last I knew. They were all posted long ago. 
- 
 Transcript: So my name is Scott Alan Miller, and that's all I'll tell you about me, because honestly no one actually cares, and if I talk about myself I'm boring and narcissistic — and one of the rules of presentations is don't be boring and don't be narcissistic. So this talk is on open storage; I think it's called "Open Storage Landscapes," and I think no one's quite sure what I'm going to be talking about. What I want to do here is not look at specific technology implementations or real specifics. My goal today is to have you come away with a different vision, a different appreciation for storage and the storage landscape, exactly what your options are, and how to approach thinking about storage in general, because storage is a major problem for all companies today. Whether you're a tiny shop or a giant enterprise, storage is expensive and storage is critical. In reality, nothing is more important in your business than your storage: if you lose everything from a processing perspective and save your storage, you're recoverable, but if you lose all your storage and keep your processing, you're screwed.
So storage is very, very important, but we tend to look at it in very strange ways, kind of forgetting how storage actually works under the hood. So I want to start off by going through some real basics of storage and try to present them in a slightly interesting way, and then I want to talk about the concept of open storage, how that applies to your businesses, some reference implementations of how that might be something you would want to do, and when that might make sense. Because my format is very loose — I don't work with slides, because they're distracting, and I can scale the speed very easily — I'm going to be a lot more open to people asking questions as we go. Try to keep it related to where I am, but if you want to jump in, we'll tackle it as we go, because I could very easily miss something that's valuable for everybody.
All right, so in the beginning, when computers first began, the very first thing we needed was — well, we have this computer, so we need to store stuff. So the very first thing we got was the hard drive. (I realize this is not a real hard drive.) The hard drive was simple: it has one spindle, it stores stuff, everybody understood it. Very, very easy, nothing complex. It's a physical device, and we attach it to a computer. So this is our server — we refer to servers as boxes, so we can demonstrate them as boxes.

So what we did with our very first server — the first server ever — was take this hard drive, the first hard drive ever, and literally take a cable that actually looked a lot like this and stick it inside. We now have one drive directly attached inside the server. The connection technology was very, very simple: what we know today as SCSI. Yes, there was something before it, but basically SCSI was the beginning of real storage. It is a generally simplistic technology, but it does have the ability to address more than one drive. So we started putting more drives in servers, and once we had more than one drive in a server, we said, well, this is a problem. We looked at that drive and said: this is a point of failure — if we lose this drive, we have nothing. Obviously you can have backups (hopefully you're following Daniel's advice and have made good backups), and you can get a new hard drive and keep going, but we're also concerned about not just recovery, which is where backup comes into play, but continuity of business, which is where redundancy comes into play. So in order to get redundancy we said, well, what if we took two drives — which we already had, we had multiple drives — and built an abstraction layer to put them together? Half the entertainment here is just watching whether the duct tape works.
So we take these hard drives and we literally put them together in such a way that, to an outside observer, it's one drive — it's bigger, but it's the same thing. We call this RAID; in this case it's RAID 1, so the drives in my example are mirrored to each other, but that part isn't important. The key element here is that when we use RAID, we are taking multiple independent devices and attaching them together in such a way that we can then attach them to the computer — and there's a little bit more than duct tape involved here; imagine this duct tape includes a RAID controller. That RAID controller provides the abstraction layer, so your computer, when it sees this block of storage, is still connected using the same technology the single drive was connected with, and is connected in the same way. When the operating system looks at your RAID array, it is looking at one drive: it has no idea how many physical devices are in play, it believes there is a single logical drive. This was the first step of making enterprise storage, and it was a pretty good step — once we got to this point we could really do business with servers.
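
As a purely illustrative aside, here is a minimal Python sketch (a toy, not how any real controller is implemented) of what that duct-tape-plus-RAID-controller abstraction is doing: the operating system talks to one logical drive, and the mirroring across physical disks happens underneath it.

```python
# Toy RAID 1: two physical disks presented to the OS as one logical drive.

class Disk:
    """A pretend physical disk addressed by sector number (LBA)."""
    def __init__(self):
        self.sectors = {}           # lba -> 512-byte payload
        self.failed = False

    def write(self, lba, data):
        if self.failed:
            raise IOError("disk failed")
        self.sectors[lba] = data

    def read(self, lba):
        if self.failed:
            raise IOError("disk failed")
        return self.sectors.get(lba, b"\x00" * 512)


class Raid1:
    """The abstraction layer: the OS only ever talks to this object."""
    def __init__(self, members):
        self.members = members

    def write(self, lba, data):
        for disk in self.members:   # mirror every write to every member
            if not disk.failed:
                disk.write(lba, data)

    def read(self, lba):
        for disk in self.members:   # any healthy member can answer a read
            if not disk.failed:
                return disk.read(lba)
        raise IOError("all members failed")


array = Raid1([Disk(), Disk()])            # two spindles, one logical drive
array.write(0, b"hello".ljust(512, b"\x00"))
array.members[0].failed = True             # lose a disk...
assert array.read(0).startswith(b"hello")  # ...and the logical drive still answers
```
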
So what happened from here was we started getting a lot of these. (Oh, that one fell out — don't buy from that manufacturer.) And we said, well, we don't have a lot of space in these boxes: they get hot, they fill up with other things, and we want a lot more than two drives. I only have two drives in my example, but what if we had eight drives? What if we wanted to do that? Well, why not put these same drives outside the box? (That's not where the expression came from.) So we came up with the ability to literally take SCSI — same technology, nothing changed, we just changed the cable so it's a little more resilient to interference — and attach drives outside of the box. This is where the term direct attached storage came from, to denote what we're doing. As far as the server is concerned, they're still drives inside the computer; nothing has changed from the server's perspective, they're just literally outside the case rather than inside it. You can do this with a running server: pop off the top, yank out a hard drive, and as far as the cable will reach you can just dangle your hard drive — done in a physically managed way, of course, so that you're not catching fire or shorting something out. This gave us the next step of storage. We were able to do more advanced things, but nothing has changed as far as the server is concerned: it has one drive, that's it, and everything operates the same.

Then at some point we said, well, this SCSI cable works out really well — and the connection could be anything; early on it was SCSI, but today there are lots of options: we've got SAS, we've got SATA, we've got USB, we have FireWire, we have Fibre Channel. Lots of options, but they all do almost exactly the same thing, which is very little.
With the more advanced of these technologies, we started saying, well, instead of simply attaching the drives to the server in a way that is really, truly physically direct attached — which isn't always useful — we might want to have our drives farther away. If we're going to do that, we're going to need slightly more advanced communication. SCSI traditionally is — I love this example — basically like Ethernet, but not the switched Ethernet we use today: it's like old Ethernet with a long cable and vampire taps, where everything communicates at the same time, interrupting each other. So SCSI was kind of inefficient and it really wasn't going to scale, and we moved to different technologies for large implementations.

The first thing that really came into play was Fibre Channel. The thing that makes Fibre Channel different from SCSI is that SCSI is really not a layer 2 networking connection — it doesn't have what we think of as Ethernet-style, machine-level addressing. With Fibre Channel we do have machine-level addressing, so it's a full layer 2 network protocol (USB and FireWire are as well, just not on the same scale). Once we had that, we said, oh, now we can switch these connections. I don't have all the cabling here that would be necessary to demonstrate this, but what's interesting is that suddenly we could take all these drives, just like we had before, sit them in a box, put a bunch of servers over here, put a switch in the middle, connect the drives, connect the servers, and using the switch decide which ones connect to which. What we have is a rudimentary SAN. The most simple devices you can get hold of — like the Netgear SC101 — are literally two drives in a little plastic container with the smallest chip you've ever seen, and the only thing it does is act as a very simplistic drive controller and put the storage protocol onto the wire. That's it: no logic, no nothing. It is basically a network card. You can even buy small network cards that attach directly to a drive — I'm sure they exist for SAS, but I've seen them for old-school parallel ATA — plug them in, and you literally have a network connection on the drive. Nothing else: no RAID, no management, no anything. The computer still sees the same block device it saw when the drive was plugged in internally, plugged in externally, or plugged in through a Fibre Channel switch. Nothing has changed as far as the computer is concerned.
So what we started doing then — and this was kind of an experimental stage, one that didn't last very long — was say, well, now that these drives are out here, what if we took all these drives and put them into another box, another server? Instead of being rudimentary like the SC101, we've got a big box (I specifically got a larger box) so that we can hold lots of drives. This box might end up being several racks in size; we can hold hundreds of drives, thousands of drives, in a single box. And now we have all these abstractions that already exist: we have the RAID abstraction that allows this machine (remember, this is a server) to see those drives as a single drive, and we have networking technologies like Fibre Channel that allow it to share out those drives as block devices to any device that wants to read a block device. These act the same as any other block device you're going to see. We think of storage as its own thing, but storage, drives, networking — they're all block devices to the system, so they all act exactly the same; it's just the way we look at them. So we get our — I don't have the switch here, so just imagine there's a switch — okay.

So we have our Fibre Channel connection, and we attach it from this server, which has all the drives, over to this server, which is where we want the data. (Yes — yes, correct.) This is a server that we have purposed to storage, and this is a server that's doing who knows what — the normal stuff — probably with no drives, or it could have its own drives plus these; you can mix and match. So now we have dedicated devices, but as far as this server is concerned, it still sees a SCSI connection, a Fibre Channel connection, whatever. We're still dealing with the same technology. This is what's interesting: we start to think weird things are happening, but as far as the server is concerned nothing has happened — it's still just a drive connected directly.

Then after this stage we said, okay, we're going to introduce another abstraction layer to the network, and we're going to pump SCSI over TCP/IP. So we introduced iSCSI — it and Fibre Channel carry basically the same protocol, very close. iSCSI is simply taking the original SCSI protocol and encapsulating it in TCP/IP: SCSI leaves this box, goes into an encapsulation layer that allows SCSI to be transported over TCP/IP, then gets unwrapped on this box, and it's back to SCSI. It's still a device-level connection; the server is still seeing a single SCSI drive connected. Nothing has changed.
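
To make the encapsulation idea concrete, here is a deliberately simplified Python sketch. This is not the real iSCSI PDU layout (a real PDU has a 48-byte basic header segment, digests, and so on); it only shows the point that the SCSI command rides through TCP/IP untouched and comes out the other side as plain SCSI.

```python
import struct

def encapsulate(scsi_cdb: bytes) -> bytes:
    """Initiator side: wrap an opaque SCSI command for a TCP stream (toy framing)."""
    return struct.pack("!I", len(scsi_cdb)) + scsi_cdb

def decapsulate(frame: bytes) -> bytes:
    """Target side: strip the framing; what remains is the untouched SCSI command."""
    (length,) = struct.unpack("!I", frame[:4])
    return frame[4:4 + length]

# A READ(10)-style command descriptor block is just bytes to the transport layer.
cdb = bytes([0x28, 0, 0, 0, 0, 0x10, 0, 0, 0x08, 0])
assert decapsulate(encapsulate(cdb)) == cdb   # the drive-level protocol never changed
```
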
That really brings us to modern storage as far as block-level storage is concerned — I'm trying to avoid the obvious words everyone knows I'm talking about. So this is a server, a normal server, which is very important: it's a normal storage server, it's a normal computing device, and everything is viewed as block devices. If we take this server and apply an interface to it that makes it very easy to manage, makes it no longer look like a normal server but like a dedicated storage server, take away the options of processing on it, and add in some ease-of-management features, we call it appliance-ized and we refer to it as a SAN. But there is no difference between a SAN and a storage server, except that it has an interface that makes it work not like a regular server.

So this is the really interesting point: we have all this processing power here. When we just let this server share out the block-level devices without touching them, this machine is effectively dumb. It doesn't know what's on the disks, it doesn't know how to use them, it doesn't know how to read them. This could be a Windows server and those drives in it could be loaded with Linux file systems or Solaris file systems — it can't read them, it can't do anything with them, but it can share them out, because it's purely a block-level device. If we then add more power to this box — more power from a software perspective — and say, okay, we're going to make this box able to read these disks and understand them, then we start getting even more power, because we can start doing things here without sending everything over there first. We can add new protocols onto this box that give us not block-level sharing (we can still have that, it's all still going on), but something new, because this box can now read these drives — this is a new layer. We add a protocol that allows it to share at the file system layer instead of the block layer, and to do this, obviously, the other machine has to understand it too; we need to be able to put it onto the network. So specific file system abstractions were made for this, and we know them today as NFS, CIFS, and AFS — these are the popular protocols for it.

What makes them different from block level is that at the file system layer, this device can determine whether there have been changes to a file and only send over the changes. It can make changes itself, it can do all kinds of things, and — this is very, very important — it can lock a single file. That means that if this server is contacting this box and wants to write to a file, this server can lock that file and say no one else can write to it, which means that for the first time we have a means of having this connection go to more than one server. You could try that with block-level devices: back in the early days, when we literally had hard drives inside servers, people would actually take the SCSI cable, hook one end into one server, the other end into another server, and dangle a hard drive off the middle. That would, obviously, cause disaster, because you have two drive controllers. It's like having two steering wheels in the car and two people driving without being able to see each other or talk: one person wants to go this way, one person wants to go that way; one person is hitting the gas, the other is hitting the brake; a deer runs out in the road and each one thinks a different direction is the way to go. You're going to have a disaster, and that's what servers do if two servers are talking to a single hard drive without being aware of each other. There are specific file systems that were designed to handle that, but each server had to play nice — there's no gatekeeper, so any server that decided to mess with the data was going to. It could make changes the other one didn't know about, delete files the other one tried to protect, read files the other one said it shouldn't be allowed to read. There's no way to control those things without a gatekeeper. When we're dealing with file-system-level sharing, we have that gatekeeper: we have the security at this level, where we control it, rather than an open connection where someone can do anything they want with whatever they can get access to.
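
As a small, hypothetical illustration of that gatekeeper idea: on a Unix-like system a file-serving process can use advisory locks so only one writer touches a file at a time. Real NFS and SMB servers use their own protocol-level locking (NLM/NFSv4 locks, oplocks/leases); this Python sketch just shows the concept of refusing the second writer, which a raw shared block device has no way to do.

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def exclusive_write(path):
    """Hold an exclusive advisory lock for the duration of a write (POSIX only)."""
    f = open(path, "a+b")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)   # raises if another writer holds it
        yield f
    finally:
        fcntl.flock(f, fcntl.LOCK_UN)
        f.close()

with exclusive_write("/tmp/shared-report.dat") as f:
    f.write(b"only one client gets to write at a time\n")
```
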
So at this point most people know that a device doing this is called a file server. And if we then — in the same manner as taking the storage server, adding an abstraction layer so it looks like a non-standard server, and calling it a SAN — do the same thing with file-level storage, we call it a NAS. At no point is this not a traditional file server; it is simply appliance-ized so that it looks like a different device and takes away some of the options of doing general things.

The reason I wanted to run through all that is that quite often, when dealing with SAN and NAS, we actually think of the world in terms of SAN and NAS: we say, well, we have storage, so should I have a SAN or a NAS? In reality, those aren't really the questions we should be asking. What we should be asking is: do we need block-level storage, or do we need file-level storage? That's the big question. If you're running a database and it needs to interface directly with the devices because it does something really complex — which is basically having its own file system and ignoring the actual one — then you need block level. For example, IBM DB2 talks to raw devices, raw disks, because it has its own file system that exists only for the database and has no other purpose, so it has to have block-level access to do anything it wants right down at the drive head. But if you're dealing with normal file sharing — Word documents, Excel documents, all the stuff that users pile up everywhere — yes, you can do that at the block level and attach it to a single device, and yes, you can go get really expensive commercial file systems that allow you to share that out, like OCFS from Oracle and GFS2 from Red Hat, but then you're getting into running big UNIX boxes to do those things, so it's not really effective for that. Whereas if you're running CIFS or NFS, you can connect all kinds of desktops to it and do all the things you already know how to do. So choosing block storage or file-system-level storage is really the question, and at the end of the day you have a file server, one way or another, doing that work.
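
Here is a deliberately simplified sketch of that block-versus-file distinction (hypothetical classes, not any real API): a block target only understands numbered sectors and cannot tell whose data it is holding, while a file-level server deals in named files with metadata, so it can answer questions like "what changed?" and act as a gatekeeper.

```python
import time

class BlockTarget:
    """What a SAN exports: numbered sectors. It cannot interpret or protect
    the file system a client builds on top of them."""
    def __init__(self, sector_size=512):
        self.sectors = {}
        self.sector_size = sector_size

    def write(self, lba, payload):
        self.sectors[lba] = payload

    def read(self, lba):
        return self.sectors.get(lba, b"\x00" * self.sector_size)


class FileServer:
    """What a NAS exports: named files plus metadata, so it can report only
    the changes and mediate access on a per-file basis."""
    def __init__(self):
        self.files = {}                       # name -> (mtime, content)

    def write(self, name, content):
        self.files[name] = (time.time(), content)

    def read(self, name):
        return self.files.get(name, (None, b""))[1]

    def changed_since(self, timestamp):
        return [n for n, (mtime, _) in self.files.items() if mtime > timestamp]


nas = FileServer()
nas.write("budget.xlsx", b"v1")
print(nas.changed_since(0))   # ['budget.xlsx'] -- only this needs to cross the wire
```
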
So at that point I'm just going to let people ask questions now — there hasn't been a prompt, so I'm not sure if everyone is falling asleep or has questions.

[Audience question.] Yep — yes, okay.
Actually, what's funny is that typically a NAS costs about 150 to 200 dollars and a SAN starts at under a hundred — SANs start at close to 30 bucks. It's not what people assume: they think of the SAN as the big one, but the SAN can actually be the little one. The names are actually awful — network attached storage and storage area network. I hate these terms, partly because they're confusing and partly because they're wrong. "Storage area network" is really just a term for block-level storage, and "network attached storage" is really just the name for file-system-level storage. When you're doing block level, you attach it as a device; when you're doing file-system level, you have a special virtual file system driver, like CIFS or NFS, that allows you to put it onto the network and share it over the normal network.

The reason the word SAN exists, in theory, is that when they first did this with Fibre Channel, Fibre Channel was its own thing — you didn't communicate over Fibre Channel for other tasks — so it was a storage area network. But then later (and actually at the time as well), people would take NAS devices — file servers, because we didn't call them NAS back then — and put them on a dedicated network with its own switch, connected to the servers on dedicated NICs just for storage. Well, obviously that's a storage area network using NAS, so the terms can overlap even when you're doing dedicated networks. That's why I hate the terms. But when we say SAN, and we sometimes say SAN protocols, we mean Fibre Channel, iSCSI, AoE — ATA over Ethernet — those things, and we use NAS to refer to CIFS, NFS, and AFS. Okay, cool. Anything else before I move on? In the back?
Yes — a SAN is like a LAN in that it is a network, yes. Yep. And I worked at a company that actually had what we called the DBAN: a database area network. It was a dedicated network, in the same way as a storage area network, except it was dedicated to database communications, so all the databases were on it — Ethernet switches, Ethernet servers — and it basically did the same thing, really well. So, there was a question up here?

iSCSI is not, as far as I'm aware, any more noisy. I'm assuming they were thinking it's broadcasting; I'm not aware of there being any significant broadcast traffic, unless they weren't running switched Ethernet, which would be the bigger problem. No, it's really TCP and it's non-broadcast, so it's point-to-point, the same as any other protocol of that nature. Yeah.
It could be, actually — that's a really good point. For a customer like that, regardless of the noisiness of the protocol: when you have a traditional SAN (Fibre Channel, we'll call that traditional), your network is obviously completely dedicated to it. What's highly recommended is that if you're going to do iSCSI, or any other SAN that leverages commodity networking, as we like to say — so Ethernet — you want dedicated hardware and a dedicated network for it even so. It's not necessarily noisy, but it is traditionally really high volume; you wouldn't bother with it if the volume weren't significant. So you want switches that aren't doing anything but that, in the same way you would have done with Fibre Channel. Just because you switched from Fibre Channel to iSCSI doesn't mean you should leverage your existing traditional LAN to carry it, because you still have the same volume concern. Put it on a dedicated switch and treat it in the same manner — which is nice, because when you move from Fibre Channel to iSCSI you can afford to go faster, you can afford to have more redundancy, and you can get better switches cheaper.

Quite often — and this is really common — the more important something is (and your storage area network is the most important thing in your business), the more people jump straight to "we need layer 3 switches, we need lots and lots of management, we need the biggest, most expensive things we can find." The reality is you probably don't care that much about monitoring your storage network. You might, but most of the time you're actually better served getting a cheaper unmanaged switch, because what you care about is latency and throughput, and the less management there is, the fewer layers there are, the less that's going on. You don't want to be VLANing on the switch that's handling your iSCSI; you want it to be its own — not a virtual LAN, a LAN. If you need another LAN, get another switch and keep it physically segmented, because you don't want that overhead on your storage: your storage latency matters to everything in your company. So you want it to be as fast as possible, and fast is actually cheap. That's the wonderful thing about speed on Ethernet: not with consumer devices, but within a given product range, generally the cheaper they are, the faster they are. Anyone else before I keep going?
You're talking about where I said there are file systems that allow you to do that — the commercial versions of them? Okay. I believe they're generally referred to simply as shared-access file systems; maybe someone actually knows the generic term for that family. I do know that Red Hat and Oracle are the key players there, and I believe Veritas, with VxFS, does that as well, but I'm not a hundred percent sure — I'm definitely not an expert on VxFS. GFS2 is Red Hat's product, so if you look it up, go to Red Hat and look at GFS2; I believe it's actually available for free from them. But these are dedicated file systems, so anything that attaches to that SAN has to have that file system driver, and you have to look for a vendor that's going to support whatever devices you're using. But yep, that's the big one. All right, okay, so —
A lot of people in the community are familiar with the SAM-SD. The SAM-SD, which I did not name, is not an actual product, but it is what I like to call a reference implementation of open storage. The reason we came up with it is that in a lot of conversations with companies and with people in the community, people say, well, you want to put in a SAN, right? So they go to a vendor — everybody's got a SAN these days — and say, I need block-level storage, and the vendor comes out with really, really expensive storage products. If you're a giant Fortune 500, that probably makes sense, because your storage is huge. When I worked at the bank, our storage was in the petabytes: they have entire data centers dedicated to storage, and we had an OC-192 running to other facilities to carry the Fibre Channel traffic, so we could lose an entire data center and our storage was still there, replicated to another data center over the OC-192. Unbelievable amounts of storage — there's no way you're going to build that at home yourself. That's where players like EMC CLARiiON and Hitachi come in and build entire rooms, and that makes sense. But when you're looking at more reasonable amounts of storage, you start getting into the space where you can use traditional technologies completely, including the chassis, and that's really what matters here.
So I'm going to tell a little story on this — this is kind of the backstory of how the SAM-SD came into being. The company I work for is a consultant for a major Fortune 10 (it's hard to be a minor Fortune 10), and they were doing a global commodity grid. You can read about it online; it's well known. It was over 10,000 nodes, and we push an unbelievable amount of computing through it; lots of different lines of business use it. A lot of people like to call it cloud — it is not, it's high-performance grid computing. It's very related to cloud, but it's not the same thing: it's application-layer virtualization, not operating-system-layer virtualization, and that's where the two differ.

We run several dozen, maybe a few score, applications on this ten-thousand-node grid. To back that grid, we don't have any storage on the nodes except for the operating system — it just makes things easy; they boot locally, but all their data comes in from somewhere else and gets saved somewhere else, and we only cache locally. On this project we were working with — we won't name names — a very, very major storage appliance vendor. We had the second largest product they made; it cost really close to three hundred thousand dollars per unit. We worked with them, we brought up a new part of our grid, and the load demand on the grid turned out to be higher than this device could supply — not necessarily from a throughput standpoint, but from an IOPS standpoint; it just couldn't handle it with the spindles it had. So we approached some vendors, and at the time another vendor in the server space — I guess I'll name it: Sun — had brought out what they called Thumper, a 48-drive 4U server: two processors, 48 drives, 4U chassis. It's a traditional chassis; you put it in your data center and it looks like a regular 4U server, nothing weird, it just has a lot of drive bays. They were pushing this as sort of a retro idea: let's go back to old storage and stop thinking of storage as appliances. This is actually where the term open storage came from. Sun said: it is time to rethink storage. The storage devices everyone's been buying — SAN and NAS — are just servers that have lots and lots of drives attached to them, so why not just buy a normal server and use that? Because when we make normal servers, we make lots of them, and the price goes way, way down.
When you buy SAN- and NAS-labeled devices, you tend to get products that are not sold in the same quantities as commodity servers, and sometimes proprietary software is used for some of the cool features, which drives the price through the roof. They are also considered non-commodity, so the margins are much higher. The margins on a traditional server from the major players — HP, Dell, whoever — are low: Dell does not make a thousand dollars off every server it sells; they make maybe twenty dollars by the time all the discounts are done. Their margins are low, so they're not ripping you off on a server; it costs them a lot to deliver it to you. You want to be buying those devices if you can help it, because that's where your value is coming from. When you go to appliance-ized products, you generally pay a lot for the name.

So Sun actually came in and worked with us. They knew they weren't getting this account, but they worked with us anyway because they hated the other vendor we were comparing against. We said to them: we really feel that this device we have is very, very expensive and doesn't seem to be doing as much as we could do with a regular file server. And Sun said: absolutely — a regular file server gives you all these advantages. It's commodity, you can tune the operating system, you can pick the operating system, you can pick every component, and you can do it much cheaper. They actually flew in the guy who invented ZFS to talk to us about it; it was awesome. So we went to the client and said we would like to do an engineering study, and we want the storage vendor involved. They said okay, and they ordered the largest product that vendor made — the largest NAS device on the market a couple of years ago. It was a half million dollars, and it was installed and tuned by the storage vendor's own engineers. They had a lot on the line, because we weren't buying one; we were looking to buy dozens. So they brought in a lot of resources to make sure this was going to beat anything we could do. We took two people from the engineering team and a couple of hours.
We took a commodity server that we had — it was a large server at the time, a four-way Opteron box, but it would be considered a small server today; it's probably about a third of the processing capacity of what you would get for $5,000 now. Still a decent server, pretty impressive at the time, but nothing special. We loaded Red Hat Linux on it with no tuning, used normal storage, nothing special, and set it up with NFS, which is exactly how they were connecting to the other box. Before we ran it, we projected what was going to happen, because we knew there were threading issues on the processing side of the storage vendor's product: it was not an open box, and they could not update their operating system to the latest kernel, which they needed to do, because they weren't making their own operating system — they were getting it from another vendor — and they didn't have time to rework it. We had the ability to run the latest Red Hat, which had great threading support, which was needed to push the IOPS. And when we ran it, not only did our at-the-time $20,000 solution — which I expect you could put together for about two to three thousand dollars today — outperform a half-million-dollar device tuned by the vendor's own engineers, but instead of flatlining, we actually got full performance curves. We have no idea what the capacity of the open, scratch-built box was, because the grid could not request enough IOPS to stress it, while the half-million-dollar device not only plateaued, but when it sat on that plateau for very long it actually shut down.

So the potential savings here were not just millions of dollars of purchasing; it was that this product met the need while that product did not. It was also easier for us to manage, because we didn't have to have dedicated storage people: we used the skill set we already had, we already had the engineers for this, and we could just manage it along with all the other servers — it looks exactly like all the other servers. And for those who wonder: no, they didn't go with that; they went with the expensive solution anyway. Welcome to the Fortune 10.
So what that prompted, later, when NTG started looking at storage options and we started having a lot of conversations in the community — how do you do storage, how do you make it cost-effective, what do you do when you have all these needs, need flexibility, and can't afford these other products — was that we looked at the product market and said: wow, you can go to any major server vendor, ones that are here, ones that aren't, anyone who's a top-tier vendor, and they have chassis that are very affordable and have a lot of space for disks. Some have more than others, some have different price points, but they're all relatively affordable, powerful, stable, and manageable, and they fit into your infrastructure just like everything else. You can go get third-party disks for them — some support that a little better than others, but most have completely open support for any disk you want to put in. You can put lots of disks into them, you control their speed, you control their configuration, you control the models; if you have a specific drive vendor you're very comfortable with, you can pick them. You get all of that, and you're building systems for a few thousand dollars that not only might outperform a 30 or 40 or 100 thousand dollar commercial appliance-ized SAN or NAS device, but that you also have more control over.

And this is the most important thing with any computing device: remember, it's just a server — there's no magic. Everybody thinks, well, I'm going to get a SAN and I can let everything else fail, because the SAN won't fail. But the SAN is just a server like everything else. There are better ones and cheaper ones, but it's just a server, and it's always subject to the forklift risk: someone is going to drive the forklift into that one box, and that absolutely happens — that's from a real example. So when you cut the cost dramatically — when it was $30,000 it was one consideration, but now you can do the same thing for $5,000 — don't cut your budget by $25,000; cut your budget by $20,000 and get two of them, and now you can use them in the same way you would anything redundant. And that doesn't have to be on a scale of two; it could be on a scale of 50 — you were going to buy 25 or 50 commercial SANs, so buy 50 of these and build things. That's an option.

When you get really, really big, it starts to maybe not make sense: really large SANs have the capacity to hold lots more drives, and they're much more manageable on a truly massive scale. So there are price points and feature points where traditional SANs start to make a lot of sense, but they almost never do when you're in a capacity range that fits within a single traditional commodity server chassis — basically, if a normal server you can buy off the shelf from your preferred vendor covers your capacity. Unless you're working with some white-box builder, in which case stop and go get an enterprise vendor. If you're dealing with an enterprise vendor, go to them and get their price for the chassis that makes sense; it's almost always a 2U. I know Dell is here, and they've got a box that holds 24 2.5-inch drives in a 2U — pretty unbelievable. If 24 2.5-inch drives meets your needs, you've got all that storage, and it's potentially really fast.
Sure. Well, before I answer that exact question — this actually came up last night, almost exactly the same thing — when I talk about storage I often talk about Red Hat, because that's how we do storage. Which is not actually true: we do a little bit of Red Hat, but most of our storage is actually Solaris, because of its IO throughput. In either of those cases, though, you're dealing with an operating system that, chances are — whether that's 51 percent or 80 percent of you, I don't know — most people in this community are not proficient with. Unix isn't part of your infrastructure, it's not something you manage on a daily basis. If it is, it's definitely a consideration; if it's not, it doesn't matter, because Windows is an absolutely wonderful storage platform in the same way UNIX is. In this example we ran UNIX because that's what we did — we were doing UNIX administration. Windows makes some really powerful storage stuff: it does iSCSI, it does CIFS, it does NFS, it's all free, it's all included, you're not buying extra products, and their CIFS, if you're doing Active Directory integration, is by far the easiest to deal with, works the best, and is the most reliable.

But if you don't want to go with Windows for your storage and you want to go with someone like Red Hat, as an example, you have lots of options even if you don't have the in-house expertise. There are lots of MSPs who will do that; you can pretty much always find an MSP to do something for you if you go looking. And your storage devices really need relatively little monitoring. They need monitoring, but you're probably not concerned with capacity planning other than the amount of storage, and you can watch that — Spiceworks will monitor it and tell you how much is being used. As long as you're dealing with that kind of thing, in a normal small-business situation you're probably not worrying about CPU capacity or memory capacity; you've got more than enough in the smallest box. And there are companies like Red Hat: if you actually get Red Hat commercially, you get support from Red Hat themselves, or if you're getting SUSE, from Novell. They don't sell in the same way Windows does — Windows is based on a license, and the commercial Linux players are based on support contracts — so that support is there. Or, since I know a lot of people in here like Ubuntu: you can get the exact same thing for free, play with it, and when you're ready to go live you can contact Canonical and get commercial support directly from them as a primary vendor, or go to any number of MSPs who would provide that support, and of course you can get training. Anything else? Does that answer your question? Okay.
I do have opinions on them; I have run across them. I don't like their pricing structure — I feel they're hurting themselves in that way, and I think they should have a free version that is more scalable — but as a product based on OpenSolaris, if it fits into your price range (and it's not that expensive if you're looking at this range of gear), I think it's a really good product. I have not used it in a commercial capacity, so there may be some gotchas I'm not familiar with, but the underlying technology, OpenSolaris, is awesome: lots and lots of power, lots of flexibility, lots of options, and very easy to manage. And that's something I should mention — Nexenta is a NAS appliance operating system; I can't believe I forgot this whole category. So we have traditional servers as file servers — Windows, Red Hat, Solaris, whatever — where you're doing everything yourself. Then we have the full appliances: you can go to Netgear, Buffalo, Drobo, EMC, EqualLogic, HP — everybody has these. But there's also a middle ground, where you're using commodity hardware from anybody and then applying an operating system that is an appliance operating system. Nexenta is a great example of that, built on OpenSolaris. FreeNAS is the same type of thing, completely free, built on FreeBSD. And OpenFiler is the same thing, built on Conary-based rPath Linux, which unfortunately is a very unpopular version of Linux — it's not monitored by anything, and the volume management stuff is funky, so that's unfortunate. There is a fourth player whose name I can't remember; they're definitely the small tier in that space. LeftHand, now from HP, used to be one of those players, and when they got bought by HP they got combined with hardware, so they moved to that side instead of staying in the software space. For people who want the power of Linux but don't want to learn Linux, or want to look at BSD but don't know BSD, those solutions give you those operating systems — with their advantages and disadvantages — without having to know them. One actual caveat to mention: if you're going to work with OpenFiler, it's very powerful, and it has the best replication of any product along with all the big commercial ones — its replication is phenomenal — but there's no interface for it; you will need a senior Linux admin to do that. We are officially in the Q&A; I think we have five minutes, so I need five minutes of questions.
ReadyNAS? Well, full disclosure: the company I work for is a partner with Netgear, so we have to say we love ReadyNAS — but we do love ReadyNAS, definitely. My personal preference: if you're working in the space where you want an appliance NAS — you just want to buy it, you don't want to build it — ReadyNAS is a really, really great product. It's based on Linux. It does not have DRBD replication; we are pushing them for that, and that doesn't mean they'll do it, but we have pushed for other things and they are doing them. There are some caveats with ReadyNAS that I'm not allowed to tell you, so I'm not going to mention what the caveats are, but I can tell you — since I didn't tell you what they are — that they're going away in December. So ReadyNAS is a great product, we've priced it versus building a SAM-SD, and it's within about 10 percent on cost. And there is someone on my team who runs one — what was it? Yes — I'm sorry, Don.

I don't have experience with it, so I can't really compare it to anything; unfortunately I can't really answer that very well.
Uh-huh. If you're getting the 24-bay 2.5-inch box from Dell, chances are you're buying that unit because you want 15K drives; chances are that's why you selected that chassis — that's why that chassis exists. You don't have to, though. When you're choosing your RAID levels — everybody probably knows that I absolutely hate RAID 5 — the reality is that if you're in an archival situation, it's not a live system, it's backed up, all you want is for it to be online most of the time, and you're willing to take a little higher risk, RAID 5 can save you a lot of money. I would not use it for online systems, but I would use it for nearline — although a lot of small businesses don't do nearline storage. When it comes to actually selecting your spindles, it's really a question of price versus IOPS. If you go with SATA, you just have to have more of them, but they cost less, so that can be very beneficial, and typically you're going to get more storage while you do it. So you might say: here's the option for SAS at 15K, here's the option for SATA at 7.2K, and at the price point where it gives you the performance you need, the SATA option is likely to give you two or ten times the actual storage capacity — that might be a winner. But it also might be so many drives that they don't fit in the chassis you want, and then it's a loser. And as you have more drives, they are more likely to fail — twenty drives are more likely to fail than two — so there are risks there. Doing the performance calculation is really the only factor; there's no guaranteed answer, and a lot of commercial SANs and NASes are SATA-only, because they just add more of them.
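
As a rough back-of-the-envelope sketch of that price-versus-IOPS trade, in Python — the per-drive IOPS and capacity figures below are common rule-of-thumb assumptions, not vendor numbers, and RAID write penalties are ignored:

```python
# Illustrative rule-of-thumb figures only; real drives and workloads vary.
DRIVE_PROFILES = {
    "7.2K SATA": {"iops": 80,  "capacity_tb": 2.0},
    "15K SAS":   {"iops": 180, "capacity_tb": 0.6},
}

def spindles_needed(target_iops, per_drive_iops):
    """Minimum drive count to reach a random-IOPS target (ceiling division)."""
    return -(-target_iops // per_drive_iops)

target = 3000   # hypothetical workload requirement
for name, d in DRIVE_PROFILES.items():
    n = spindles_needed(target, d["iops"])
    print(f"{name}: {n} drives, ~{n * d['capacity_tb']:.1f} TB raw")

# The SATA build needs far more spindles but delivers far more raw capacity;
# whether it wins comes down to price and whether that many drives fit the chassis.
```
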
Well, with RAID 5 — and it's not just RAID 5, by the way; it's the whole parity RAID family, which is RAID 2, 3, 4, 5, and 6 — they use what's known as an XOR calculation. There's a parity stripe across the disks and you get great capacity out of it, which is why they spent the effort to invent it. The way that works is that the RAID controller, whether it's software or hardware, has to do a lot of work to make that stripe work, and because of that the RAID controller becomes a significant point of failure compared to RAID 1 or RAID 10, which don't have an XOR calculation.
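
For anyone who hasn't seen the math, here is a minimal Python illustration of that XOR parity idea (simplified to a single dedicated parity block; real RAID 5 rotates parity across the members and works on fixed-size stripes). Any one missing member can be rebuilt by XORing the survivors, and that constant calculation is exactly the extra work the controller is doing.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three data members and one parity member (a simplified RAID 4/5-style stripe).
data = [b"disk-A..", b"disk-B..", b"disk-C.."]
parity = xor_blocks(data)

# Lose any single data member: XOR the survivors with parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Capacity: N equal disks keep N-1 disks' worth of data (vs. N/2 for mirroring),
# which is the win that made the extra parity work worth inventing.
```
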
So the risk you get, beyond performance issues — the XOR calculation causes performance issues as well, but performance is an arguable point; you may or may not care about performance, while losing your data is something everyone cares about — is this. I have seen it firsthand, which will really convince you, and I also know other companies who have had a RAID controller failure on a parity RAID array: no drives failed, and everything was lost, because the RAID controller freaked out — and it has destructive operations available to it that destroy all the disks. RAID 1 and RAID 10 do not have a destructive operation to perform on the disks: when they rebuild, it is a mirror, and if they mirror a good disk, they build a new healthy disk. But if a RAID 5 attempts to rebuild an unhealthy array, it can destroy a healthy one. So when parity RAID fails, its action in failing can be to scrap everything. I have definitely seen that firsthand, caused by chassis shudder in a data center: it was a server that had been in use for years, drives came in and out of contact — we're not sure whether over a period of minutes or a period of hours — it kicked off multiple rebuild operations, and one of them just hosed the entire array. By the time we found it, the drives had all reseated themselves, and we had six healthy disks, a healthy RAID controller, and no data. I think we're out of time, and we're done.



