[Linux-HA] High availability, fast, fileservers
cbyrum at spamaps.org
Thu Mar 18 15:22:41 MST 2004
On Wed, 2004-03-17 at 19:32, Jeff Tucker wrote:
> --On Wednesday, March 17, 2004 5:05 PM -0800 Clint Byrum
> <cbyrum at spamaps.org> wrote:
> Yes, it's maildirs, with Reiserfs. Instead of 6 RAID-1 pairs, it's actually
> two RAID-0 arrays of 6 disks each, then put together as a big RAID-1. In my
> tests with Bonnie++, I tested putting the Reiserfs journal on a separate
> drive. What I found, if I'm interpreting the numbers correctly, is that
> this made very little difference at all. I don't know if Bonnie++ numbers
> accurately measure what happens with my maildirs, though.
Just making sure, but do you have notail set?
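(For what it's worth, it's just a mount option; something like the fstab line
below, with a made-up device and mount point, or a remount on a live box:

    /dev/md6   /var/spool/mail   reiserfs   notail   0 0
    mount -o remount,notail /var/spool/mail

Tail packing saves space but costs you on the kind of small-file churn
maildirs generate.)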
Bonnie++ is a flat-out benchmark; it doesn't simulate multiple
processes/threads accessing many different files at once. It's going to
write a bunch of files/data, then read them back. You can use the
synchronized functions of bonnie++ to test concurrent runs, but I'm
guessing you didn't do that (I know I've never done that).
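If you ever want to try it, the semaphore stuff works roughly like this (going
from memory, double-check the man page): one invocation sets up the semaphore
and the others wait on it, so they all start hitting the disk at the same time:

    bonnie++ -p 4
    bonnie++ -y -d /mnt/test &    # run four of these, one per process

That gets a lot closer to the many-readers/many-writers pattern a maildir
spool actually sees.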
> > Been there done that. Works better with Linux software RAID. It can be
> > fairly dependent on the channels in the JBOD, as you suggest below.
> I agree, especially for RAID-10. With RAID-5, you might benefit from a
> hardware parity engine, but RAID-5 can never be as fast as RAID-10. Of
> course, RAID-10 throws away half your disks, which is a real bummer. Then
> again, so does DRBD.
I look at it more as sacrificing those disks to the gods of seek time
and system load. ;)
Just one nitpick, but I think you know this: what you have now is not
RAID10, but RAID0+1. The difference is somewhat important, in that if
you ever lose one disk, you have to resync the whole 6-disk RAID-0 half
of the mirror. With RAID10, if you lose a disk, you only have to rebuild
that one disk-sized RAID-1. I've not done enough testing to see if 0+1
gets any more performance, though. I'd imagine they're very similar.
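If you ever rebuild it, the stripe-of-mirrors layout is easy enough to do by
hand with mdadm (device names here are just examples):

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
    # ...same thing for the other four pairs (md2-md5)...
    mdadm --create /dev/md6 --level=0 --raid-devices=6 \
        /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 /dev/md5

Lose /dev/sda1 and you only resync its partner, not the whole array.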
> >> - Fileserver with internal disks
> > I think on a budget, this is the way to go. A big fat case with lots of
> > hotswap SCSI or SATA bays works pretty well. This is what we have for
> > our backup storage server, but its performance is pathetic for random
> > access.
> In my case, I can't tolerate pathetic for random access. I'm looking at one
> case, though, with 14 drives, 7 each on two SCSI busses. Use 2 disks for
> the O/S in RAID-1 and arrange the other 12 in a RAID-0 array. Use two
> machines with the two RAID-0 arrays mirrored using DRBD. I'm just not sure
> how you could beat that.
Sounds great.. sounds scary too (kind of like Viper at Six Flags Magic
Mountain in Los Angeles.. ;). You'd have to run DRBD with protocol C,
which doesn't confirm a write until it has happened on both nodes. Then
again, since you're doing two RAID0 writes, and using super-low-latency
switched gig-E (fiber is almost definitely required in this case), that
might negate the latency impact enough to still get you more performance.
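For reference, the synchronous replication is just the protocol line in
drbd.conf; roughly like this (hostnames, devices and addresses are made up,
and the exact syntax depends on your DRBD version):

    resource r0 {
      protocol C;                 # don't ack a write until both nodes have it
      on nodeA {
        device    /dev/drbd0;
        disk      /dev/md6;       # the 12-disk RAID-0
        address   192.168.1.1:7788;
        meta-disk internal;
      }
      on nodeB {
        device    /dev/drbd0;
        disk      /dev/md6;
        address   192.168.1.2:7788;
        meta-disk internal;
      }
    }

Protocol A acks as soon as the write hits the local disk and the send buffer,
which is faster but means you can lose the last few writes in a failover.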
> >> - NetApp
> > *Great* boxes for what you're doing. The price for two is intimidating.
> > If you're willing to accept a single point of failure.. they're fairly
> > redundant internally, and I've never actually seen or even heard of a
> > NetApp dying.
> Yeah, everybody says that. How much does it cost for a NetApp anyway? Let's
> say I need 500 GB of storage. Is that 10 thousand? 20 thousand? More? I
> don't want to skimp and end up doing a bad job, but I'm not convinced
> NetApp is a reasonable solution.
One box, probably $45,000 or so for 500GB with room to grow. The nice
thing is they're fairly upgradeable. Those upgrades aren't cheap, either.
> Obviously, this lets you spread reads across both boxes and should nearly
> double your read performance. For writes, data still needs to be written to
> both systems eventually, so I'd think writes aren't appreciably helped by
> making each system active for half the data. But, spreading out the reads
> might help a lot. Do others run in this configuration?
The net effect would be that since the disks are spending less time
seeking around for the reads, they should end up writing things faster.