Re: [Xen-users] Which distributed file system for xen hosts/guests?
--- John Fairbairn <webmaster@xxxxxxxxxxxx> wrote:
> I've been using eNBD with RAID 1 over TCP/IP and COW. I think this would be
> a viable solution for you since it's fast, has RAID, is COW, automatically
> embeds a code into the client kernels for antispoofing with no extra work,
> and is a block device (so clients don't see where it comes from). The
> high-availability 'virtual' web cluster I've built is IP_VS (ipvsadm),
> heartbeat, and eNBD. It's been working well thus far. I guess the nicest
> thing about the system I set up is that I used things that are already
> available in the 2.6.x kernels with no patching. You may have to recompile
> your dom0 and domU for some of the features (enabling NBD, IP_VS, and COW),
> but that's still far easier than trying to mix Xen patches and other
> patches IMO. Here's a link to eNBD if you feel like checking it out:
> http://www.it.uc3m.es/~ptb/nbd/ and here's IP_VS (ipvsadm):
> http://www.linuxvirtualserver.org/software/ipvs.html and heartbeat:
> http://www.linux-ha.org/HeartbeatProgram
>
> Hope this helps ya some.
>
> John Fairbairn
>
> > Maybe this is the wrong place to ask, but since it is related to the
> > overall multi-machine architecture, here goes:
> >
> > What would you folks look at as far as a distributed file system for
> > various physical/virtual Xen machines? Requirements are:
> >
> > 1. fault tolerant
> > 2. relatively speedy
> > 3. actively supported/used
> > 4. production stability
> > 5. ability to add/remove/resize storage online
> > 6. clients are unaware of physical file location(s)
> > 7. local client caching for speed over slower network links
> > 8. I suppose the file system would sit on something like RAID 5 for
> >    physical protection.
> > 9. At a high level, I'd like to be able to dedicate individual "private"
> >    file systems to machines as well as have various "public" file systems,
> >    all with the same name space.
> > 10. Oh yeah -- secure
> >
> > I've looked at:
> >
> > 1. GFS -- I want something a bit more
> > 2. AFS -- looks great, but it appears to support files up to 2GB? Not big
> >    enough. Active community though, and was/is a commercial product.
> > 3. NFS -- please
> > 4. Lustre -- looks promising but complex and not completely open source.
> > 5. OCFS2 -- Oracle site says beta code, not production ready. Maybe soon?
> > 6. Intermezzo -- doesn't look like an active project any more
> > 7. Coda -- same

I've been watching this thread with interest because I too am in the
planning stages of a similar setup. We're going to be creating a cluster,
one node here in Jacksonville and one somewhere else in the country, for
the ultimate in high availability.

What John Fairbairn is suggesting looks good. I like not having to patch
the kernel any more than necessary. But I'm afraid it doesn't address one
of the original poster's concerns (forgot his name), number 7: local
client caching for speed over slower network links (which is the concern
I'm most interested in).

The problem with eNBD and RAID 1 is -- as I understand it -- that only one
server can read/write at any given time. Ideally I would like something
like GFS with global locks, so that I could have a cluster with Xen host
node A here in Jacksonville and Xen host node B in Los Angeles and both be
able to write to /dev/nda1 _at_the_same_time_. This would give better
usage of resources and yet still offer a remote hot site. As I understand
it, eNBD doesn't offer this.
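For anyone who hasn't seen the layering being discussed: a minimal,
untested sketch of a RAID 1 mirror built over a network block device. I'm
using the stock nbd-server/nbd-client plus md here purely as an
illustration; eNBD ships its own server/client tools (and can handle the
mirroring itself), and all hostnames and device names below are made up.

  # On the storage box (hypothetical host "store1"): export a partition
  # over TCP. The stock tools use an invocation like this; eNBD's differ.
  nbd-server 2000 /dev/sdb1

  # On the Xen dom0: attach the remote export and mirror it with a local
  # partition using md RAID 1 (device names are examples only).
  modprobe nbd
  nbd-client store1 2000 /dev/nbd0
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda3 /dev/nbd0

  # /dev/md0 can then be handed to a domU as an ordinary block device,
  # e.g. disk = [ 'phy:md0,sda1,w' ] in the domain config.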
There are a couple of possible solutions to this.

Someone might suggest NFS-mounting the read/write partition so the other
system could access it. That works in theory, but it eats up WAN
bandwidth: traffic goes into box A, which has NFS-mounted box B's eNBD
partition, goes back out across the WAN to box B, is read back across the
WAN to box A, and the result is then spit back to the client. At least 2x
more traffic; not viable. And it doesn't address concern number 7 at all.

You could use two eNBD partitions, one read/write at each site, and have
Heartbeat bring both up read/write on one machine if the other fails.
Heartbeat would then restart any services from the failed machine and
continue where it left off. I did this with DRBD and Heartbeat.
DRBD+Heartbeat is almost identical to the eNBD solution above, with the
advantage that Heartbeat comes with DRBD scripts -- that made setup a
breeze! (A stripped-down config sketch is below.) Also, using LVS
introduces a single point of failure unless you run it in
high-availability mode (see http://www.ultramonkey.org/3/topologies/; a
basic ipvsadm example is below, too). Here's how I did it:
http://devidal.tv/~chris/DRBD+Heartbeat_overview.html
I didn't run Xen on it; I'd consider adding COW if I did.

A disadvantage is that only one site can serve a particular set of
content at any given time, which is somewhat a waste of resources. You
can mitigate this a little by serving different content on different
partitions; for instance, I put web on one partition and mail on another.
Then if one system fails, the other handles the load of both for a while.
If you do this you introduce another disadvantage: you use 4x as much
storage (two pairs of RAID 1 partitions). Another apparent disadvantage
is that it only scales to two systems (RAID 1), but you can set up as
many partition pairs as you need. I set up two pairs but could have
dozens on one node, or dozens of nodes. DRBD has a nice "group"
configuration command (see if eNBD offers this); grouping ensures that if
two partitions are on one spindle they don't try to synchronize together
(a performance nightmare).

I like what GFS offers: multiple clients can read/write to one partition.
But the original poster said, "I want something a bit more than what GFS
offers." Are you saying this because you, like me, think that GFS only
works with locally-attached shared SCSI storage? Apparently I was wrong:
http://gfs.wikidev.net/GNBD_installation
Apparently it works with GNBD, which would, as I understand it, let two
or more nodes in remote locations simultaneously write to the same
partition. I'd need it to offer redundancy such as RAID (if not, it's a
no-go). I need to test this out (untested command sketch below). If it
works well, it seems to be the best solution.

The original poster's number 5 concern: "ability to add/remove/resize
storage online." One of my concerns, too. I was thinking of overlaying
whatever network storage I use with LVM (example below). I haven't tried
it, but I've been told you can resize partitions on the fly, move data
around at will, etc. I'd be hesitant to use it to add another set of RAID
drives to an existing RAID drive (like JBOD) because you double your risk
of data loss if one set dies (like JBOD).

As for security, I'm going to link our sites with a VPN. You could also
install a VPN directly on each node if you want traffic to be encrypted
before it even hits your LAN. Per my previous question, I'll probably run
a VPN daemon in bridging mode on domain0 (sketch below).

Finally, I'm watching this project very carefully:
http://sourceware.org/cluster/ddraid/
It looks promising, too.

Anyone else have input?
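For anyone curious about the DRBD+Heartbeat pieces mentioned above, here
is a stripped-down sketch in DRBD 0.7 style. It is not my actual config;
hostnames, IPs, devices and the "web" resource name are all invented, so
treat it as a starting point only.

  # /etc/drbd.conf (one resource; a second "mail" resource would look the same)
  resource web {
    protocol C;
    syncer {
      rate  10M;
      group 1;    # put resources that share a spindle in different groups;
    }             # groups resync one at a time, members of a group in parallel
    on nodea {
      device    /dev/drbd0;
      disk      /dev/sda5;
      address   10.0.0.1:7788;
      meta-disk internal;
    }
    on nodeb {
      device    /dev/drbd0;
      disk      /dev/sda5;
      address   10.0.0.2:7788;
      meta-disk internal;
    }
  }

  # /etc/ha.d/haresources (Heartbeat v1 style): on failover, take over the
  # service IP, promote the DRBD resource via the drbddisk resource script,
  # mount it, and start the service.
  nodea 10.0.0.100 drbddisk::web Filesystem::/dev/drbd0::/var/www::ext3 apache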
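The IP_VS piece is only a few ipvsadm rules on the director; making the
director itself redundant is what the ultramonkey page above covers. The
VIP and real-server addresses here are placeholders.

  # Create a virtual HTTP service on the VIP with round-robin scheduling,
  # then add two real servers behind it (-m = NAT; -g would be direct routing).
  ipvsadm -A -t 10.0.0.100:80 -s rr
  ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.10:80 -m
  ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m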
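And the GNBD experiment I still need to run, roughly as I read the wiki
page above. This is untested, it leaves out the cluster infrastructure
(ccsd/cman/fencing) that GFS needs, and the exact flags and lock module
may differ by release, so double-check everything against the docs.

  # On the node exporting storage: start the GNBD server and export a
  # (hopefully RAID-backed) device under a name of your choosing.
  gnbd_serv
  gnbd_export -d /dev/md0 -e webdata

  # On each node that should share it: import the export (it shows up as
  # /dev/gnbd/webdata), make a GFS filesystem with one journal per node,
  # and mount it on both nodes at once.
  modprobe gnbd
  gnbd_import -i storagenode
  gfs_mkfs -p lock_dlm -t mycluster:webdata -j 2 /dev/gnbd/webdata
  mount -t gfs /dev/gnbd/webdata /mnt/webdata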
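The LVM idea, again untested by me and with made-up names; the point is
just that growing a volume and its filesystem is a couple of commands
once LVM sits on top of whatever network block device you end up with.
Whether the filesystem can be grown while mounted depends on the
filesystem and tool versions.

  # One-time setup: put LVM on top of the replicated device.
  pvcreate /dev/drbd0
  vgcreate vg_store /dev/drbd0
  lvcreate -L 20G -n lv_www vg_store
  mkfs.ext3 /dev/vg_store/lv_www

  # Later, if the volume group has free space (or after adding another PV):
  # grow the logical volume, then the filesystem on it.
  lvextend -L +10G /dev/vg_store/lv_www
  resize2fs /dev/vg_store/lv_www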
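Finally, what I have in mind for the bridged VPN on domain0: an OpenVPN
tap interface added to the Xen bridge, so domUs at both sites sit on one
LAN segment that is encrypted across the WAN. Bridge and interface names
depend on your Xen version and network scripts, and the remote hostname
and key path below are placeholders.

  # On dom0: create a persistent tap device and add it to the Xen bridge.
  openvpn --mktun --dev tap0
  brctl addif xen-br0 tap0
  ip link set tap0 up

  # Simple static-key tunnel to the other site, running in the background.
  openvpn --dev tap0 --remote othersite.example.com \
          --secret /etc/openvpn/static.key --daemon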
CD

Ever lied? You're a liar. Ever stolen? You're a thief. Ever hated? The
Bible equates hate with murder. Ever lusted? Jesus equated lust with
adultery. You've broken God's law. He'll judge all evil and you're
without hope -- unless you have a savior. Repent and believe.