On Fri, Jan 15, 2010 at 11:56:46AM +0000, Julian Chesterfield wrote:
> Dave Scott wrote:
> >Hi Pasi,
> >
> >[cc:d Julian who is responsible for storage in XCP]
> >
> >
> >>I haven't looked at the XCP code yet, but are there some special patches for
> >>LVM to make it work in a shared environment on multiple hosts?
> >>
> >>I guess it's not CLVM, since you support snapshots.. so xapi is doing
> >>some coordination of management commands and making sure only one LVM
> >>command is issued at a time?
> >>
> >
> >Julian could describe the detail better than me but my high-level
> >understanding is:
> >
> >* xapi nominates one host to be the 'SR master': all LVM metadata-changing
> >commands are run here
> >
> >* all hosts are allowed to map/unmap LVs so the LVM commands were patched
> >to make absolutely sure they didn't attempt to change any metadata
> >
> >* unless you request a special "raw" LV, vhd metadata is added to the LV:
> >this is how we handle snapshots
> >
> Yep, this is correct. We use XAPI as the "Cluster lock manager"
> essentially. There is a strict notion of ordering of events, and XAPI
> always ensures that there is a single SRMaster for any shared SR. The SR
> master is the only entity that modifies LVM metadata, and it
> strategically refreshes slaves as necessary. Typically slaves only
> operate in an LVM Read-only mode, so the LVM metadata is refreshed when
> a slave needs to access a new logical volume, and the slave is only
> allowed to create device-mapper nodes, never to modify the LVM metadata.
> There are patches to LVM to add an explicit 'master' flag; this ensures
> that non-masters never attempt to repair LVM metadata if ever it is read
> and found to be inconsistent. In practice this would never happen due to
> the way LVM updates its metadata and the fact that we do not allow
> shared Volume Groups that span more than one LUN, however it's an
> important safety catch.
>
Thank you both for the answers! This is a good explanation of how it works.
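
To check my understanding, here is a rough sketch of the rule described above:
only the single SR master changes metadata, and a slave refreshes its
read-only view before mapping an LV it has not seen yet. This is a toy model
only, not xapi or LVM code; Host, SharedSR, MetadataError and all method names
are hypothetical names made up for illustration.

```python
class MetadataError(Exception):
    """Raised when a non-master host tries to change LVM metadata."""


class Host:
    def __init__(self, name, is_master=False):
        self.name = name
        self.is_master = is_master
        self.known_lvs = set()  # this host's cached (read-only) view of the VG
        self.mapped = set()     # device-mapper nodes active on this host


class SharedSR:
    def __init__(self, hosts):
        masters = [h for h in hosts if h.is_master]
        if len(masters) != 1:
            raise ValueError("exactly one SR master is required")
        self.master = masters[0]
        self.hosts = list(hosts)
        self.lvs = set()  # authoritative metadata on the shared LUN

    def create_lv(self, host, lv):
        # Metadata-changing operations are only allowed on the SR master.
        if host is not self.master:
            raise MetadataError(f"{host.name} is not the SR master")
        self.lvs.add(lv)
        host.known_lvs = set(self.lvs)

    def refresh(self, host):
        # A slave re-reads the metadata; it never writes or repairs it.
        host.known_lvs = set(self.lvs)

    def map_lv(self, host, lv):
        # Any host may map an LV; refresh first if the LV is new to this host.
        if lv not in host.known_lvs:
            self.refresh(host)
        if lv not in host.known_lvs:
            raise KeyError(lv)
        host.mapped.add(lv)
```

In this model, sr.create_lv(slave, ...) raises MetadataError, while
sr.map_lv(slave, ...) succeeds for any LV the master has already created.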
> Snapshot and clone support is provided via the VHD layer that resides
> above raw Logical Volumes. i.e. we create VHD Copy-on-write instances in
> the same way as the file-based VHD support (e.g. NFS or local Ext3
> partitions).
>
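
For anyone else following the thread, the copy-on-write idea can be sketched
roughly like this: a snapshot freezes the current image, and a new child
stores only the blocks written afterwards, with reads of untouched blocks
falling through to the parent. This is a generic illustration of a
copy-on-write chain, not the actual VHD format or the XCP storage code;
VhdNode and snapshot are hypothetical names.

```python
class VhdNode:
    def __init__(self, parent=None):
        self.parent = parent    # read-only parent image, or None for a base
        self.blocks = {}        # only the blocks written to this node
        self.read_only = False

    def write(self, index, data):
        if self.read_only:
            raise PermissionError("snapshot parents are read-only")
        self.blocks[index] = data

    def read(self, index):
        # Walk up the chain until some node holds the block.
        node = self
        while node is not None:
            if index in node.blocks:
                return node.blocks[index]
            node = node.parent
        return b"\0"  # a block that was never written reads as zeros


def snapshot(active):
    # Freeze the current image and continue writing into a fresh child.
    active.read_only = True
    return VhdNode(parent=active)
```

After child = snapshot(base), writes go to child only, so base keeps the
data as it was at snapshot time.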
Oh, so XenServer/XCP doesn't use LVM snapshots at all? That's good to know.
Is there some command-line tool to control the VHD snapshots?
-- Pasi