Re: [Xen-devel] Making snapshot of logical volumes handling HVM domU causes OOPS and instability
On Sunday 12 September 2010 20:48:09 Scott Garron wrote:
> On 9/12/2010 5:41 AM, J. Roeleveld wrote:
> > I also use LVMs extensively and do similar steps for backups.
> > 1) umount in domU
> > 2) block-detach
> > 3) lvcreate snapshot
> > 4) block-attach
> > 5) mount in domU
>
> I think the biggest difference, here, is that you unmount and
> detach the source volumes before creating the snapshot whereas I just
> leave them active and mounted in the guest. I don't know if that will
> end up being the difference between stability and instability on my
> system, but it's an observation and probably worth experimentation.

I tend to umount first to ensure the filesystem is consistent and no writes
are still left in the write-buffer on the guest.
Filesystem recoveries are fine, but why rely on them when it's not
necessary? :)

> > I, however, have no need for HVM and only use PV guests.
>
> It turns out that it doesn't seem isolated to HVM guests on my
> system any longer. That was just coincidental during the first few
> crashes that I observed.

Ok, I believe the issue might be related to the LVM-stack and the way Xen
holds the devices locked when they are actually mounted and attached?

> > Are you certain the snapshots are large enough to hold all possible
> > changes that might occur on the LV during the existence of the
> > snapshot?
>
> Certainly. The most recent one to cause a crash has existed
> through the crash and for 3 days now, and is only using 2.65% of its COW
> space. They usually don't get a chance to go above even 0.3% before the
> rsync on them is finished and they are unmounted and removed by the
> backup script.
Ok, guess that's not the cause :)
Although, I get the "unable to remove active" error both when there is 0%
used and when there is over 20% used, so there is no clear indication (to
me) of what is causing it.

> > Another thing I notice, which might be of help to people who
> > understand this better than I do, in my backup-script, sometimes step
> > "5" fails because the domU hasn't noticed the device is attached
> > again when I try to mount it. The domU-commands are run using
> > SSH-connections.
>
> That probably just has to do with variations in how long it takes
> the guest kernel to poll or be notified of device changes, and how long
> it takes for its udev to create the device files and whatnot.
> Introducing some sanity checks or just a longer delay in your backup
> script would likely get around that problem. (I could be wrong, though)

I do need to add some sanity checks into the script at some point, but
currently I start these manually and 'fix' the left-overs myself.
The mount issue is a simple one, and I notice it within 30-40 seconds of the
script starting.

--
Joost

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
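
The five backup steps and the suggested sanity check before step 5 could be sketched roughly as below. This is not the script discussed in the thread, which was never posted; it is a minimal illustration only. It assumes the xm toolstack (the thread only names the block-attach and block-detach subcommands), guest commands sent over SSH, and a block-device test in the guest as the "device is back" check. The function names and every domain name, SSH target, LV path, and device name are hypothetical.

```python
import subprocess
import time


def wait_until(check, timeout=60.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def backup_commands(domain, guest, origin_lv, snap_name, snap_size,
                    front_dev, mount_point):
    """Return the five steps as (label, argv) pairs, in the order listed."""
    return [
        # 1) unmount inside the domU so no writes are left in its buffers
        ("umount", ["ssh", guest, "umount", mount_point]),
        # 2) detach the block device from the domU (assumed xm syntax)
        ("block-detach", ["xm", "block-detach", domain, front_dev]),
        # 3) snapshot the now-idle logical volume
        ("snapshot", ["lvcreate", "--snapshot", "--size", snap_size,
                      "--name", snap_name, origin_lv]),
        # 4) give the original volume back to the domU, writable
        ("block-attach", ["xm", "block-attach", domain,
                          "phy:" + origin_lv, front_dev, "w"]),
        # 5) mount it again inside the domU
        ("mount", ["ssh", guest, "mount", "/dev/" + front_dev, mount_point]),
    ]


def run_backup(steps, guest, guest_dev, run=subprocess.call, **wait_args):
    """Run the steps in order; before 'mount', wait for the guest device node."""
    for label, argv in steps:
        if label == "mount":
            # Sanity check: the guest may need a moment to notice the
            # re-attached device and create its device node.
            def device_present():
                return run(["ssh", guest, "test", "-b", guest_dev]) == 0
            if not wait_until(device_present, **wait_args):
                raise RuntimeError("%s did not appear in the guest" % guest_dev)
        if run(argv) != 0:
            raise RuntimeError("step %r failed: %s" % (label, " ".join(argv)))
```

A call might look like `run_backup(backup_commands("guest1", "root@guest1", "/dev/vg0/data", "data-snap", "1G", "xvdb", "/srv/data"), "root@guest1", "/dev/xvdb")`, with all of those values being placeholders. Polling with a timeout, rather than sleeping for a fixed time, keeps the common case fast while still covering a guest that is slow to see the device.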