Re: [Xen-devel] [PATCH] xend: resume a guest domain after an unsuccessful live migration
On Mon, 2013-02-04 at 07:24 +0000, Elena V. Titova wrote:
> Hello.
>
> We use debian sarge, linux-image-3.2.0-3-amd64 and xen-4.1.3 on our
> servers.

Do you really mean Sarge? Or did you mean Squeeze or Wheezy? Those
kernel and Xen versions look like Wheezy versions but perhaps you are
using backports.

> When a live migration is run the guest domain may not resume on a
> destination
> host and is destroyed on a source host.
> This patch fixes it by resuming the guest domain on a source host when a
> start of
> the guest domain was failed.

xend is supposed to be in maintenance mode so I'm not too sure about
this sort of change. In particular I'm worried that it might break
migration from Xen version N to version N+1 which is something we try
and support.

BTW the xl toolstack already has this functionality so another option
for you may be to switch to that.

> git diff tools/python/xen/xend/XendCheckpoint.py
> diff --git a/tools/python/xen/xend/XendCheckpoint.py
> b/tools/python/xen/xend/XendCheckpoint.py
> index fa09757..6b8765f 100644
> --- a/tools/python/xen/xend/XendCheckpoint.py
> +++ b/tools/python/xen/xend/XendCheckpoint.py
> @@ -163,12 +163,16 @@ def save(fd, dominfo, network, live, dst,
> checkpoint=False, node=-1,sock=None):
> dominfo.resumeDomain()
> else:
> if live and sock != None:

This same class of errors isn't possible for non-live?

> + status = os.read(fd, 64)

The written strings are 7 or 4 bytes, it would be better to choose a
fixed length for all writes and the read I think. That might mean
padding the fail message.

Also these protocol strings should be defined as constants rather than
open coded.

Even with that addressed I don't really feel confident enough about xend
internals to Ack a patch like this.

> try:
> sock.shutdown(2)
> except:
> pass
> sock.close()
>
> + if status == "FAIL":
> + raise XendError("Restore failed")
> +
> dominfo.destroy()
> dominfo.testDeviceComplete()
> try:
> @@ -351,8 +355,14 @@ def restore(xd, fd, dominfo = None, paused = False,
> relocating = False):
> if not paused:
> dominfo.unpause()
>
> + if relocating:
> + os.write(fd, "SUCCESS")
> +
> return dominfo
> except Exception, exn:
> + if relocating:
> + os.write(fd, "FAIL")
> +
> dominfo.destroy()
> log.exception(exn)
> raise exn
>
> --
> Elena Titova
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
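
For illustration only, the two review comments above (one fixed length
for every write and the read, and protocol strings defined as constants)
could be sketched roughly as follows. This is not xend code and not part
of the posted patch: STATUS_LEN, STATUS_SUCCESS, STATUS_FAIL,
write_status and read_status are hypothetical names, the 8-byte length
and NUL padding are assumptions, and the sketch is written for Python 3
(bytes) whereas the patch targets Python 2.

```python
import os

# Hypothetical constants; neither the names nor the 8-byte length come from xend.
STATUS_LEN = 8
STATUS_SUCCESS = b"SUCCESS".ljust(STATUS_LEN, b"\0")  # 7 bytes padded to 8
STATUS_FAIL = b"FAIL".ljust(STATUS_LEN, b"\0")        # 4 bytes padded to 8


def write_status(fd, ok):
    """Write one fixed-length status record to fd (restore side)."""
    msg = STATUS_SUCCESS if ok else STATUS_FAIL
    written = 0
    while written < len(msg):
        # os.write may write fewer bytes than requested, so loop.
        written += os.write(fd, msg[written:])


def read_status(fd):
    """Read exactly STATUS_LEN bytes from fd (save side).

    Returns True only for a complete success record. A failure record,
    an unknown record, or EOF before a full record all count as failure
    (an assumption of this sketch).
    """
    buf = b""
    while len(buf) < STATUS_LEN:
        chunk = os.read(fd, STATUS_LEN - len(buf))
        if not chunk:  # EOF before a full record arrived
            return False
        buf += chunk
    return buf == STATUS_SUCCESS
```

With both records the same length, the reader never has to guess how
many bytes to ask for, and a short or missing record is treated the same
as an explicit failure. Whether that is the right policy for xend, and
whether it is compatible with migration between Xen versions, is not
something this sketch settles.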