
Re: [Xen-devel] [PATCH] xend: resume a guest domain after an unsuccessful live migration



On Mon, 04/02/2013 at 15:39 +0000, Ian Campbell wrote:
> > 
> > We use debian sarge, linux-image-3.2.0-3-amd64 and xen-4.1.3 on our
> > servers.
> 
> Do you really mean Sarge? Or did you mean Squeeze or Wheezy? Those
> kernel and Xen versions look like Wheezy versions but perhaps you are
> using backports.

That was my mistake. I meant Debian squeeze with the kernel and Xen
from testing.

> 
> > When a live migration is run the guest domain may not resume on a
> > destination
> > host and is destroyed on a source host.
> > This patch fixes it by resuming the guest domain on a source host when a
> > start of
> > the guest domain was failed.
> 
> xend is supposed to be in maintenance mode so I'm not too sure about
> this sort of change.
> 
> In particular I'm worried that it might break migration from Xen version
> N to version N+1 which is something we try and support.
> 
> BTW the xl toolstack already has this functionality so another option
> for you may be to switch to that.
> 
> > git diff tools/python/xen/xend/XendCheckpoint.py
> > diff --git a/tools/python/xen/xend/XendCheckpoint.py
> > b/tools/python/xen/xend/XendCheckpoint.py
> > index fa09757..6b8765f 100644
> > --- a/tools/python/xen/xend/XendCheckpoint.py
> > +++ b/tools/python/xen/xend/XendCheckpoint.py
> > @@ -163,12 +163,16 @@ def save(fd, dominfo, network, live, dst,
> > checkpoint=False, node=-1,sock=None):
> >              dominfo.resumeDomain()
> >          else:
> >              if live and sock != None:
> 
> This same class of errors isn't possible for non-live?

As I understand it, with a non-live migration I have a saved image of
the VM and can try to resume it on different servers, including the
source server. With a live migration, if resuming the VM fails, I am
left without a running VM and its services, although the VM could have
continued to run on the source server.
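
To illustrate the behaviour described above, here is a minimal sketch in
modern Python. It is not xend code: FakeDomain, MigrationFailed and
finish_live_migration are hypothetical names, and the domain object only
records what was done to it.

```python
class MigrationFailed(Exception):
    """Raised when the destination reports that the restore failed."""


class FakeDomain:
    """Hypothetical stand-in for a guest domain suspended on the source host."""

    def __init__(self):
        self.state = "suspended"

    def resume(self):
        self.state = "running"

    def destroy(self):
        self.state = "destroyed"


def finish_live_migration(domain, restore_ok):
    """Destroy the source copy only if the destination restore succeeded."""
    if restore_ok:
        domain.destroy()
        return
    # Restore failed: keep the guest alive by resuming it on the source host.
    domain.resume()
    raise MigrationFailed("restore failed on destination; guest resumed on source")
```

The point of the sketch is only the ordering: the source copy is destroyed
after the destination has confirmed success, never before.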

> 
> > +                status = os.read(fd, 64)
> 
> The written strings are 7 or 4 bytes, it would be better to choose a
> fixed length for all writes and the read I think. That might mean
> padding the fail message. Also these protocol strings should be defined
> as constants rather than open coded.
> 
> Even with that addressed I don't really feel confident enough about xend
> internals to Ack a patch like this.
> 
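
The fixed-length suggestion above could look roughly like the following.
This is only a sketch in modern Python, not xend code: STATUS_LEN,
STATUS_OK, STATUS_FAIL, write_status and read_status are made-up names,
and a pipe stands in for the migration socket.

```python
import os

# Hypothetical protocol constants; every status message is padded to
# STATUS_LEN bytes so the reader always knows how much to expect.
STATUS_LEN = 8
STATUS_OK = b"SUCCESS".ljust(STATUS_LEN, b"\0")
STATUS_FAIL = b"FAIL".ljust(STATUS_LEN, b"\0")


def write_status(fd, status):
    """Write one fixed-length status message to fd."""
    assert len(status) == STATUS_LEN
    written = 0
    while written < STATUS_LEN:
        written += os.write(fd, status[written:])


def read_status(fd):
    """Read exactly STATUS_LEN bytes; treat a short or unknown message as failure."""
    buf = b""
    while len(buf) < STATUS_LEN:
        chunk = os.read(fd, STATUS_LEN - len(buf))
        if not chunk:  # EOF before a full message arrived
            return STATUS_FAIL
        buf += chunk
    return buf if buf in (STATUS_OK, STATUS_FAIL) else STATUS_FAIL


if __name__ == "__main__":
    r, w = os.pipe()
    write_status(w, STATUS_FAIL)
    os.close(w)
    print(read_status(r) == STATUS_FAIL)
    os.close(r)
```

Treating EOF as a failure is a design choice of this sketch: if the
destination dies before reporting anything, the source side would still
keep the guest.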

Thank you for your comments and for the advice to use the xl toolstack.
We use Xen with the xend toolstack and have some scripts that use xm and
XenAPI. But as I have read, xend is deprecated as of Xen 4.1 and will be
removed in a future release, so a switch to xl may be a good idea.

> >                  try:
> >                      sock.shutdown(2)
> >                  except:
> >                      pass
> >                  sock.close()
> > 
> > +                if status == "FAIL":
> > +                    raise XendError("Restore failed")
> > +
> >              dominfo.destroy()
> >              dominfo.testDeviceComplete()
> >          try:
> > @@ -351,8 +355,14 @@ def restore(xd, fd, dominfo = None, paused = False,
> > relocating = False):
> >          if not paused:
> >              dominfo.unpause()
> > 
> > +        if relocating:
> > +            os.write(fd, "SUCCESS")
> > +
> >          return dominfo
> >      except Exception, exn:
> > +        if relocating:
> > +            os.write(fd, "FAIL")
> > +
> >          dominfo.destroy()
> >          log.exception(exn)
> >          raise exn 
> > 

--
Elena Titova


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

