
Re: [Xen-devel] [RFC 7/7] libxl: Wait for QEMU startup in stubdomain

On Fri, 6 Feb 2015, Wei Liu wrote:
> > > Unfortunately this problem can't be solved without putting in
> > > significant effort and time (involves redesign of protocol and handle
> > > all the compatibility issues). We can't say for sure when the solution
> > > is going to land.
> > 
> > I noticed some discussion about this on xen-devel.  Unfortunately, I
> > was unable to find anything that laid out specifically what the
> > problems are - can you point me to a bug report or such?  The libxl
> > startup code - with callbacks on top of callbacks, callbacks within
> > callbacks, and callbacks stashed away in little places only to be
> > called _much_ later - is really convoluted, I suspect particularly so
> > for stubdom startup.  I am not surprised it got broken - who can
> > remember how it works?
> > 
> It's not how libxl is coded. It's the startup protocol that is broken.
> The breakage of stubdom in Xen 4.5 is a latent bug exposed by a new
> feature. 
> I guess I should just send a bug report saying "Device model startup
> protocol is broken". But I don't have much to say at this point, because
> thorough research for both qemu-trad and qemu-upstream is required to
> produce a sensible report.
> > While working on these patches reviving Anthony's work, I consistently
> > ran into HVM startup problems with QEMU upstream in a stub domain (it
> > always failed).  What I could not figure out is why QEMU-traditional
> > did not have a similar problem; it seemed to me that the same race
> > existed for QEMU-traditional stubdom.  I wrote it off as either (1)
> > MiniOS startup was so much faster than Linux that QEMU-traditional
> > always won the race, or (2) there was some implicit mechanism in
> > QEMU-traditional that ensured the HVM guest would wait for the device
> > model to be in place.
> My bet is on (1).  QEMU-trad stubdom is suffering from the same problem.
> > It sounds like maybe the race actually is being lost in 4.5.
> > 
> So prior to 4.5, when an emulation request was issued by a guest vcpu,
> that request was put on a ring and the guest vcpu was paused. When a DM
> showed up it processed that request and posted a response, then the
> guest vcpu was unpaused. So there is an implicit dependency on this
> behaviour of Xen for the DM to work.
> In 4.5, a new feature called ioreq server was added. When Xen sees an
> io request with no backing DM, it returns immediately. The guest sees
> some weird value and crashes. That is, Xen's behaviour has changed and
> a latent bug in stubdom's startup protocol is exposed.
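
To make the behaviour change concrete, here is a toy model (plain
Python, not Xen code; the names `Ring`, `handle_io_pre_4_5`,
`handle_io_4_5` and the junk constant are all invented for
illustration):

```python
# Toy model of the DM startup race described above.  All names here are
# illustrative; this is not how the hypervisor is actually structured.
from collections import deque

JUNK = 0xFFFFFFFF  # stands in for the "weird value" the guest observes

class Ring:
    """Shared ioreq ring between Xen and a device model."""
    def __init__(self):
        self.pending = deque()

    def post(self, req):
        self.pending.append(req)

def handle_io_pre_4_5(ring, dm_ready, dm_handler):
    """Pre-4.5: the request sits on the ring and the vcpu effectively
    stays paused until a DM shows up and posts a response."""
    ring.post("io-request")
    while not dm_ready():
        pass  # in real Xen the vcpu simply stays descheduled
    return dm_handler(ring.pending.popleft())

def handle_io_4_5(ring, have_ioreq_server, dm_handler):
    """4.5 with ioreq servers: if no server backs the access, Xen
    completes it immediately and the guest sees junk."""
    if not have_ioreq_server():
        return JUNK          # the latent startup bug becomes visible
    ring.post("io-request")
    return dm_handler(ring.pending.popleft())
```

The pre-4.5 path never returns junk, it just waits; the 4.5 path
returns immediately when no ioreq server has registered yet, which is
exactly the window during stubdom startup.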

I don't think we can stall the development of a new feature like this
based on what can be seen as a regression.

If the bug cannot be fixed in a timely fashion we should revert to
previous behaviour as soon as possible.

> > If the problem you are contending with is that the HVM guest is being
> > unpaused before the device model is in place, I suggest that this
> > patch, or something much like it, should address it.  I note that I
> > merely verified it did not break QEMU-traditional stubdom, but it is
> > just a matter of ensuring QEMU-traditional writes to _some_ xenstore
> > path when it is ready (it might do this already, in fact), and that
> > this patch waits on that path.  Also, it should be pretty easy to
> > extend this concept to ensure any additional stubdoms, such as vTPM,
> > are up and running before leaving the code in libxl_dm.c and unpausing
> > the HVM domain - we just chain through additional callbacks as needed.
> > 
> Yes, that's the basic idea, chaining things together.
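
The idea of gating the unpause on a xenstore node can be sketched in a
few lines. This is illustrative only: real libxl uses its asynchronous
event/callback machinery and xenstore watches rather than a poll loop,
and the path and "running" value below are hypothetical, not the actual
protocol:

```python
# Sketch of "wait until the DM writes its ready node before unpausing".
# The xenstore read is injected as a callable so it can be simulated.
import time

def wait_for_dm_ready(read_path, path="device-model/state",
                      timeout=10.0, poll=0.1, sleep=time.sleep):
    """Poll read_path(path) until it returns "running" (the value the
    device model is assumed to write once it can serve ioreqs).

    Returns True if the DM came up within `timeout` seconds, else False.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if read_path(path) == "running":
            return True
        sleep(poll)
    return False
```

Chaining additional stubdoms (vTPM, etc.) then just means repeating
this wait for each one's ready node before the final unpause.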
> > There may be a desire to do a major rework of libxl_dm.c, etc., but
> > this patch might be a reasonable bandaid now for Xen 4.5.1.
> > 
> > > Also upstream QEMU stubdom, as you already notice, doesn't have a
> > > critical functionality -- save / restore. Adding that in might involve
> > > upstreaming some changes to QEMU, which has a time frame that is out of
> > > our control.
> > 
> > Xen maintains a separate repo for the QEMU code it uses.  I presume
> > this is because there is always something a little out of sync with
> > the mainstream QEMU release.  I do not understand why we cannot rely
> > on this to make available any needed changes to QEMU pending their
> > incorporation into QEMU proper.
> ISTR our policy is upstream first. That is, though we maintain our own
> qemu tree those changesets are all upstream changesets. Arguably there
> might be some bandaid changesets that are not upstream but big changes
> like this needs to be upstreamed first.
> Stefano, could you clarify this and correct me if I'm wrong?

Yes, the policy is upstream first; however, the code doesn't need to
land in a QEMU release, just be upstream.

There is still plenty of time for that: Eric just needs to send his
patches to qemu-devel, get the acks, and I'll apply them.
> > > So my hunch is that we're not going to make it in time for
> > > 4.6. :-/
> > >
> > > Wei.
> > 
> > 4.5 was _just_ released, and Xen is on a ~10 month release cycle.  Why
> > can't this get done?  Someone just has to take a little time to sit
> Notably, many of those months are code freeze.
> And due to our upstream first QEMU policy we would also need to upstream
> changes to QEMU.

Getting the patches upstream in QEMU shouldn't take longer than getting
them upstream in Xen.

Also I think upstream QEMU stubdoms would be valuable even without
save/restore support.

> > Can we arrive at an agreement that a Linux-based QEMU-upstream stubdom
> > should _at least_ be a technical preview for Xen 4.6?  A year ago,

I agree. Or rumpkernel-based QEMU-upstream stubdom. Or something.

> If we really want to make this happen before the new protocol and
> implementation are in place, that would be "tech preview" or
> "experimental", whichever is the term for the least mature technology.
> Note that this is not due to the route it chooses (Linux based); it's
> due to the fact that the protocol is broken and destined to be changed.
I think we should not block the entire upstream stubdom effort, whether
it is Linux, MiniOS or Rumpkernel based, waiting for the bootup protocol
to be fixed.

The two things can and should be done in parallel.
