Re: [Xen-devel] [xen-unstable test] 13394: regressions - FAIL
On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote:
> On Fri, 29 Jun 2012, Ian Campbell wrote:
> > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR.
> > > > vs. 13379
> > >
> > > The logs show this:
> > >
> > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch:
> > > wait for device model timed out
> > >
> > > And in xenstore:
> > >
> > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
> > >
> > > And in the source code:
> > >
> > > $ grep -R logdirty qemu-upstream-unstable.git/*
> > > $
> > >
> > > So the upstream qemu does not participate properly in the migration
> > > protocol. And anyway this protocol seems to involve xenstore and I
> > > would have expected it to do something with QMP. But there is no code
> > > in libxl to do this (and never has been) and no code in upstream qemu
> > > to do it either.
> > >
> > > That means we'll get memory corruption in migrated guests with the new
> > > qemu: any time qemu writes to guest memory, it needs to trigger a
> > > logdirty update so that the write is properly transferred to the
> > > migration target domain.
> > >
> > > With the old libxl we didn't notice this apart from random failures.
> > > With my new migration code, particularly
> > > 25542:1883e5c71a87
> > > libxl: wait for qemu to acknowledge logdirty command
> > > this turns into a hard failure.
> > >
> > > I will add this as an allowable test failure pending a proper fix.
> >
> > Thanks for investigating. It does appear that this has always been
> > broken.
> >
> > Do we think this is a blocker for 4.2?
>
> I wouldn't consider it a blocker, given that upstream QEMU is not the
> default for HVM guests.
>
> > It certainly prevents us from suggesting that we support HVM migration
> > with the (non-default) upstream qemu.
> >
> > If we can't fix this for 4.2 (e.g. because we need to get patches into
> > upstream qemu or because the libxl side is too involved) we should at a
> > minimum make libxl reject attempts to migrate such domains with an
> > appropriate error message.
>
> We do need to get patches in QEMU to fix this but we could backport them in
> qemu-upstream-unstable (and ask for backports to the stable trees).

Can we do that in time for 4.2? It's pretty late in the day.

I think we need to consider either achieving this or adding the
appropriate error message as a blocker. Hopefully the former but falling
back to the latter if it comes to it.

> > How does this impact the use of upstream qemu for PV guest backends vs
> > migration? I *think* they don't require log-dirty support, but I'm not
> > sure.
>
> It does not affect qemu for PV guests.

Great.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
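
For readers following the failure discussed in this thread, here is a rough
Python model of the handshake: the toolstack writes "enable" to the
logdirty/cmd key, waits for the device model to acknowledge, and treats a
missing acknowledgement as a hard error. This is only a sketch, not libxl or
QEMU code. A plain dict stands in for xenstore, polling a fixed number of
times stands in for the real timeout, and the "logdirty/ret" acknowledgement
key and every function name are assumptions made for illustration.

```python
# Toy model of the logdirty handshake; not libxl or QEMU code.
# A dict stands in for xenstore. The acknowledgement key and all
# function names below are hypothetical.

CMD_KEY = "/local/domain/0/device-model/5/logdirty/cmd"  # path seen in the logs
RET_KEY = "/local/domain/0/device-model/5/logdirty/ret"  # assumed ack path


def device_model_with_logdirty(store):
    """A device model that notices the command and acknowledges it."""
    cmd = store.get(CMD_KEY)
    if cmd is not None:
        store[RET_KEY] = cmd


def device_model_without_logdirty(store):
    """A device model with no logdirty support: it never reads the key."""


def switch_logdirty(store, device_model, max_polls=3):
    """Write the command, then poll for the acknowledgement.

    Returns True if the device model acknowledged within max_polls polls,
    False if the wait timed out.
    """
    store[CMD_KEY] = "enable"
    for _ in range(max_polls):
        device_model(store)  # give the device model a chance to react
        if store.get(RET_KEY) == "enable":
            return True
    return False


def migrate(device_model):
    """Refuse to continue when the logdirty switch is not acknowledged."""
    store = {}
    if not switch_logdirty(store, device_model):
        raise RuntimeError("logdirty switch: wait for device model timed out")
    return "migration may proceed"
```

Calling migrate(device_model_without_logdirty) raises the error instead of
silently continuing, which mirrors the change from random failures to a hard
failure described above.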