Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL
On Tue, Jun 13, 2017 at 10:30 AM, Julien Grall <julien.grall@xxxxxxx> wrote:
> Hi Jan,
>
> On 12/06/2017 15:57, Jan Beulich wrote:
>>>>> On 12.06.17 at 16:30, <julien.grall@xxxxxxx> wrote:
>>> On 09/06/17 09:19, Jan Beulich wrote:
>>>>>>> On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
>>>>>>>> On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
>>>>>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>>>>>>>>> On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
>>>>>>>> Looking at the serial logs for that and comparing them with 10009,
>>>>>>>> it's not terribly easy to see what's going on because the kernel
>>>>>>>> versions are different and so produce different messages about xenbr0
>>>>>>>> (and I think may have a different bridge port management algorithm).
>>>>>>>>
>>>>>>>> But the messages about promiscuous mode seem the same, and of course
>>>>>>>> promiscuous mode is controlled by userspace, rather than by the kernel
>>>>>>>> (so should be the same in both).
>>>>>>>>
>>>>>>>> However, in the failed test we see extra messages about promis:
>>>>>>>>
>>>>>>>> Jun 5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>>>>>>> ...
>>>>>>>> Jun 5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>>>>>>
>>>>>>> Wouldn't those be another result of the guest shutting down /
>>>>>>> being shut down?
>>>>>>>
>>>>>>>> Also, the qemu log for the guest in the failure case says this:
>>>>>>>>
>>>>>>>> Log-dirty command enable
>>>>>>>> Log-dirty: no command yet.
>>>>>>>> reset requested in cpu_handle_ioreq.
>>>>>>>
>>>>>>> So this would seem to call for instrumentation on the qemu side
>>>>>>> then, as the only path via which this can be initiated is - afaics -
>>>>>>> qemu_system_reset_request(), which doesn't have very many
>>>>>>> callers that could possibly be of interest here. Adding Stefano ...
>>>>>>
>>>>>> I am pretty sure that those messages come from qemu traditional:
>>>>>> "reset requested in cpu_handle_ioreq" is not printed by qemu-xen.
>>>>>
>>>>> Oh, indeed - I didn't pay attention to this being a *-qemut-*
>>>>> test. I'm sorry.
>>>>>
>>>>>> In any case, the request comes from qemu_system_reset_request, which is
>>>>>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>>>>>> initiated the reset (or resume)?
>>>>>
>>>>> Right, this and hw/pckbd.c look to be the only possible
>>>>> sources. Yet then it's still unclear what makes the guest go
>>>>> down.
>>>>
>>>> So with all of the above in mind I wonder whether we shouldn't
>>>> revert 933f966bcd then - that debugging code is unlikely to help
>>>> with any further analysis of the issue, as reaching that code
>>>> for a dying domain is only a symptom as far as we understand it
>>>> now, not anywhere near the cause.
>>>
>>> Are you suggesting to revert on Xen 4.9?
>>
>> Yes, if we revert now, then I'd say on both master and 4.9.
>
> I would be ok with that.

Reverting 933f966bcd

Acked-by: George Dunlap <george.dunlap@xxxxxxxxxx>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
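As background to the reset path discussed in the thread: the sketch below is a minimal, self-contained illustration (not the qemu-traditional source) of how a guest write to the ACPI PM1a control port could end up in a qemu_system_reset_request()-style call, which the main loop later reports as "reset requested in cpu_handle_ioreq". The port offset, bit layout and sleep-type handling here are assumptions for illustration only; the function names simply mirror those mentioned above.

/*
 * Illustrative sketch only -- NOT the qemu-traditional source.
 * It models the shape of the path discussed in the thread: a guest
 * write to the ACPI PM1a control register reaches an I/O write
 * handler, which may request a system reset.  Register offset, bit
 * names and sleep-type handling are assumptions.
 */
#include <stdint.h>
#include <stdio.h>

#define ACPI_PM1_CNT_SLP_EN      (1u << 13)          /* "commit sleep state" bit */
#define ACPI_PM1_CNT_SLP_TYP(v)  (((v) >> 10) & 7u)  /* requested sleep type */

/* Stand-in for qemu's global reset request: the real one only sets a
 * flag that the main loop acts on later (hence the log line
 * "reset requested in cpu_handle_ioreq"). */
static int reset_requested;
static void qemu_system_reset_request(void)
{
    reset_requested = 1;
    printf("reset requested\n");
}

/* Stand-in for hw/acpi.c:pm_ioport_writew(): the guest writes a
 * 16-bit value to the PM1a control register. */
static void pm_ioport_writew(uint32_t addr, uint16_t val)
{
    if (addr != 0x04)           /* assume 0x04 is the PM1a_CNT offset */
        return;

    if (val & ACPI_PM1_CNT_SLP_EN) {
        switch (ACPI_PM1_CNT_SLP_TYP(val)) {
        case 0:                 /* assumed: treated as reset/resume */
            qemu_system_reset_request();
            break;
        default:                /* other sleep types: shutdown/suspend */
            break;
        }
    }
}

int main(void)
{
    /* A guest OS initiating the transition itself would look like
     * this kind of port write from the device model's point of view. */
    pm_ioport_writew(0x04, ACPI_PM1_CNT_SLP_EN | (0u << 10));
    return reset_requested ? 0 : 1;
}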