
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



>>> On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
>>>> On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>> >>> On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
>>> > Looking at the serial logs for that and comparing them with 10009,
>>> > it's not terribly easy to see what's going on because the kernel
>>> > versions are different and so produce different messages about xenbr0
>>> > (and I think may have a different bridge port management algorithm).
>>> > 
>>> > But the messages about promiscuous mode seem the same, and of course
>>> > promiscuous mode is controlled by userspace, rather than by the kernel
>>> > (so should be the same in both).
>>> > 
>>> > However, in the failed test we see extra messages about promiscuous mode:
>>> > 
>>> >   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>> >   ...
>>> >   Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>> 
>>> Wouldn't those be another result of the guest shutting down /
>>> being shut down?
>>> 
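
As an aside: promiscuous mode is toggled from userspace by setting or
clearing IFF_PROMISC on the interface, and the kernel prints those
"left promiscuous mode" lines when the device's promiscuity count
drops back to zero, whether a tool cleared the flag or the vif's
bridge/tap port was torn down as part of domain destruction.  A
minimal sketch of the userspace side, generic ioctl code rather than
anything from the actual Xen hotplug scripts:

  #include <net/if.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <sys/socket.h>
  #include <unistd.h>

  /* Toggle IFF_PROMISC on an interface; clearing it is what triggers
   * the kernel's "device ... left promiscuous mode" message. */
  static int set_promisc(const char *ifname, int on)
  {
      struct ifreq ifr;
      int fd = socket(AF_INET, SOCK_DGRAM, 0);

      if (fd < 0)
          return -1;
      memset(&ifr, 0, sizeof(ifr));
      strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
      if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0)
          goto fail;
      if (on)
          ifr.ifr_flags |= IFF_PROMISC;
      else
          ifr.ifr_flags &= ~IFF_PROMISC;
      if (ioctl(fd, SIOCSIFFLAGS, &ifr) < 0)
          goto fail;
      close(fd);
      return 0;
  fail:
      close(fd);
      return -1;
  }

So seeing those messages at shutdown time is consistent with ordinary
vif teardown.
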
>>> > Also, the qemu log for the guest in the failure case says this:
>>> > 
>>> >   Log-dirty command enable
>>> >   Log-dirty: no command yet.
>>> >   reset requested in cpu_handle_ioreq.
>>> 
>>> So this would seem to call for instrumentation on the qemu side
>>> then, as the only path via which this can be initiated is - afaics -
>>> qemu_system_reset_request(), which doesn't have very many
>>> callers that could possibly be of interest here. Adding Stefano ...
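
If we did go the instrumentation route, dumping a backtrace at the top
of qemu_system_reset_request() would already identify the caller.  A
sketch of what such a throwaway patch might look like, using glibc's
backtrace(); log_reset_caller() is a hypothetical helper, not an
existing qemu function:

  #include <execinfo.h>
  #include <stdio.h>
  #include <unistd.h>

  /* Hypothetical instrumentation: call this first thing in
   * qemu_system_reset_request() so the qemu log records who asked
   * for the reset. */
  static void log_reset_caller(void)
  {
      void *frames[16];
      int n = backtrace(frames, 16);

      fprintf(stderr, "qemu_system_reset_request() called from:\n");
      fflush(stderr);
      backtrace_symbols_fd(frames, n, STDERR_FILENO);
  }
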
>> 
>> I am pretty sure that those messages come from qemu traditional: "reset
>> requested in cpu_handle_ioreq" is not printed by qemu-xen.
> 
> Oh, indeed - I didn't pay attention to this being a *-qemut-*
> test. I'm sorry.
> 
>> In any case, the request comes from qemu_system_reset_request, which is
>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>> initiated the reset (or resume)?
> 
> Right, this and hw/pckbd.c look to be the only possible
> sources. Yet then it's still unclear what makes the guest go
> down.
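
To make the two candidate paths concrete: the keyboard-controller
route is the guest writing command 0xfe ("pulse reset line") to port
0x64, and the ACPI route is a PM control register write that
hw/acpi.c:pm_ioport_writew() turns into qemu_system_reset_request().
A much simplified sketch of that shape; the val_requests_reset()
predicate and the stubbed qemu_system_reset_request() are placeholders
for illustration, not the real qemu-traditional code:

  #include <stdint.h>
  #include <stdio.h>

  /* Stub standing in for the real qemu function. */
  static void qemu_system_reset_request(void)
  {
      fprintf(stderr, "reset requested\n");
  }

  /* Placeholder for whatever bit pattern hw/acpi.c really checks. */
  static int val_requests_reset(uint32_t val)
  {
      (void)val;
      return 0;
  }

  /* i8042 path (hw/pckbd.c): command 0xfe on port 0x64 pulses the
   * reset line. */
  static void kbd_write_command(uint32_t addr, uint32_t val)
  {
      (void)addr;
      if (val == 0xfe)
          qemu_system_reset_request();
      /* ... other controller commands elided ... */
  }

  /* ACPI path (hw/acpi.c:pm_ioport_writew), reduced to its shape. */
  static void pm_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
  {
      (void)opaque; (void)addr;
      if (val_requests_reset(val))
          qemu_system_reset_request();
      /* ... sleep state handling elided ... */
  }

Either way the write originates inside the guest, which fits the
suspicion that the guest itself initiated the reboot.
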

So with all of the above in mind I wonder whether we shouldn't
revert 933f966bcd then - that debugging code is unlikely to help
with any further analysis of the issue, as reaching that code
for a dying domain is only a symptom as far as we understand it
now, not anywhere near the cause.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

