
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



>>> On 12.06.17 at 16:30, <julien.grall@xxxxxxx> wrote:
> On 09/06/17 09:19, Jan Beulich wrote:
>>>>> On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
>>>>>> On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
>>>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>>>>>>> On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
>>>>>> Looking at the serial logs for that and comparing them with 10009,
>>>>>> it's not terribly easy to see what's going on because the kernel
>>>>>> versions are different and so produce different messages about xenbr0
>>>>>> (and I think may have a different bridge port management algorithm).
>>>>>>
>>>>>> But the messages about promiscuous mode seem the same, and of course
>>>>>> promiscuous mode is controlled by userspace, rather than by the kernel
>>>>>> (so should be the same in both).
>>>>>>
>>>>>> However, in the failed test we see extra messages about promis:
>>>>>>
>>>>>>   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>>>>>   ...
>>>>>>   Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>>>>
>>>>> Wouldn't those be another result of the guest shutting down /
>>>>> being shut down?
>>>>>
>>>>>> Also, the qemu log for the guest in the failure case says this:
>>>>>>
>>>>>>   Log-dirty command enable
>>>>>>   Log-dirty: no command yet.
>>>>>>   reset requested in cpu_handle_ioreq.
>>>>>
>>>>> So this would seem to call for instrumentation on the qemu side
>>>>> then, as the only path via which this can be initiated is - afaics -
>>>>> qemu_system_reset_request(), which doesn't have very many
>>>>> callers that could possibly be of interest here. Adding Stefano ...
>>>>
>>>> I am pretty sure that those messages come from qemu traditional: "reset
>>>> requested in cpu_handle_ioreq" is not printed by qemu-xen.
>>>
>>> Oh, indeed - I didn't pay attention to this being a *-qemut-*
>>> test. I'm sorry.
>>>
>>>> In any case, the request comes from qemu_system_reset_request, which is
>>>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>>>> initiated the reset (or resume)?
>>>
>>> Right, this and hw/pckbd.c look to be the only possible
>>> sources. Yet then it's still unclear what makes the guest go
>>> down.
>>
>> So with all of the above in mind I wonder whether we shouldn't
>> revert 933f966bcd then - that debugging code is unlikely to help
>> with any further analysis of the issue, as reaching that code
>> for a dying domain is only a symptom as far as we understand it
>> now, not anywhere near the cause.
> 
> Are you suggesting to revert on Xen 4.9?

Yes, if we revert now, then I'd say on both master and 4.9.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

