
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL

To: Jan Beulich <JBeulich@xxxxxxxx>
From: Julien Grall <julien.grall@xxxxxxx>
Date: Tue, 13 Jun 2017 10:30:09 +0100
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, osstest-admin@xxxxxxxxxxxxxx, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd@xxxxxxx
Delivery-date: Tue, 13 Jun 2017 09:30:25 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi Jan,

On 12/06/2017 15:57, Jan Beulich wrote:

On 12.06.17 at 16:30, <julien.grall@xxxxxxx> wrote:

On 09/06/17 09:19, Jan Beulich wrote:

On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:

On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:

On Tue, 6 Jun 2017, Jan Beulich wrote:

On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:

Looking at the serial logs for that and comparing them with 10009,
it's not terribly easy to see what's going on because the kernel
versions are different and so produce different messages about xenbr0
(and I think may have a different bridge port management algorithm).

But the messages about promiscuous mode seem the same, and of course
promiscuous mode is controlled by userspace, rather than by the kernel
(so should be the same in both).

However, in the failed test we see extra messages about promiscuous mode:

  Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
  ...
  Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
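
For anyone wanting to repeat this comparison between a passing and a failing
flight, the following is a minimal sketch rather than anything osstest
provides. It assumes the serial log lines look like the two quoted above, and
that the matching "entered promiscuous mode" message has the same shape. The
function names promisc_events() and extra_events() are made up for
illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the two log lines quoted in this thread:
#   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
# \s+ also matches a newline, so lines broken by mail wrapping still match.
PROMISC_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+device\s+(?P<dev>\S+)\s+"
    r"(?P<action>entered|left)\s+promiscuous\s+mode"
)


def promisc_events(log_text):
    """Return (kernel_uptime, device, action) for each promiscuous-mode message."""
    return [
        (float(m["uptime"]), m["dev"], m["action"])
        for m in PROMISC_RE.finditer(log_text)
    ]


def extra_events(good_log, bad_log):
    """Return (device, action) pairs seen more often in bad_log than in good_log."""
    good = Counter((dev, act) for _, dev, act in promisc_events(good_log))
    bad = Counter((dev, act) for _, dev, act in promisc_events(bad_log))
    return sorted((bad - good).elements())
```

Comparing on (device, action) pairs rather than on raw text sidesteps the
timestamp and kernel-version differences mentioned above, although vif
numbering can still differ between flights.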


Wouldn't those be another result of the guest shutting down /
being shut down?

Also, the qemu log for the guest in the failure case says this:

  Log-dirty command enable
  Log-dirty: no command yet.
  reset requested in cpu_handle_ioreq.


So this would seem to call for instrumentation on the qemu side
then, as the only path via which this can be initiated is - afaics -
qemu_system_reset_request(), which doesn't have very many
callers that could possibly be of interest here. Adding Stefano ...


I am pretty sure that those messages come from qemu traditional: "reset
requested in cpu_handle_ioreq" is not printed by qemu-xen.


Oh, indeed - I didn't pay attention to this being a *-qemut-*
test. I'm sorry.

In any case, the request comes from qemu_system_reset_request, which is
called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
initiated the reset (or resume)?


Right, this and hw/pckbd.c look to be the only possible
sources. Yet then it's still unclear what makes the guest go
down.


So with all of the above in mind I wonder whether we shouldn't
revert 933f966bcd then - that debugging code is unlikely to help
with any further analysis of the issue, as reaching that code
for a dying domain is only a symptom as far as we understand it
now, not anywhere near the cause.


Are you suggesting reverting it on Xen 4.9 as well?


Yes, if we revert now, then I'd say on both master and 4.9.


I would be ok with that.

Cheers,

--
Julien Grall

