RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
> -----Original Message-----
> From: 'Marek Marczykowski-Górecki' <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: 05 June 2020 18:14
> To: paul@xxxxxxx
> Cc: 'Jan Beulich' <jbeulich@xxxxxxxx>; 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>; 'xen-devel' <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
>
> On Fri, Jun 05, 2020 at 04:48:39PM +0100, Paul Durrant wrote:
> > This (untested) patch might help:
>
> It is different now. I don't get domain_crash because of
> X86EMUL_UNHANDLEABLE anymore, but I still see handle_pio looping for
> some time. But it eventually ends, not really sure why.

That'll be the shutdown deferral, which I realised later that I forgot about...

> I've tried the patch with a modification to make it build:
>
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index c55c4bc4bc..8aa8779ba2 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -109,12 +109,7 @@ static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
> >      ioreq_t *ioreq = &v->arch.hvm.hvm_io.io_req;
> >
> >      if ( hvm_ioreq_needs_completion(ioreq) )
> > -    {
> > -        ioreq->state = STATE_IORESP_READY;
> >          ioreq->data = data;
> > -    }
> > -    else
> > -        ioreq->state = STATE_IOREQ_NONE;
> >
> >      msix_write_completion(v);
> >      vcpu_end_shutdown_deferral(v);

In fact, move both of these lines...

> > @@ -209,6 +204,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
> >          }
> >      }
> >
> > +    ioreq->state = hvm_ioreq_needs_completion(&vio->ioreq) ?

vio->io_req->state                          ... &vio->io_req

> > +        STATE_IORESP_READY : STATE_IOREQ_NONE;
> > +

... to here too.
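I.e. the tail of handle_hvm_io_completion() would end up looking something
like this (an untested sketch only, folding in the io_req naming noted
above and assuming vio here is &v->arch.hvm.hvm_io, so vio->io_req is the
same ioreq_t the first hunk takes a pointer to):

    /* Set the completion state in one place, after waiting on the ioreq
     * server(s), rather than in hvm_io_assist()... */
    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
                        STATE_IORESP_READY : STATE_IOREQ_NONE;

    /* ...and do the write-completion/deferral bookkeeping moved out of
     * hvm_io_assist() here as well. */
    msix_write_completion(v);
    vcpu_end_shutdown_deferral(v);

    /* Existing code continues from here. */
    io_completion = vio->io_completion;
    vio->io_completion = HVMIO_no_completion;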
> >      io_completion = vio->io_completion;
> >      vio->io_completion = HVMIO_no_completion;
>
> The full patch (together with my debug prints):
> https://gist.github.com/marmarek/da37da3722179057a6e7add4fb361e06
>
> Note some of those X86EMUL_UNHANDLEABLE logged below are about an
> intermediate state, not really hvmemul_do_io return value.
>
> And the log:
> (XEN) hvm.c:1620:d6v0 All CPUs offline -- powering off.
> (XEN) d3v0 handle_pio port 0xb004 read 0x0000 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d3v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d3v0 handle_pio port 0xb004 read 0x0000 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) d3v0 handle_pio port 0xb004 write 0x0001 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d3v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d3v0 handle_pio port 0xb004 write 0x2001 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d3v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 reason 0
> (XEN) d4v0 domain 3 domain_shutdown vcpu_id 0 defer_shutdown 1
> (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 done
> (XEN) d4v0 hvm_destroy_ioreq_server called for 3, id 0
> (XEN) hvm.c:1620:d5v0 All CPUs offline -- powering off.
> (XEN) d1v0 handle_pio port 0xb004 read 0x0000 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d1v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d1v0 handle_pio port 0xb004 read 0x0000 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) d1v0 handle_pio port 0xb004 write 0x0001 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d1v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d1v0 handle_pio port 0xb004 write 0x2001 is_shutting_down 0 defer_shutdown 0 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d1v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d2v0 XEN_DMOP_remote_shutdown domain 1 reason 0
> (XEN) d2v0 domain 1 domain_shutdown vcpu_id 0 defer_shutdown 1
> (XEN) d2v0 XEN_DMOP_remote_shutdown domain 1 done
> (XEN) grant_table.c:3702:d0v0 Grant release 0x24 ref 0x199 flags 0x2 d5
> (XEN) grant_table.c:3702:d0v0 Grant release 0x25 ref 0x19a flags 0x2 d5
> (XEN) grant_table.c:3702:d0v0 Grant release 0x3 ref 0x11d flags 0x2 d6
> (XEN) grant_table.c:3702:d0v0 Grant release 0x4 ref 0x11e flags 0x2 d6
> (XEN) d3v0 handle_pio port 0xb004 read 0x0000 is_shutting_down 1 defer_shutdown 1 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d3v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d3v0 handle_pio port 0xb004 write 0xe3f8 is_shutting_down 1 defer_shutdown 1 paused_for_shutdown 0 is_shut_down 0
> (XEN) emulate.c:263:d3v0 hvmemul_do_io got X86EMUL_UNHANDLEABLE from hvm_io_intercept req state 1
> (XEN) d3v0 handle_pio port 0xb000 read 0x0000 is_shutting_down 1 defer_shutdown 1 paused_for_shutdown 0 is_shut_down 0
> (XEN) d3v0 handle_pio port 0xb000 read 0x0000 is_shutting_down 1 defer_shutdown 1 paused_for_shutdown 0 is_shut_down 0
>
> The last one repeats for some time, like 30s or some more (18425 times).
> Note the port is different than before. Is it a guest waiting for being
> destroyed after requesting so?

I guess it is the destroy being held off by the shutdown deferral? Hopefully the above tweaks should sort that out.
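(To spell out the deferral: while v->defer_shutdown is set, domain_shutdown()
leaves that vcpu running rather than pausing it, so the guest keeps
re-executing the port I/O until the in-flight emulation completes and the
deferral is dropped. Roughly, paraphrasing the common code from memory, so
details may differ:)

    void vcpu_end_shutdown_deferral(struct vcpu *v)
    {
        v->defer_shutdown = 0;
        smp_mb(); /* clear the deferral flag, /then/ re-check for shutdown */
        if ( unlikely(v->domain->is_shutting_down) )
            vcpu_check_shutdown(v); /* the vcpu can now be paused for shutdown */
    }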
  Paul

> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?