Re: [Xen-devel] [BUG] Emulation issues
On 31/07/15 at 16:19, Paul Durrant wrote:
>> -----Original Message-----
>> From: Paul Durrant
>> Sent: 31 July 2015 13:21
>> To: Paul Durrant; Roger Pau Monne; Sander Eikelenboom
>> Cc: Andrew Cooper; xen-devel
>> Subject: RE: [Xen-devel] [BUG] Emulation issues
>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
>>> Sent: 31 July 2015 12:43
>>> To: Roger Pau Monne; Sander Eikelenboom
>>> Cc: Andrew Cooper; xen-devel
>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>
>>>> -----Original Message-----
>>>> From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
>>>> Sent: 31 July 2015 12:42
>>>> To: Paul Durrant; Sander Eikelenboom
>>>> Cc: Andrew Cooper; xen-devel
>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>
>>>> On 31/07/15 at 13:39, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
>>>>>> Sent: 31 July 2015 12:12
>>>>>> To: Paul Durrant
>>>>>> Cc: Andrew Cooper; Roger Pau Monne; xen-devel
>>>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>>>
>>>>>> Friday, July 31, 2015, 12:22:16 PM, you wrote:
>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
>>>>>>>> Sent: 30 July 2015 14:20
>>>>>>>> To: Andrew Cooper; Roger Pau Monne; xen-devel
>>>>>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
>>>>>>>>> Sent: 30 July 2015 14:19
>>>>>>>>> To: Paul Durrant; Roger Pau Monne; xen-devel
>>>>>>>>> Subject: Re: [BUG] Emulation issues
>>>>>>>>>
>>>>>>>>> On 30/07/15 14:12, Paul Durrant wrote:
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>> (XEN) d19v0 weird emulation state 1
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>> (XEN) d19v0 weird emulation state 1
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>>
>>>>>>>>>> Hmm. Can't understand how that's happening... handle_pio() shouldn't
>>>>>>>>>> be called unless the state is STATE_IORESP_READY and yet the inner
>>>>>>>>>> function is hitting the default case in the switch.
>>>>>>>>>
>>>>>>>>> Sounds like something is changing the state between the two checks.
>>>>>>>>> Is this shared memory writeable by qemu?
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, this is the internal state. I really can't see how it's being changed.
>>>>>>>>
>>>>>>
>>>>>>> I've tried to replicate your test on my rig (which is an old AMD box but
>>>>>>> quite a big one). Even so I only seem to get about half the VMs to start.
>>>>>>> The shutdown works fine, and I don't see any problems on the Xen console.
>>>>>>> I'm using an older build of Xen but still one with my series in. I'll try
>>>>>>> pulling up to the same commit as you and try again.
>>>>>>
>>>>>>> Paul
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> From what I recall it started around when Tiejun Chen's series went in.
>>>>>>
>>>>
>>>> Since I can reproduce this at will I will attempt to perform a
>>>> bisection. Maybe this can help narrow down the issue.
>>>>
>>>
>>> Thanks. That would be very helpful.
>>> I will continue to try to repro.
>>>
>>
>> Still no luck with the repro, but I think my thought experiments might
>> have got it...
>>
>> If a vcpu has a request in-flight then its internal ioreq state will be
>> IOREQ_READY and it will be waiting for wake-up. When it is woken up,
>> hvm_do_resume() will be called and it will call hvm_wait_for_io(). If the
>> shared (with QEMU) ioreq state is still IOREQ_READY or IOREQ_INPROCESS
>> then the vcpu will block again. If the shared state is IORESP_READY then
>> the emulation is done and the internal state will be updated to
>> IORESP_READY or IOREQ_NONE by hvm_io_assist(), depending upon whether any
>> completion is needed or not.
>> *However*, if the emulator (or Xen) happens to zero out the shared ioreq
>> state before hvm_wait_for_io() is called, then it will see a shared state
>> of IOREQ_NONE and terminate without calling hvm_io_assist(), leaving the
>> internal ioreq state as IOREQ_READY, which will then cause the
>> domain_crash() you're seeing when re-emulation is attempted by a
>> completion handler.
>>
>> So, there is an underlying problem in that a dying emulator can leave an
>> I/O uncompleted, but the code in Xen needs to cope more gracefully with
>> that (since the vcpu will be going away anyway) and not call
>> domain_crash().
>>
>
> Can you please try this patch:
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ec1d797..197a8c4 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -412,44 +412,52 @@ bool_t hvm_io_pending(struct vcpu *v)
>                            &d->arch.hvm_domain.ioreq_server.list,
>                            list_entry )
>      {
> -        ioreq_t *p = get_ioreq(s, v);
> +        struct hvm_ioreq_vcpu *sv;
> 
> -        if ( p->state != STATE_IOREQ_NONE )
> -            return 1;
> +        list_for_each_entry ( sv,
> +                              &s->ioreq_vcpu_list,
> +                              list_entry )
> +        {
> +            if ( sv->vcpu == v && sv->pending )
> +                return 1;
> +        }
>      }
> 
>      return 0;
>  }
> 
> -static void hvm_io_assist(ioreq_t *p)
> +static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
>  {
> -    struct vcpu *curr = current;
> -    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> -
> -    p->state = STATE_IOREQ_NONE;
> +    struct vcpu *v = sv->vcpu;
> +    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
> 
>      if ( hvm_vcpu_io_need_completion(vio) )
>      {
>          vio->io_req.state = STATE_IORESP_READY;
> -        vio->io_req.data = p->data;
> +        vio->io_req.data = data;
>      }
>      else
>          vio->io_req.state = STATE_IOREQ_NONE;
> 
> -    msix_write_completion(curr);
> -    vcpu_end_shutdown_deferral(curr);
> +    msix_write_completion(v);
> +    vcpu_end_shutdown_deferral(v);
> +
> +    sv->pending = 0;
>  }
> 
>  static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
>  {
> -    /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
> -    while ( p->state != STATE_IOREQ_NONE )
> +    while ( sv->pending )
>      {
>          switch ( p->state )
>          {
> +        case STATE_IOREQ_NONE:
> +            hvm_io_assist(sv, ~0ul);
> +            break;
>          case STATE_IORESP_READY: /* IORESP_READY -> NONE */
>              rmb(); /* see IORESP_READY /then/ read contents of ioreq */
> -            hvm_io_assist(p);
> +            p->state = STATE_IOREQ_NONE;
> +            hvm_io_assist(sv, p->data);
>              break;
>          case STATE_IOREQ_READY:  /* IOREQ_{READY,INPROCESS} -> IORESP_READY */
>          case STATE_IOREQ_INPROCESS:
> @@ -459,6 +467,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ior
>              break;
>          default:
>              gdprintk(XENLOG_ERR, "Weird HVM iorequest state %d.\n",
>                       p->state);
> +            sv->pending = 0;
>              domain_crash(sv->vcpu->domain);
>              return 0; /* bail */
>          }
> @@ -489,7 +498,7 @@ void hvm_do_resume(struct vcpu *v)
>                                &s->ioreq_vcpu_list,
>                                list_entry )
>          {
> -            if ( sv->vcpu == v )
> +            if ( sv->vcpu == v && sv->pending )
>              {
>                  if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) )
>                      return;
> @@ -2743,6 +2752,8 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *pr
>           */
>          p->state = STATE_IOREQ_READY;
>          notify_via_xen_event_channel(d, port);
> +
> +        sv->pending = 1;
>          return X86EMUL_RETRY;
>      }
>  }
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index b612755..12b5e0c 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -46,6 +46,7 @@ struct hvm_ioreq_vcpu {
>      struct list_head list_entry;
>      struct vcpu      *vcpu;
>      evtchn_port_t    ioreq_evtchn;
> +    bool_t           pending;
>  };
> 
>  #define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> --

Thanks, this solves the issue for me; I've been able to shut down more than
40 HVM guests without issues. You can add my Tested-by when you formally
post the patch.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
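For readers following the thread, below is a minimal, stand-alone C sketch of
the race Paul describes and of the key idea in the patch (keying completion
off an internal per-vcpu pending flag rather than the shared ioreq state).
It is illustrative only: the types and helpers (shared_ioreq, vcpu_io,
io_complete, wait_for_io_old/new) are simplified stand-ins, not the real Xen
definitions.

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-ins for the shared ioreq page and the per-vcpu state. */
enum ioreq_state { STATE_IOREQ_NONE, STATE_IOREQ_READY,
                   STATE_IOREQ_INPROCESS, STATE_IORESP_READY };

struct shared_ioreq { enum ioreq_state state; uint64_t data; }; /* emulator-visible */
struct vcpu_io { enum ioreq_state internal_state; int pending; uint64_t data; };

/* Models hvm_io_assist(): record the response and mark the request done. */
static void io_complete(struct vcpu_io *v, uint64_t data)
{
    v->data = data;
    v->internal_state = STATE_IOREQ_NONE;
    v->pending = 0;
}

/* Pre-patch logic: loops on the *shared* state, so a shared state of
 * STATE_IOREQ_NONE (e.g. zeroed by a dying emulator) skips completion
 * entirely and leaves internal_state stuck at STATE_IOREQ_READY. */
static void wait_for_io_old(struct vcpu_io *v, struct shared_ioreq *p)
{
    while ( p->state != STATE_IOREQ_NONE )
    {
        if ( p->state == STATE_IORESP_READY )
        {
            p->state = STATE_IOREQ_NONE;
            io_complete(v, p->data);
        }
        else
            return;             /* would block and retry in the real code */
    }
}

/* Post-patch logic: loops on the internal pending flag instead, so even a
 * zeroed shared state results in a (dummy) completion. */
static void wait_for_io_new(struct vcpu_io *v, struct shared_ioreq *p)
{
    while ( v->pending )
    {
        switch ( p->state )
        {
        case STATE_IOREQ_NONE:
            io_complete(v, ~0ul);       /* emulator went away: fake a response */
            break;
        case STATE_IORESP_READY:
            p->state = STATE_IOREQ_NONE;
            io_complete(v, p->data);
            break;
        default:
            return;             /* would block and retry in the real code */
        }
    }
}

int main(void)
{
    /* The problem case: a request is outstanding, but the shared state was zeroed. */
    struct shared_ioreq shared = { STATE_IOREQ_NONE, 0 };
    struct vcpu_io a = { STATE_IOREQ_READY, 1, 0 };
    struct vcpu_io b = { STATE_IOREQ_READY, 1, 0 };

    wait_for_io_old(&a, &shared);
    wait_for_io_new(&b, &shared);

    printf("old: internal_state=%d (stuck at STATE_IOREQ_READY -> later domain_crash)\n",
           a.internal_state);
    printf("new: internal_state=%d pending=%d (completed with dummy data)\n",
           b.internal_state, b.pending);
    return 0;
}

Built with any C compiler, the first line of output shows the old logic
leaving the internal state at STATE_IOREQ_READY (the "weird emulation
state 1" in the log above), while the second shows the patched logic
completing the orphaned request instead of crashing the domain.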