
Re: [Xen-devel] [BUG] Emulation issues


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
  • From: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
  • Date: Fri, 31 Jul 2015 14:19:51 +0000
  • Accept-language: en-GB, en-US
  • Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 31 Jul 2015 14:20:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHQyee9+pAJgVEo1EWTXUn+dTE1Kp3yPJYA///i8QCAAEREAP//8vQAgAF1SaD//+BQgAAES/FAAAEf9JAAAH3vAAAETAAA///hPoD//95N8P/+XYGQgAMy/gD//9a/AAAGNNKA///eVuD//7VHsP//RlMw
  • Thread-topic: [Xen-devel] [BUG] Emulation issues

> -----Original Message-----
> From: Paul Durrant
> Sent: 31 July 2015 13:21
> To: Paul Durrant; Roger Pau Monne; Sander Eikelenboom
> Cc: Andrew Cooper; xen-devel
> Subject: RE: [Xen-devel] [BUG] Emulation issues
> 
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> > bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
> > Sent: 31 July 2015 12:43
> > To: Roger Pau Monne; Sander Eikelenboom
> > Cc: Andrew Cooper; xen-devel
> > Subject: Re: [Xen-devel] [BUG] Emulation issues
> >
> > > -----Original Message-----
> > > From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
> > > Sent: 31 July 2015 12:42
> > > To: Paul Durrant; Sander Eikelenboom
> > > Cc: Andrew Cooper; xen-devel
> > > Subject: Re: [Xen-devel] [BUG] Emulation issues
> > >
> > > El 31/07/15 a les 13.39, Paul Durrant ha escrit:
> > > >> -----Original Message-----
> > > >> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
> > > >> Sent: 31 July 2015 12:12
> > > >> To: Paul Durrant
> > > >> Cc: Andrew Cooper; Roger Pau Monne; xen-devel
> > > >> Subject: Re: [Xen-devel] [BUG] Emulation issues
> > > >>
> > > >>
> > > >> Friday, July 31, 2015, 12:22:16 PM, you wrote:
> > > >>
> > > >>>> -----Original Message-----
> > > >>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> > > >>>> bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
> > > >>>> Sent: 30 July 2015 14:20
> > > >>>> To: Andrew Cooper; Roger Pau Monne; xen-devel
> > > >>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
> > > >>>>
> > > >>>>> -----Original Message-----
> > > >>>>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> > > >>>>> Sent: 30 July 2015 14:19
> > > >>>>> To: Paul Durrant; Roger Pau Monne; xen-devel
> > > >>>>> Subject: Re: [BUG] Emulation issues
> > > >>>>>
> > > >>>>> On 30/07/15 14:12, Paul Durrant wrote:
> > > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > > >>>>>>> (XEN) domain_crash called from io.c:166
> > > >>>>>>> (XEN) d19v0 weird emulation state 1
> > > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > > >>>>>>> (XEN) domain_crash called from io.c:166
> > > >>>>>>> (XEN) d19v0 weird emulation state 1
> > > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > > >>>>>>> (XEN) domain_crash called from io.c:166
> > > >>>>>>>
> > > >>>>>> Hmm. Can't understand how that's happening... handle_pio()
> > > >>>>>> shouldn't be called unless the state is STATE_IORESP_READY and yet
> > > >>>>>> the inner function is hitting the default case in the switch.
> > > >>>>>
> > > >>>>> Sounds like something is changing the state between the two checks.
> > > >>>>> Is this shared memory writeable by qemu?
> > > >>>>>
> > > >>>>
> > > >>>> No, this is the internal state. I really can't see how it's being
> > > >>>> changed.
> > > >>>>
> > > >>
> > > >>> I've tried to replicate your test on my rig (which is an old AMD box
> > > >>> but quite a big one). Even so I only seem to get about half the VMs to
> > > >>> start. The shutdown works fine, and I don't see any problems on the Xen
> > > >>> console. I'm using an older build of Xen but still one with my series
> > > >>> in. I'll try pulling up to the same commit as you and try again.
> > > >>
> > > >>>   Paul
> > > >>
> > > >> Hi Paul,
> > > >>
> > > >> From what I recall it started around when Tiejun Chen's series went in.
> > > >>
> > >
> > > Since I can reproduce this at will I will attempt to perform a
> > > bisection. Maybe this can help narrow down the issue.
> > >
> >
> > Thanks. That would be very helpful. I will continue to try to repro.
> >
> 
> Still no luck with the repro, but I think my thought experiments might
> have got it...
> 
> If a vcpu has a request in-flight then its internal ioreq state will be
> IOREQ_READY and it will be waiting for wake-up. When it is woken up,
> hvm_do_resume() will be called and it will call hvm_wait_for_io(). If the
> shared (with QEMU) ioreq state is still IOREQ_READY or IOREQ_INPROCESS
> then the vcpu will block again. If the shared state is IORESP_READY then
> the emulation is done, and the internal state will be updated to
> IORESP_READY or IOREQ_NONE by hvm_io_assist(), depending upon whether any
> completion is needed.
> *However*, if the emulator (or Xen) happens to zero out the shared ioreq
> state before hvm_wait_for_io() is called, then hvm_wait_for_io() will see
> a shared state of IOREQ_NONE and terminate without calling hvm_io_assist(),
> leaving the internal ioreq state as IOREQ_READY. That is what then causes
> the domain_crash() you're seeing when re-emulation is attempted by a
> completion handler.
> 
> So, there is an underlying problem in that a dying emulator can leave an
> I/O uncompleted, but the code in Xen needs to cope more gracefully with
> that (since the vcpu will be going away anyway) and not call domain_crash().
> 

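To make the race concrete, here is a minimal standalone model of it. The
names and the wait loop are deliberately simplified and hypothetical (this
is not the Xen code itself; the actual fix is the patch below), but it shows
how a shared state zeroed to NONE makes the wait loop exit without ever
completing the vcpu's internal state:

#include <stdio.h>

enum state { IOREQ_NONE, IOREQ_READY, IOREQ_INPROCESS, IORESP_READY };

static enum state shared_state;    /* lives in the page shared with the emulator */
static enum state internal_state;  /* Xen's private per-vcpu view */

/* Much-simplified stand-in for the pre-patch wait loop: it is keyed
 * entirely off the shared state. */
static void wait_for_io(void)
{
    while (shared_state != IOREQ_NONE) {
        switch (shared_state) {
        case IORESP_READY:
            /* Response arrived: complete the internal state too
             * (the hvm_io_assist() step). */
            shared_state = IOREQ_NONE;
            internal_state = IOREQ_NONE;
            break;
        case IOREQ_READY:
        case IOREQ_INPROCESS:
            /* Emulator still busy: the real code blocks on the event
             * channel here; the model just gives up. */
            return;
        default:
            printf("weird state %d -> domain_crash()\n", (int)shared_state);
            return;
        }
    }
}

int main(void)
{
    /* A request has been sent: both views say READY. */
    shared_state = internal_state = IOREQ_READY;

    /* A dying emulator (or Xen) zeroes the shared page before the vcpu
     * is resumed... */
    shared_state = IOREQ_NONE;

    /* ...so the wait loop exits immediately and the internal state is
     * never completed. */
    wait_for_io();

    printf("internal state after resume: %d (still IOREQ_READY)\n",
           (int)internal_state);
    return 0;
}

The internal state left at IOREQ_READY (value 1) matches the "weird
emulation state 1" messages in the log above.
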
Can you please try this patch:

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ec1d797..197a8c4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -412,44 +412,52 @@ bool_t hvm_io_pending(struct vcpu *v)
                           &d->arch.hvm_domain.ioreq_server.list,
                           list_entry )
     {
-        ioreq_t *p = get_ioreq(s, v);
+        struct hvm_ioreq_vcpu *sv;

-        if ( p->state != STATE_IOREQ_NONE )
-            return 1;
+        list_for_each_entry ( sv,
+                              &s->ioreq_vcpu_list,
+                              list_entry )
+        {
+            if ( sv->vcpu == v && sv->pending )
+                return 1;
+        }
     }

     return 0;
 }

-static void hvm_io_assist(ioreq_t *p)
+static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
 {
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-
-    p->state = STATE_IOREQ_NONE;
+    struct vcpu *v = sv->vcpu;
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;

     if ( hvm_vcpu_io_need_completion(vio) )
     {
         vio->io_req.state = STATE_IORESP_READY;
-        vio->io_req.data = p->data;
+        vio->io_req.data = data;
     }
     else
         vio->io_req.state = STATE_IOREQ_NONE;

-    msix_write_completion(curr);
-    vcpu_end_shutdown_deferral(curr);
+    msix_write_completion(v);
+    vcpu_end_shutdown_deferral(v);
+
+    sv->pending = 0;
 }

 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
-    /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
-    while ( p->state != STATE_IOREQ_NONE )
+    while ( sv->pending )
     {
         switch ( p->state )
         {
+        case STATE_IOREQ_NONE:
+            hvm_io_assist(sv, ~0ul);
+            break;
         case STATE_IORESP_READY: /* IORESP_READY -> NONE */
             rmb(); /* see IORESP_READY /then/ read contents of ioreq */
-            hvm_io_assist(p);
+            p->state = STATE_IOREQ_NONE;
+            hvm_io_assist(sv, p->data);
             break;
         case STATE_IOREQ_READY:  /* IOREQ_{READY,INPROCESS} -> IORESP_READY */
         case STATE_IOREQ_INPROCESS:
@@ -459,6 +467,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ior
             break;
         default:
             gdprintk(XENLOG_ERR, "Weird HVM iorequest state %d.\n", p->state);
+            sv->pending = 0;
             domain_crash(sv->vcpu->domain);
             return 0; /* bail */
         }
@@ -489,7 +498,7 @@ void hvm_do_resume(struct vcpu *v)
                               &s->ioreq_vcpu_list,
                               list_entry )
         {
-            if ( sv->vcpu == v )
+            if ( sv->vcpu == v && sv->pending )
             {
                 if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) )
                     return;
@@ -2743,6 +2752,8 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *pr
              */
             p->state = STATE_IOREQ_READY;
             notify_via_xen_event_channel(d, port);
+
+            sv->pending = 1;
             return X86EMUL_RETRY;
         }
     }
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index b612755..12b5e0c 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -46,6 +46,7 @@ struct hvm_ioreq_vcpu {
     struct list_head list_entry;
     struct vcpu      *vcpu;
     evtchn_port_t    ioreq_evtchn;
+    bool_t           pending;
 };

 #define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
--

  Paul

>   Paul
> 
> 
> >   Paul
> >
> > > Roger.
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel