
Re: [Xen-devel] [BUG] Emulation issues



On 31/07/15 16:19, Paul Durrant wrote:
>> -----Original Message-----
>> From: Paul Durrant
>> Sent: 31 July 2015 13:21
>> To: Paul Durrant; Roger Pau Monne; Sander Eikelenboom
>> Cc: Andrew Cooper; xen-devel
>> Subject: RE: [Xen-devel] [BUG] Emulation issues
>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
>>> bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
>>> Sent: 31 July 2015 12:43
>>> To: Roger Pau Monne; Sander Eikelenboom
>>> Cc: Andrew Cooper; xen-devel
>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>
>>>> -----Original Message-----
>>>> From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
>>>> Sent: 31 July 2015 12:42
>>>> To: Paul Durrant; Sander Eikelenboom
>>>> Cc: Andrew Cooper; xen-devel
>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>
>>>> On 31/07/15 13:39, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
>>>>>> Sent: 31 July 2015 12:12
>>>>>> To: Paul Durrant
>>>>>> Cc: Andrew Cooper; Roger Pau Monne; xen-devel
>>>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>>>
>>>>>>
>>>>>> Friday, July 31, 2015, 12:22:16 PM, you wrote:
>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
>>>>>>>> bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
>>>>>>>> Sent: 30 July 2015 14:20
>>>>>>>> To: Andrew Cooper; Roger Pau Monne; xen-devel
>>>>>>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
>>>>>>>>> Sent: 30 July 2015 14:19
>>>>>>>>> To: Paul Durrant; Roger Pau Monne; xen-devel
>>>>>>>>> Subject: Re: [BUG] Emulation issues
>>>>>>>>>
>>>>>>>>> On 30/07/15 14:12, Paul Durrant wrote:
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>> (XEN) d19v0 weird emulation state 1
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>> (XEN) d19v0 weird emulation state 1
>>>>>>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
>>>>>>>>>>> (XEN) domain_crash called from io.c:166
>>>>>>>>>>>
>>>>>>>>>> Hmm. Can't understand how that's happening... handle_pio()
>>>>>>>>>> shouldn't be called unless the state is STATE_IORESP_READY and
>>>>>>>>>> yet the inner function is hitting the default case in the switch.
>>>>>>>>>
>>>>>>>>> Sounds like something is changing the state between the two checks.
>>>>>>>>> Is this shared memory writeable by qemu?
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, this is the internal state. I really can't see how it's being
>>>>>>>> changed.
>>>>>>>>
>>>>>>
>>>>>>> I've tried to replicate your test on my rig (which is an old AMD box
>>>>>>> but quite a big one). Even so I only seem to get about half the VMs
>>>>>>> to start. The shutdown works fine, and I don't see any problems on
>>>>>>> the Xen console. I'm using an older build of Xen but still one with
>>>>>>> my series in. I'll try pulling up to the same commit as you and try
>>>>>>> again.
>>>>>>>
>>>>>>>   Paul
>>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> From what I recall, it started around when Tiejun Chen's series went in.
>>>>>>
>>>>
>>>> Since I can reproduce this at will I will attempt to perform a
>>>> bisection. Maybe this can help narrow down the issue.
>>>>
>>>
>>> Thanks. That would be very helpful. I will continue to try to repro.
>>>
>>
>> Still no luck with the repro, but I think my thought experiments might
>> have got it...
>>
>> If a vcpu has a request in-flight then its internal ioreq state will be
>> IOREQ_READY and it will be waiting for wake-up. When it is woken up
>> hvm_do_resume() will be called and it will call hvm_wait_for_io(). If the
>> shared (with QEMU) ioreq state is still IOREQ_READY or IOREQ_INPROCESS
>> then the vcpu will block again. If the shared state is IORESP_READY then the
>> emulation is done and the internal state will be updated to IORESP_READY or
>> IOREQ_NONE by hvm_io_assist() depending upon whether any completion
>> is needed or not.
>> *However*, if the emulator (or Xen) happens to zero out the shared ioreq
>> state before hvm_wait_for_io() is called, then hvm_wait_for_io() will see
>> a shared state of IOREQ_NONE and terminate without calling hvm_io_assist(),
>> leaving the internal ioreq state as IOREQ_READY. That in turn causes the
>> domain_crash() you're seeing when re-emulation is attempted by a
>> completion handler.
>>
>> So, there is an underlying problem in that a dying emulator can leave an
>> I/O uncompleted, but the code in Xen needs to cope more gracefully with
>> that (since the vcpu will be going away anyway) and not call domain_crash().
>>
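
For illustration, here is a minimal standalone model of the race described
above. It is not Xen source; the variable names and the simplified wait loop
are invented for the sketch. "shared" stands for the ioreq state that the
emulator can write, and "internal" for vio->io_req.state, which Xen keeps
privately. The pre-patch loop is keyed entirely off the shared state, so
zeroing that state skips the completion path:

/* Standalone model of the pre-patch flow (illustration only). */
#include <stdio.h>

enum ioreq_state { STATE_IOREQ_NONE, STATE_IOREQ_READY,
                   STATE_IOREQ_INPROCESS, STATE_IORESP_READY };

static enum ioreq_state shared;    /* emulator-writable shared ioreq state */
static enum ioreq_state internal;  /* Xen's private record of the I/O      */

/* Models the old hvm_wait_for_io(): driven entirely by 'shared'. */
static void wait_for_io(void)
{
    while (shared != STATE_IOREQ_NONE) {
        if (shared == STATE_IORESP_READY) {
            /* hvm_io_assist() equivalent: completes the internal state too. */
            shared   = STATE_IOREQ_NONE;
            internal = STATE_IOREQ_NONE;
        } else {
            return;  /* would block on the event channel and retry later */
        }
    }
    /* shared == IOREQ_NONE: the loop body never ran, 'internal' untouched. */
}

int main(void)
{
    /* An I/O is in flight: both views say READY. */
    shared = internal = STATE_IOREQ_READY;

    /* A dying emulator (or Xen tearing it down) zeroes the shared state... */
    shared = STATE_IOREQ_NONE;

    /* ...so the wait loop exits without ever completing the request. */
    wait_for_io();
    printf("internal state = %d (still IOREQ_READY)\n", internal);
    return 0;
}
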
> 
> Can you please try this patch:
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ec1d797..197a8c4 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -412,44 +412,52 @@ bool_t hvm_io_pending(struct vcpu *v)
>                            &d->arch.hvm_domain.ioreq_server.list,
>                            list_entry )
>      {
> -        ioreq_t *p = get_ioreq(s, v);
> +        struct hvm_ioreq_vcpu *sv;
> 
> -        if ( p->state != STATE_IOREQ_NONE )
> -            return 1;
> +        list_for_each_entry ( sv,
> +                              &s->ioreq_vcpu_list,
> +                              list_entry )
> +        {
> +            if ( sv->vcpu == v && sv->pending )
> +                return 1;
> +        }
>      }
> 
>      return 0;
>  }
> 
> -static void hvm_io_assist(ioreq_t *p)
> +static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
>  {
> -    struct vcpu *curr = current;
> -    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> -
> -    p->state = STATE_IOREQ_NONE;
> +    struct vcpu *v = sv->vcpu;
> +    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
> 
>      if ( hvm_vcpu_io_need_completion(vio) )
>      {
>          vio->io_req.state = STATE_IORESP_READY;
> -        vio->io_req.data = p->data;
> +        vio->io_req.data = data;
>      }
>      else
>          vio->io_req.state = STATE_IOREQ_NONE;
> 
> -    msix_write_completion(curr);
> -    vcpu_end_shutdown_deferral(curr);
> +    msix_write_completion(v);
> +    vcpu_end_shutdown_deferral(v);
> +
> +    sv->pending = 0;
>  }
> 
>  static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
>  {
> -    /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
> -    while ( p->state != STATE_IOREQ_NONE )
> +    while ( sv->pending )
>      {
>          switch ( p->state )
>          {
> +        case STATE_IOREQ_NONE:
> +            hvm_io_assist(sv, ~0ul);
> +            break;
>          case STATE_IORESP_READY: /* IORESP_READY -> NONE */
>              rmb(); /* see IORESP_READY /then/ read contents of ioreq */
> -            hvm_io_assist(p);
> +            p->state = STATE_IOREQ_NONE;
> +            hvm_io_assist(sv, p->data);
>              break;
>          case STATE_IOREQ_READY:  /* IOREQ_{READY,INPROCESS} -> IORESP_READY */
>          case STATE_IOREQ_INPROCESS:
> @@ -459,6 +467,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
>              break;
>          default:
>              gdprintk(XENLOG_ERR, "Weird HVM iorequest state %d.\n", p->state);
> +            sv->pending = 0;
>              domain_crash(sv->vcpu->domain);
>              return 0; /* bail */
>          }
> @@ -489,7 +498,7 @@ void hvm_do_resume(struct vcpu *v)
>                                &s->ioreq_vcpu_list,
>                                list_entry )
>          {
> -            if ( sv->vcpu == v )
> +            if ( sv->vcpu == v && sv->pending )
>              {
>                  if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) )
>                      return;
> @@ -2743,6 +2752,8 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *pr
>               */
>              p->state = STATE_IOREQ_READY;
>              notify_via_xen_event_channel(d, port);
> +
> +            sv->pending = 1;
>              return X86EMUL_RETRY;
>          }
>      }
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index b612755..12b5e0c 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -46,6 +46,7 @@ struct hvm_ioreq_vcpu {
>      struct list_head list_entry;
>      struct vcpu      *vcpu;
>      evtchn_port_t    ioreq_evtchn;
> +    bool_t           pending;
>  };
> 
>  #define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> --

Thanks, this solves the issue for me; I've been able to shut down 40+ HVM
guests without issues. You can add my Tested-by when you formally post
the patch.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

