
Re: [Xen-devel] [BUG] Emulation issues



> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
> Sent: 31 July 2015 12:43
> To: Roger Pau Monne; Sander Eikelenboom
> Cc: Andrew Cooper; xen-devel
> Subject: Re: [Xen-devel] [BUG] Emulation issues
> 
> > -----Original Message-----
> > From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
> > Sent: 31 July 2015 12:42
> > To: Paul Durrant; Sander Eikelenboom
> > Cc: Andrew Cooper; xen-devel
> > Subject: Re: [Xen-devel] [BUG] Emulation issues
> >
> > El 31/07/15 a les 13.39, Paul Durrant ha escrit:
> > >> -----Original Message-----
> > >> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
> > >> Sent: 31 July 2015 12:12
> > >> To: Paul Durrant
> > >> Cc: Andrew Cooper; Roger Pau Monne; xen-devel
> > >> Subject: Re: [Xen-devel] [BUG] Emulation issues
> > >>
> > >>
> > >> Friday, July 31, 2015, 12:22:16 PM, you wrote:
> > >>
> > >>>> -----Original Message-----
> > >>>> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> > >>>> bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
> > >>>> Sent: 30 July 2015 14:20
> > >>>> To: Andrew Cooper; Roger Pau Monne; xen-devel
> > >>>> Subject: Re: [Xen-devel] [BUG] Emulation issues
> > >>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> > >>>>> Sent: 30 July 2015 14:19
> > >>>>> To: Paul Durrant; Roger Pau Monne; xen-devel
> > >>>>> Subject: Re: [BUG] Emulation issues
> > >>>>>
> > >>>>> On 30/07/15 14:12, Paul Durrant wrote:
> > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > >>>>>>> (XEN) domain_crash called from io.c:166
> > >>>>>>> (XEN) d19v0 weird emulation state 1
> > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > >>>>>>> (XEN) domain_crash called from io.c:166
> > >>>>>>> (XEN) d19v0 weird emulation state 1
> > >>>>>>> (XEN) io.c:165:d19v0 Weird HVM ioemulation status 1.
> > >>>>>>> (XEN) domain_crash called from io.c:166
> > >>>>>>>
> > >>>>>> Hmm. Can't understand how that's happening... handle_pio()
> > >>>>>> shouldn't be called unless the state is STATE_IORESP_READY and
> > >>>>>> yet the inner function is hitting the default case in the switch.
> > >>>>>
> > >>>>> Sounds like something is changing the state between the two
> > >>>>> checks. Is this shared memory writeable by qemu?
> > >>>>>
> > >>>>
> > >>>> No, this is the internal state. I really can't see how it's being 
> > >>>> changed.
> > >>>>
> > >>
> > >>> I've tried to replicate your test on my rig (which is an old AMD box
> > >>> but quite a big one). Even so I only seem to get about half the VMs to
> > >>> start. The shutdown works fine, and I don't see any problems on the Xen
> > >>> console. I'm using an older build of Xen but still one with my series
> > >>> in. I'll try pulling up to the same commit as you and try again.
> > >>
> > >>>   Paul
> > >>
> > >> Hi Paul,
> > >>
> > >> From what I recall it started around when Tiejun Chen's series went in.
> > >>
> >
> > Since I can reproduce this at will I will attempt to perform a
> > bisection. Maybe this can help narrow down the issue.
> >
> 
> Thanks. That would be very helpful. I will continue to try to repro.
> 

Still no luck with the repro, but I think my thought experiments might have
got it...

If a vcpu has a request in-flight then its internal ioreq state will be
IOREQ_READY and it will be waiting for wake-up. When it is woken up,
hvm_do_resume() will be called and it will call hvm_wait_for_io(). If the
shared (with QEMU) ioreq state is still IOREQ_READY or IOREQ_INPROCESS then the
vcpu will block again. If the shared state is IORESP_READY then the emulation
is done and hvm_io_assist() will update the internal state to IORESP_READY or
IOREQ_NONE, depending upon whether any completion is needed.

*However*, if the emulator (or Xen) happens to zero out the shared ioreq state
before hvm_wait_for_io() is called, then it will see a shared state of
IOREQ_NONE and terminate without calling hvm_io_assist(). That leaves the
internal ioreq state as IOREQ_READY, which then causes the domain_crash()
you're seeing when a completion handler attempts re-emulation.
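
To make the sequence above concrete, here is a minimal stand-alone model of
it. This is a sketch only, not the Xen code: the enum, the struct and the
model_* functions are hypothetical names that merely mirror the shorthand
state names used in this mail, and the "crash" is reduced to a boolean.

```c
#include <stdbool.h>

/* Hypothetical states mirroring the shorthand above (not Xen's definitions). */
enum ioreq_state { IOREQ_NONE, IOREQ_READY, IOREQ_INPROCESS, IORESP_READY };

/* Hypothetical per-vcpu view: Xen-internal state plus the emulator-shared one. */
struct vcpu_model {
    enum ioreq_state internal;   /* private to Xen */
    enum ioreq_state shared;     /* visible to the emulator */
    bool needs_completion;       /* does the I/O need a completion step? */
};

/* Stand-in for hvm_io_assist(): moves the internal state forward. */
static void model_io_assist(struct vcpu_model *v)
{
    v->internal = v->needs_completion ? IORESP_READY : IOREQ_NONE;
}

/*
 * Stand-in for hvm_wait_for_io().
 * Returns false if the vcpu must block again, true if it may proceed.
 */
static bool model_wait_for_io(struct vcpu_model *v)
{
    switch (v->shared) {
    case IOREQ_READY:
    case IOREQ_INPROCESS:
        return false;            /* emulator still busy: block again */
    case IORESP_READY:
        model_io_assist(v);      /* normal path: internal state updated */
        return true;
    case IOREQ_NONE:
    default:
        return true;             /* terminates WITHOUT model_io_assist() */
    }
}

/* True when re-emulation would hit the case described as domain_crash(). */
static bool model_reemulation_crashes(const struct vcpu_model *v)
{
    return v->internal == IOREQ_READY;
}
```

In this model, starting from internal = IOREQ_READY, a shared state of
IORESP_READY leaves the internal state at IORESP_READY or IOREQ_NONE, whereas
a shared state that has been zeroed to IOREQ_NONE leaves the internal state
stuck at IOREQ_READY, which is the condition that trips the crash.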

So, there is an underlying problem in that a dying emulator can leave an I/O 
uncompleted but the code in Xen needs to cope more gracefully with that (since 
the vcpu will be going away anyway) and not call domain_crash().
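
One possible shape for "coping more gracefully" is sketched below. It is an
assumption for illustration only, not a proposed patch: the types and the
tolerant_wait_for_io() function are hypothetical, and the choice to simply
abandon the outstanding request when the shared state has been cleared is
just one option.

```c
#include <stdbool.h>

/* Hypothetical states mirroring the shorthand above (not Xen's definitions). */
enum ioreq_state { IOREQ_NONE, IOREQ_READY, IOREQ_INPROCESS, IORESP_READY };

/* Hypothetical per-vcpu view: Xen-internal state plus the emulator-shared one. */
struct vcpu_io {
    enum ioreq_state internal;
    enum ioreq_state shared;
};

/*
 * Hypothetical tolerant variant of the wait step.
 * Returns false if the vcpu must block again, true if it may proceed.
 * Completion handling is collapsed to a single state for brevity.
 */
static bool tolerant_wait_for_io(struct vcpu_io *v)
{
    switch (v->shared) {
    case IOREQ_READY:
    case IOREQ_INPROCESS:
        return false;                 /* emulator still busy: block again */
    case IORESP_READY:
        v->internal = IORESP_READY;   /* response arrived normally */
        return true;
    case IOREQ_NONE:
    default:
        /*
         * Shared slot was cleared while a request was outstanding
         * (e.g. the emulator is going away). Abandon the request
         * instead of leaving a stale IOREQ_READY for later code.
         */
        if (v->internal == IOREQ_READY)
            v->internal = IOREQ_NONE;
        return true;
    }
}
```

With this variant, a zeroed shared state leaves the internal state at
IOREQ_NONE, so nothing later attempts re-emulation on a stale request.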

  Paul


>   Paul
> 
> > Roger.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel



 

