
Re: [Xen-devel] [Patch] x86/HVM: Fix RTC interrupt modelling

At 13:59 +0000 on 11 Feb (1392123546), Andrew Cooper wrote:
> On 11/02/14 13:15, Tim Deegan wrote:
> > At 12:50 +0000 on 11 Feb (1392119457), Jan Beulich wrote:
> >>>>> On 11.02.14 at 13:11, Tim Deegan <tim@xxxxxxx> wrote:
> >>> At 09:15 +0000 on 11 Feb (1392106520), Jan Beulich wrote:
> >>>>>>> On 10.02.14 at 18:21, Tim Deegan <tim@xxxxxxx> wrote:
> >>>>> That is the main change of this cset:  we go back to driving
> >>>>> the interrupt from the vpt code and fixing up the RTC state after vpt
> >>>>> tells us it's injected an interrupt.
> >>>> And that's what is wrong imo, as it doesn't allow driving PF correctly
> >>>> when !PIE.
> >>> Oh, I see -- the current code doesn't turn the vpt off when !PIE.  Can
> >>> you remember why not?  Have I forgotten some wrinkle or race here?
> >> Because an OS could inspect PF without setting PIE.
> > Ugh. :( 
> >
> >>>>> Yeah, this has nothing to do with the bug being fixed here.  The old
> >>>>> REG_C read was operating correctly, but on the return-to-guest path:
> >>>>>  - vpt sees another RTC interrupt is due and calls RTC code
> >>>>>  - RTC code sees REG_C clear, sets PF|IRQF and asserts the line
> >>>>>  - vlapic code sees the last interrupt is still in the ISR and does
> >>>>>    nothing;
> >>>>>  - we return to the guest having set IRQF but not consumed a timer
> >>>>>    event, so vpt state is the same
> >>>>>  - the guest sees the old REG_C, with PF|IRQF set, and re-reads, 
> >>>>>    waiting for a read of 0.
> >>>>>  - repeat forever.
> >>>> Which would call for a flag suppressing the setting of PF|IRQF
> >>>> until the timer event got consumed. Possibly with some safety
> >>>> belt for this to not get deferred indefinitely (albeit if the interrupt
> >>>> doesn't get injected for extended periods of time, the guest
> >>>> would presumably have more severe problems than these flags
> >>>> not getting updated as expected).
> >>> That's pretty much what we're doing here -- the pt_intr_post callback
> >>> sets PF|IRQF when the interrupt is injected.
> >> Right, except you do this by reverting other stuff rather than
> >> adding the missing functionality on top.
> > Absolutely -- because once we went back to having PF set only when the
> > interrupt was injected, it seemed better to reduce the amount of
> > special-case plumbing for RTC than to add yet more.
> >
> > But for the case of an OS polling for PF with PIE clear, I guess we
> > might need to keep all the current special cases.  Was that a known
> > observed bug or a theoretical one?  I can't see a way of handling
> > both that case and the w2k3 case.
> >
> > Either we always set PF when the tick happens, even if the interrupt
> > is masked (which breaks w2k3) or we don't set it until we can deliver
> > the interrupt (which breaks pollers).
> This doesn't break w2k3.  Setting PF when a tick happens (or should
> happen for !PIE) is the correct thing to do.
> The bug is that we see an interrupt pending and set PF when we
> shouldn't.

We _are_ setting PF when the tick happens; it's just that because of
no-missed-ticks mode the tick happens before w2k3 has finished
handling the last one.  At that point, anything we do breaks w2k3 in
some way -- either we leave the tick pending until the interrupt is
actually delivered (which leads to the hang) or we consume the tick
even though the interrupt will be lost (which causes clock drift).
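
To make the two failure modes concrete, here is a toy model of that
dilemma. It is only a sketch and not Xen code: the function name, the
policy strings and the read limit are made up for illustration. The only
things taken from the hardware are the REG_C bit values (PF = 0x40,
IRQF = 0x80) and the fact that reading REG_C clears it. It assumes the
guest's handler is still running (previous interrupt in service), so the
vlapic cannot inject a new interrupt, and that the guest re-reads REG_C
until it reads 0.

```python
PF, IRQF = 0x40, 0x80  # REG_C periodic flag and interrupt request flag


def handler_poll(policy, pending_ticks, max_reads=1000):
    """Toy model: guest handler re-reads REG_C until it reads 0.

    policy is hypothetical: "defer" keeps a tick pending until its
    interrupt can be delivered, "consume" drops the tick immediately.
    Returns (reads_needed or None if it never terminates, ticks_lost).
    """
    reg_c = 0
    lost = 0
    for reads in range(1, max_reads + 1):
        # Return-to-guest path: a tick is due and REG_C is clear, so the
        # RTC code sets PF|IRQF and asserts the line.
        if pending_ticks > 0 and reg_c == 0:
            reg_c = PF | IRQF
            # The previous interrupt is still in service, so nothing is
            # injected. Only "consume" retires the timer event.
            if policy == "consume":
                pending_ticks -= 1
                lost += 1
        # Guest reads REG_C; the read clears the register.
        value, reg_c = reg_c, 0
        if value == 0:
            return reads, lost
    return None, lost  # gave up: the guest would spin forever


print(handler_poll("defer", 1))    # hang: tick never consumed
print(handler_poll("consume", 1))  # terminates, but one tick is lost
```

With "defer" the pending tick is re-asserted after every read, so the
poll never sees 0; with "consume" the poll terminates, but every tick
that arrived during the handler is dropped without an interrupt, which
is where the drift comes from.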


Xen-devel mailing list


