[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [Patch] x86/HVM: Fix RTC interrupt modelling



At 09:15 +0000 on 11 Feb (1392106520), Jan Beulich wrote:
> >>> On 10.02.14 at 18:21, Tim Deegan <tim@xxxxxxx> wrote:
> > At 16:34 +0000 on 10 Feb (1392046444), Jan Beulich wrote:
> >> > The underlying problem here comes because the AF and UF bits of RTC 
> >> > interrupt
> >> > state is modelled by the RTC code, but the PF is modelled by the pt 
> >> > code.  
> >> > The
> >> > root cause of windows infinite loop is that RTC_PF is being re-set on 
> >> > vmentry
> >> > before the interrupt logic has worked out that it can't actually inject 
> >> > an 
> >> > RTC
> >> > interrupt, causing Windows to erroniously read (RTC_PF|RTC_IRQF) when it
> >> > should be reading 0.
> >> 
> >> So you're undoing a whole lot of changes done with the goal of
> >> getting the overall emulation closer to what real hardware does,
> >> just to paper over an issue elsewhere in the code? Not really an
> >> approach I'm in favor of.
> > 
> > My understanding was that the problem is explicitly in the change to
> > how RTC code is called from vpt code.
> > 
> > Originally, the RTC callback was called from pt_intr_post, like other
> > vpt sources.  Your rework changed it to be called much earlier, when
> > the vpt was considering which time source to choose.  AIUI that was to
> > let the RTC code tell the VPT not to inject, if the guest hasn't acked
> > the last interrupt, right?
> 
> Not only, it was also a necessary pre-adjustment for the !PIE case
> to work (see below).
> 
> > Since that was changed later to allow a certain number of dead ticks
> > before deciding to stop the timer chain, the decision no longer has to
> > be made so early -- we can allow one more IRQ to go in and then
> > disable it. 
> > 
> > That is the main change of this cset:  we go back to driving
> > the interrupt from the vpt code and fixing up the RTC state after vpt
> > tells us it's injected an interrupt.
> 
> And that's what is wrong imo, as it doesn't allow driving PF correctly
> when !PIE.

Oh, I see -- the current code doesn't turn the vpt off when !PIE.  Can
you remember why not?  Have I forgotten some wrinkle or race here?

> >> > -        rtc_update_irq(s);
> >> 
> >> So given the problem description, this would seem to be the most
> >> important part at a first glance. But looking more closely, I'm getting
> >> the impression that the call to rtc_update_irq() had no effect at all
> >> here anyway: The function would always bail on the second if() due
> >> to REG_C having got cleared a few lines up.
> > 
> > Yeah, this has nothing to do with the bug being fixed here.  The old
> > REG_C read was operating correctly, but on the return-to-guest path:
> >  - vpt sees another RTC interrupt is due and calls RTC code
> >  - RTC code sees REG_C clear, sets PF|IRQF and asserts the line
> >  - vlapic code sees the last interrupt is still in the ISR and does
> >    nothing;
> >  - we return to the guest having set IRQF but not consumed a timer
> >    event, so vpt stste is the same
> >  - the guest sees the old REG_C, with PF|IRQF set, and re-reads, 
> >    waiting for a read of 0.
> >  - repeat forever.
> 
> Which would call for a flag suppressing the setting of PF|IRQF
> until the timer event got consumed. Possibly with some safety
> belt for this to not get deferred indefinitely (albeit if the interrupt
> doesn't get injected for extended periods of time, the guest
> would presumably have more severe problems than these flags
> not getting updated as expected).

That's pretty much what we're doing here -- the pt_intr_post callback
sets PF|IRQF when the interrupt is injected.

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.