[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [Patch] x86/HVM: Fix RTC interrupt modelling

At 16:34 +0000 on 10 Feb (1392046444), Jan Beulich wrote:
> > The underlying problem here comes because the AF and UF bits of RTC 
> > interrupt
> > state is modelled by the RTC code, but the PF is modelled by the pt code.  
> > The
> > root cause of windows infinite loop is that RTC_PF is being re-set on 
> > vmentry
> > before the interrupt logic has worked out that it can't actually inject an 
> > RTC
> > interrupt, causing Windows to erroniously read (RTC_PF|RTC_IRQF) when it
> > should be reading 0.
> So you're undoing a whole lot of changes done with the goal of
> getting the overall emulation closer to what real hardware does,
> just to paper over an issue elsewhere in the code? Not really an
> approach I'm in favor of.

My understanding was that the problem is explicitly in the change to
how RTC code is called from vpt code.

Originally, the RTC callback was called from pt_intr_post, like other
vpt sources.  Your rework changed it to be called much earlier, when
the vpt was considering which time source to choose.  AIUI that was to
let the RTC code tell the VPT not to inject, if the guest hasn't acked
the last interrupt, right?

Since that was changed later to allow a certain number of dead ticks
before deciding to stop the timer chain, the decision no longer has to
be made so early -- we can allow one more IRQ to go in and then
disable it. 

That is the main change of this cset:  we go back to driving
the interrupt from the vpt code and fixing up the RTC state after vpt
tells us it's injected an interrupt.

> > +    if ( new_b & new_c & (RTC_PF | RTC_AF | RTC_UF) )
> > +        new_c |= RTC_IRQF;
> Without going back to reading the spec, iirc RTC_IRQF is a sticky
> bit cleared _only_ be REG_C reads, i.e. you shouldn't clear it
> earlier in the function and then conditionally set it here.

All those bits are sticky, so if IRQF was set before it will be now;
but true that we could drop the mask at the top of the function.
(+ again in rtc_irq_event)

> >  
> > -    /* IRQ is raised if any source is both raised & enabled */
> > -    if ( !(s->hw.cmos_data[RTC_REG_B] &
> > -           s->hw.cmos_data[RTC_REG_C] &
> > -           (RTC_PF | RTC_AF | RTC_UF)) )
> > -        return;
> > +    s->hw.cmos_data[RTC_REG_B] = new_b;
> > +    s->hw.cmos_data[RTC_REG_C] = new_c;
> >  
> > -    s->hw.cmos_data[RTC_REG_C] |= RTC_IRQF;
> > -    if ( rtc_mode_is(s, no_ack) )
> > -        hvm_isa_irq_deassert(vrtc_domain(s), RTC_IRQ);
> > -    hvm_isa_irq_assert(vrtc_domain(s), RTC_IRQ);
> > +    if ( new_c & RTC_IRQF )
> > +        rtc_toggle_irq(s);
> Which then implies that the condition here would also need to
> consider the old state of the flag.


> > @@ -647,9 +674,6 @@ static uint32_t rtc_ioport_read(RTCState *s, uint32_t 
> > addr)
> >      case RTC_REG_C:
> >          ret = s->hw.cmos_data[s->hw.cmos_index];
> >          s->hw.cmos_data[RTC_REG_C] = 0x00;
> > -        if ( (ret & RTC_IRQF) && !rtc_mode_is(s, no_ack) )
> > -            hvm_isa_irq_deassert(d, RTC_IRQ);
> Why? With RTC_IRQF going from 1 to 0, the interrupt line should
> get de-asserted.

After this patch the RTC model only sends edges (as mentioned in the
description, maybe not clearly enough).  I think that might be
caused by some confusion on my part.  I'll go into it a bit more below.

> > -        rtc_update_irq(s);
> So given the problem description, this would seem to be the most
> important part at a first glance. But looking more closely, I'm getting
> the impression that the call to rtc_update_irq() had no effect at all
> here anyway: The function would always bail on the second if() due
> to REG_C having got cleared a few lines up.

Yeah, this has nothing to do with the bug being fixed here.  The old
REG_C read was operating correctly, but on the return-to-guest path:
 - vpt sees another RTC interrupt is due and calls RTC code
 - RTC code sees REG_C clear, sets PF|IRQF and asserts the line
 - vlapic code sees the last interrupt is still in the ISR and does
 - we return to the guest having set IRQF but not consumed a timer
   event, so vpt stste is the same
 - the guest sees the old REG_C, with PF|IRQF set, and re-reads, 
   waiting for a read of 0.
 - repeat forever.

AFAICS this is caused by the disconnect where the RTC tries to
reinject but doesn't use up a tick.  A real RTC doesn't have that
problem, because it doesn't have all the scheduling artefacts that the
VPT code is intended to work around.

So IMO the right fix is to go back to letting the VPT code control IRQ
injection for the PF source, which is what this patch does.

(I don't consider it to be a revert of any particular csets, BTW --
there have been a lot of other improvements in the RTC as part of that
and later series that this leaves in place).

The switch to always sending edges is incidental, and I'd be happy to
get rid of it.  IIRC the reason we ended up with that is that when the
vpt code injects an interrupt, we need to set IRQF|PF in REG_C but
_not_ assert the line (which would cause a second interrupt).
Tracking that disconnect seemed tricky, esp. across save/restore. 

If you have any idea how we could sort that out, I'd be inclined to go
back to a level-triggered model (i.e. following the RTC spec) even in
no-ack mode.  



Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.