Re: [Xen-devel] [Patch] x86/HVM: Fix RTC interrupt modelling
On 10/02/14 12:17, Andrew Cooper wrote:
> This reverts large amounts of:
>   9607327abbd3e77bde6cc7b5327f3efd781fc06e
>     "x86/HVM: properly handle RTC periodic timer even when !RTC_PIE"
>   620d5dad54008e40798c4a0c4322aef274c36fa3
>     "x86/HVM: assorted RTC emulation adjustments"
>
> and by extension:
>   f3347f520cb4d8aa4566182b013c6758d80cbe88
>     "x86/HVM: adjust IRQ (de-)assertion"
>   c2f79c464849e5f796aa9d1d0f26fe356abd1a1a
>     "x86/HVM: fix processing of RTC REG_B writes"
>   527824f41f5fac9cba3d4441b2e73d3118d98837
>     "x86/hvm: Centralize and simplify the RTC IRQ logic."
>
> The current code has a pathological case, tickled by the access pattern of
> Windows 2003 Server SP2.  Occasionally on boot (which I presume is during a
> time calibration against the RTC Periodic Timer), Windows gets stuck in an
> infinite loop reading RTC REG_C.  This affects 32-bit and 64-bit guests.
>
> In the pathological case, the VM state looks like this:
>   * RTC: 64Hz period, periodic interrupts enabled
>   * RTC_IRQ in IOAPIC as vector 0xd1, edge triggered, not pending
>   * vector 0xd1 set in LAPIC IRR and ISR, TPR at 0xd0
>   * Reads from REG_C return 'RTC_PF | RTC_IRQF'
>
> With an instrumented Xen, dumping the periodic timers with a guest in this
> state shows a single timer with pt->irq_issued=1 and pt->pending_intr_nr=2.
>
> Windows is presumably waiting for reads of REG_C to drop to 0, and reading
> REG_C clears the value each time in the emulated RTC.  However:
>
>   * {svm,vmx}_intr_assist() calls pt_update_irq() unconditionally.
>   * pt_update_irq() always finds the RTC as earliest_pt.
>   * rtc_periodic_interrupt() unconditionally sets RTC_PF in no_ack mode.  It
>     returns true, indicating that pt_update_irq() should really inject the
>     interrupt.
>   * pt_update_irq() decides that it doesn't need to fake up part of
>     pt_intr_post() because this is a real interrupt.
>   * {svm,vmx}_intr_assist() can't inject the interrupt as it is already
>     pending, so exits early without calling pt_intr_post().
>
> The underlying problem here is that the AF and UF bits of the RTC interrupt
> state are modelled by the RTC code, but the PF bit is modelled by the pt
> code.  The root cause of the Windows infinite loop is that RTC_PF is being
> re-set on vmentry before the interrupt logic has worked out that it can't
> actually inject an RTC interrupt, causing Windows to erroneously read
> (RTC_PF|RTC_IRQF) when it should be reading 0.
>
> This patch reverts the RTC_PF logic handling to its former state, whereby
> rtc_periodic_cb() is called strictly when the periodic timer logic has
> successfully injected a periodic interrupt.  In doing so, it is important that
> the RTC code itself never directly triggers an interrupt for the periodic
> timer (other than the case when setting REG_B.PIE, where the pt code will have
> dropped the interrupt).
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Tim Deegan <tim@xxxxxxx>
> CC: Keir Fraser <keir@xxxxxxx>
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
>
> I still don't know exactly what condition causes Windows to tickle this
> behaviour.  It is seen about 1 or 2 times in 9 tests running a 12 hour VM
> lifecycle test.  Over the weekend, 100 of these tests have passed without a
> single recurrence of the infinite loop.  The change has also passed a Windows
> extended regression test, so it would appear that other versions of Windows
> are still fine with the change.
>
> Roger: as this caused issues for FreeBSD, would you mind testing it again
> please?

Tested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

On FreeBSD 10.0, 9.2 and 8.4

No apparent regressions AFAICT.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
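
For readers following the thread, the failure described in the quoted patch can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not Xen code, and every name in it (struct toy_vm, toy_vmentry(), toy_read_reg_c(), toy_poll_reg_c()) is hypothetical. Only the REG_C flag values RTC_PF and RTC_IRQF follow the usual MC146818 layout. The model assumes the guest polls REG_C until it reads 0 while the RTC vector is already pending and therefore cannot be injected again, and it compares raising the flags before the injection attempt with raising them only after a successful injection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTC_PF   0x40  /* REG_C: periodic interrupt flag */
#define RTC_IRQF 0x80  /* REG_C: interrupt request flag */

/* Hypothetical toy state, not a Xen structure. */
struct toy_vm {
    uint8_t reg_c;              /* emulated RTC REG_C */
    bool vector_blocked;        /* vector already pending: cannot inject */
    bool set_pf_before_inject;  /* true: flags raised before the injection
                                 * attempt; false: only after it succeeds */
};

/* One vmentry with the periodic timer due. Returns true if injected. */
bool toy_vmentry(struct toy_vm *vm)
{
    if (vm->set_pf_before_inject)
        vm->reg_c |= RTC_PF | RTC_IRQF;  /* raised unconditionally */

    if (vm->vector_blocked)
        return false;                    /* early exit, nothing injected */

    if (!vm->set_pf_before_inject)
        vm->reg_c |= RTC_PF | RTC_IRQF;  /* raised on successful injection */
    return true;
}

/* Reading REG_C returns the current flags and clears them. */
uint8_t toy_read_reg_c(struct toy_vm *vm)
{
    uint8_t value = vm->reg_c;
    vm->reg_c = 0;
    return value;
}

/*
 * Guest polls REG_C until it reads 0, with one vmentry before each read.
 * Returns the number of reads needed, or -1 if max_reads was exhausted.
 */
int toy_poll_reg_c(bool set_pf_before_inject, int max_reads)
{
    struct toy_vm vm = {
        .reg_c = RTC_PF | RTC_IRQF,   /* one interrupt already flagged */
        .vector_blocked = true,       /* vector stuck pending in the LAPIC */
        .set_pf_before_inject = set_pf_before_inject,
    };

    assert(max_reads > 0);
    for (int reads = 1; reads <= max_reads; reads++) {
        (void)toy_vmentry(&vm);
        if (toy_read_reg_c(&vm) == 0)
            return reads;
    }
    return -1;
}
```

In this model, toy_poll_reg_c(true, 1000) never observes 0 and gives up, while toy_poll_reg_c(false, 1000) sees 0 on its second read. The real interaction between the pt code and the RTC emulation involves considerably more state than is modelled here.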