[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: x86/HVM: Linux'es apic_pending_intr_clear() warns about stale IRR
On 31/10/2022 15:55, Jan Beulich wrote: > Hello, > > quite likely this isn't new, but I've ended up noticing it only recently: > On an oldish system where I hand a HVM guest an SR-IOV NIC (not sure yet > whether that actually matters) all APs have that warning issued, with all > reported values zero except for the very first IRR one - that's 00080000. > Which is suspicious by itself, for naming vector 0x13, i.e. below 0x20 > and hence within CPU exception range. To be clear, these are the VM's APs ? > For one I wonder about their logic: The function is called after setting > TPR to 0x10, which prevents the handling of vectors below 0x20 (and in > particular their propagation from ISR to IRR, if my understanding of the > process is right and the convoluted and imo partly incomplete SDM > description hasn't confused me). Plus the function runs when IRQs are > still off, which is another reason why nothing would ever propagate from > IRR to ISR while the function performs it work. Yet a comment there says > > /* > * If the ISR map is not empty. ACK the APIC and run another round > * to verify whether a pending IRR has been unblocked and turned > * into a ISR. > */ > > suggesting IRR bits could "promote" to ISR ones. And this, to me, is the > only justification for warning about leftover IRR bits (whereas I > certainly agree that the logic should result in all clear ISR bits, and > hence warning when one is still set is appropriate). Both the SDM and APM are fairly clear that IRR only moves to ISR when the core accepts the interrupt. So I agree that nothing in IRR will actually move to ISR as described by the comment. > And then I got puzzled by our logic: vlapic_get_ppr() is called only by > vlapic_set_ppr(), vlapic_lowest_prio(), and vlapic_read_aligned(). Yet > in particular not by vlapic_has_pending_irq(). While it looks like we > don't really ignore TPR during delivery, this appears to be a strange > split approach: hvm_interrupt_blocked() checks TPR, whereas > vlapic_has_pending_irq() checks ISR. I wonder if subtle issues can't > result from that ... This is precisely why want the fine grain settings for APIC acceleration. I know for certain there's at least one bug here, because it still causes windows to explode on migrate. > Of course I'm yet to figure out how IRR bit 0x13 ends up being set in > the first place. 0x13 is a legal vector for incoming interrupts (for reasons of Windows using 0x1f for self-IPI.) Exception wise, it's #XF, which isn't very common. Xen could in principle have had an event delivery type error and tried to deliver an exception as an IRQ, and I don't think any of the safety assertions in hvm_inject_event() would have triggered in this case. I don't expect Linux will have deliberately IPI'd itself with that vector, but I suppose it's not impossible if it constructed an IPI from a badly initialised variable. Alternatively, that vector is in the PIC's default range, so we could have an emulation issue there. If I were you, I'd ensure the VM has 4 or fewer vCPUs, and set up %dr pointing at ISR[31:0] in the regs page. That will catch whatever happens to be writing into ISR, and the backtrace will probably be interesting. Just make sure you've disabled interrupt posting first, because that is the one source that will bypass the debugregs. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |