[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH for-4.13 2/2] x86/ioapic: don't use raw entry reads/writes in clear_IO_APIC_pin
On 08.11.2019 17:07, Roger Pau Monné wrote: > On Fri, Nov 08, 2019 at 11:16:26AM +0100, Jan Beulich wrote: >> On 08.11.2019 10:27, Roger Pau Monné wrote: >>> On Thu, Nov 07, 2019 at 04:56:09PM +0100, Jan Beulich wrote: >>>> On 07.11.2019 16:46, Roger Pau Monné wrote: >>>>> On Thu, Nov 07, 2019 at 04:28:56PM +0100, Jan Beulich wrote: >>>>>> On 07.11.2019 16:06, Roger Pau Monne wrote: >>>>>>> @@ -530,9 +530,9 @@ static void clear_IO_APIC_pin(unsigned int apic, >>>>>>> unsigned int pin) >>>>>>> */ >>>>>>> memset(&entry, 0, sizeof(entry)); >>>>>>> entry.mask = 1; >>>>>>> - __ioapic_write_entry(apic, pin, true, entry); >>>>>>> + __ioapic_write_entry(apic, pin, false, entry); >>>>>> >>>>>> I may be able to understand why this one can't use raw mode, but as >>>>>> per above a better overall description is needed. >>>>> >>>>> Yes, this is the one that's actually incorrect, but see my reasoning >>>>> below. >>>>> >>>>>> >>>>>>> - entry = __ioapic_read_entry(apic, pin, true); >>>>>>> + entry = __ioapic_read_entry(apic, pin, false); >>>>>>> if (entry.irr) >>>>>>> printk(KERN_ERR "IO-APIC%02x-%u: Unable to reset IRR\n", >>>>>>> IO_APIC_ID(apic), pin); >>>>>> >>>>>> This read again shouldn't need conversion, as the IRR bit doesn't >>>>>> get touched (I think) by the interrupt remapping code during the >>>>>> translation it does. >>>>> >>>>> TBH, I think raw mode should only be used by the iommu code in order >>>>> to setup the entries to point to the interrupt remapping table, >>>>> everything else shouldn't be using raw mode. While it's true that some >>>>> of the cases here are safe to use raw mode I would discourage it's >>>>> usage as it can lead to issues, and this is not a performance critical >>>>> path anyway. >>>> >>>> You also should take the other possible perspective - not using >>>> raw mode means going through interrupt remapping logic, which >>>> can (needlessly) trigger errors. I think you want to break the >>>> patch into a necessary and an optional part. The optional part >>>> should be discussed separately and deferred until after 4.13. >>> >>> IMO generic IO-APIC code has not business playing with raw entries >>> when interrupt remapping is enabled, the layout of IO-APIC entries in >>> that case is vendor-specific, and hence the generic IO-APIC code is >>> not able to parse it. >>> >>> For example the code in clear_IO_APIC_pin modifies the mask or the >>> trigger fields of RAW entries, is there any guarantee that those >>> fields don't have different meanings/layout when interrupt remapping >>> is enabled? >> >> From an abstract pov there's no such guarantee, but in practice >> the meaning of the fields doesn't change. You make a good point >> though nevertheless: For VT-d the trigger mode fields in RTE and >> IRTE need to match, so the interrupt remapping code needs to see >> the trigger mode change. See below for a possible alternative >> patch. >> >>> I can split the specific bugfix into a separate patch, but IMO the >>> code in clear_IO_APIC_pin is not safe. >> >> A change is needed, yes, but in particular because of the use of >> the function from clear_IO_APIC(), in turn called from >> disable_IO_APIC(), yet in turn used e.g. during emergency >> shutdown after a crash, I'd like the function to do as simple >> operations as possible, i.e. specifically avoid going through >> interrupt remapping code (because its data structures may also >> be corrupted at that point) unless really needed (hence the >> alternative patch suggestion below). > > Isn't just masking the entries fine when disabling the IO-APIC, or a > full wipe of all the entries is required? Just masking the RTE _should_ be fine (but you never know). > I would be fine with having a maskall_IO_APIC function that reads > entries in raw format, sets the mask bit and writes the entry back in > raw format as long as it's annotated that this is done in order to > limit as much a possible the chances of hitting corrupted data in the > crash case. Right, certainly something we may want to do for 4.14. >> As an aside, iommu_crash_shutdown() - even if actually doing >> something, i.e. disabling interrupt remapping - does _not_ >> cause the RTEs to be re-written in non-translated format. > > I'm not sure whether that will work, it's possible that some entries > can't be translated because they use x2APIC IDs, and thus don't fit in > a non-translated IO-APIC entry. And of course there's no expectation that interrupts would still work, but any inspection (e.g. via dumping) of the RTEs would be misleading at that point. >> --- unstable.orig/xen/arch/x86/io_apic.c >> +++ unstable/xen/arch/x86/io_apic.c >> @@ -519,8 +519,9 @@ static void clear_IO_APIC_pin(unsigned i >> if (entry.irr) { >> /* Make sure the trigger mode is set to level. */ >> if (!entry.trigger) { >> + entry = __ioapic_read_entry(apic, pin, false); >> entry.trigger = 1; >> - __ioapic_write_entry(apic, pin, TRUE, entry); >> + __ioapic_write_entry(apic, pin, false, entry); >> } >> __io_apic_eoi(apic, entry.vector, pin); >> } >> @@ -530,7 +531,7 @@ static void clear_IO_APIC_pin(unsigned i >> */ >> memset(&entry, 0, sizeof(entry)); >> entry.mask = 1; >> - __ioapic_write_entry(apic, pin, TRUE, entry); >> + __ioapic_write_entry(apic, pin, false, entry); >> >> entry = __ioapic_read_entry(apic, pin, TRUE); >> if (entry.irr) > > Well, this is certainly better than what's there currently, and should > fix the issue reported by Sergey, albeit I still think checking the > irr or the trigger fields of a raw entry when using interrupt > remapping is not safe future wise. > > There's no guarantee that future interrupt remapping implementations > don't clobber the non-translated fields with different ones when using > a remapped IO-APIC entry, and hence while this fixes the current issue > at hand it seems fragile. > > Anyway, I think this should be fixed ASAP, so if you are happy with > this version that's fine for me. Do you want me to pick this up and > rebase it on top of my TRUE/FALSE removal patch, or would you rather > send it formally standalone? I'd appreciate you making this a v2 of your original patch, ideally with a further improved description. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |