|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/i8259: do not assume interrupts always target CPU0
On Thu, Oct 26, 2023 at 09:59:42AM +0200, Jan Beulich wrote:
> On 24.10.2023 16:53, Roger Pau Monne wrote:
> > Sporadically we have seen the following during AP bringup on AMD platforms
> > only:
> >
> > microcode: CPU59 updated from revision 0x830107a to 0x830107a, date =
> > 2023-05-17
> > microcode: CPU60 updated from revision 0x830104d to 0x830107a, date =
> > 2023-05-17
> > CPU60: No irq handler for vector 27 (IRQ -2147483648)
> > microcode: CPU61 updated from revision 0x830107a to 0x830107a, date =
> > 2023-05-17
> >
> > This is similar to the issue raised on Linux commit 36e9e1eab777e, where
> > they
> > observed i8259 (active) vectors getting delivered to CPUs different than 0.
> >
> > On AMD or Hygon platforms adjust the target CPU mask of i8259 interrupt
> > descriptors to contain all possible CPUs, so that APs will reserve the
> > vector
> > at startup if any legacy IRQ is still delivered through the i8259. Note
> > that
> > if the IO-APIC takes over those interrupt descriptors the CPU mask will be
> > reset.
> >
> > Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected
> > even
> > when all i8259 pins are masked, and hence would need to be handled on all
> > CPUs.
> >
> > Continue to reserve PIC vectors on CPU0 only, but do check for such spurious
> > interrupts on all CPUs if the vendor is AMD or Hygon. Note that once the
> > vectors get used by devices detecting PIC spurious interrupts will no
> > longer be
> > possible, however the device driver should be able to cope with spurious
> > interrupts. Such PIC spurious interrupts occurring when the vector is in
> > use
> > by a local APIC routed source will lead to an extra EOI, which might
> > unintentionally clear a different vector from ISR. Note this is already the
> > current behavior, so assume it's infrequent enough to not cause real issues.
> >
> > Finally, adjust the printed message to display the CPU where the spurious
> > interrupt has been received, so it looks like:
> >
> > microcode: CPU1 updated from revision 0x830107a to 0x830107a, date =
> > 2023-05-17
> > cpu1: spurious 8259A interrupt: IRQ7
> > microcode: CPU2 updated from revision 0x830104d to 0x830107a, date =
> > 2023-05-17
> >
> > Amends: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> with one nit (which I think can be taken care of when committing):
>
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -1920,7 +1920,16 @@ void do_IRQ(struct cpu_user_regs *regs)
> > kind = "";
> > if ( !(vector >= FIRST_LEGACY_VECTOR &&
> > vector <= LAST_LEGACY_VECTOR &&
> > - !smp_processor_id() &&
> > + (!smp_processor_id() ||
> > + /*
> > + * For AMD/Hygon do spurious PIC interrupt
> > + * detection on all CPUs, as it has been observed
> > + * that during unknown circumstances spurious PIC
> > + * interrupts have been delivered to CPUs
> > + * different than the BSP.
> > + */
> > + (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> > + X86_VENDOR_HYGON))) &&
>
> Afaict these two lines need indenting by one more blank, to account
> for the parentheses enclosing the || operands.
Indeed, please adjust at commit if you don't mind.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |