[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.20 v3 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs



On Tue, Feb 11, 2025 at 12:23:56PM +0100, Jan Beulich wrote:
> On 11.02.2025 12:02, Roger Pau Monne wrote:
> > The current shutdown logic in smp_send_stop() will disable the APs while
> > having interrupts enabled on the BSP or possibly other APs. On AMD systems
> > this can lead to local APIC errors:
> > 
> > APIC error on CPU0: 00(08), Receive accept error
> > 
> > Such error message can be printed in a loop, thus blocking the system from
> > rebooting.  I assume this loop is created by the error being triggered by
> > the console interrupt, which is further stirred by the ESR handler
> > printing to the console.
> > 
> > Intel SDM states:
> > 
> > "Receive Accept Error.
> > 
> > Set when the local APIC detects that the message it received was not
> > accepted by any APIC on the APIC bus, including itself. Used only on P6
> > family and Pentium processors."
> > 
> > So the error shouldn't trigger on any Intel CPU supported by Xen.
> > 
> > However AMD doesn't make such claims, and indeed the error is broadcast to
> > all local APICs when an interrupt targets a CPU that's already offline.
> > 
> > To prevent the error from stalling the shutdown process perform the
> > disabling of APs and the BSP local APIC with interrupts disabled on all
> > CPUs in the system, so that by the time interrupts are unmasked on the BSP
> > the local APIC is already disabled.  This can still lead to a spurious:
> > 
> > APIC error on CPU0: 00(00)
> > 
> > As a result of an LVT Error getting injected while interrupts are masked on
> > the CPU, and the vector only handled after the local APIC is already
> > disabled.  ESR reports 0 because as part of disable_local_APIC() the ESR
> > register is cleared.
> > 
> > Note the NMI crash path doesn't have such issue, because disabling of APs
> > and the caller local APIC is already done in the same contiguous region
> > with interrupts disabled.  There's a possible window on the NMI crash path
> > (nmi_shootdown_cpus()) where some APs might be disabled (and thus
> > interrupts targeting them raising "Receive accept error") before others APs
> > have interrupts disabled.  However the shutdown NMI will be handled,
> > regardless of whether the AP is processing a local APIC error, and hence
> > such interrupts will not cause the shutdown process to get stuck.
> > 
> > Remove the call to fixup_irqs() in smp_send_stop(): it doesn't achieve the
> > intended goal of moving all interrupts to the BSP anyway.  The logic in
> > fixup_irqs() will move interrupts whose affinity doesn't overlap with the
> > passed mask, but the movement of interrupts is done to any CPU set in
> > cpu_online_map.  As in the shutdown path fixup_irqs() is called before APs
> > are cleared from cpu_online_map this leads to interrupts being shuffled
> > around, but not assigned to the BSP exclusively.
> 
> Which would have been possible to address by changing to something like
> 
>         if ( !cpumask_intersects(mask, desc->affinity) )
>         {
>             break_affinity = true;
>             cpumask_copy(affinity, mask);
>         }
>         else
>             cpumask_and(affinity, mask, desc->affinity);
> 
> there, I guess.

Possibly, but note _assign_irq_vector() could also refuse to move the
interrupts if there's a pending movement and the current target CPU is
still set as online in cpu_online_map.  Overall I think going down
that route is way more complex.

> 
> > The Fixes tag is more of a guess than a certainty; it's possible the
> > previous sleep window in fixup_irqs() allowed any in-flight interrupt to be
> > delivered before APs went offline.  However fixup_irqs() was still
> > incorrectly used, as it didn't (and still doesn't) move all interrupts to
> > target the provided cpu mask.
> 
> Plus there's the vector shortage aspect, if everything was moved to the
> BSP. I don't think that's possible to get past without doing what you
> do.

Indeed, and the interrupt movement was IMO way more complex than what
I'm proposing (even with the followup patches that attempt to  silence
device interrupts).

> > Fixes: e2bb28d62158 ('x86/irq: forward pending interrupts to new 
> > destination in fixup_irqs()')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.