Re: [PATCH for-4.20 v3 0/5] xen/x86: prevent local APIC errors at shutdown
On 2/11/25 7:39 PM, Roger Pau Monné wrote:
> On Tue, Feb 11, 2025 at 12:02:04PM +0100, Roger Pau Monne wrote:
>> Hello,
>>
>> The following series aims to prevent local APIC errors from stalling the
>> shutdown process. On XenServer testing we have seen reports of AMD
>> boxes sporadically getting stuck in a spam of:
>>
>> APIC error on CPU0: 00(08), Receive accept error
>>
>> Messages during shutdown, as a result of device interrupts targeting
>> CPUs that are offline (and have the local APIC disabled).
>>
>> First patch strictly solves the issue of shutdown getting stuck, further
>> patches aim to quiesce interrupts from all devices (known by Xen) as an
>> attempt to prevent a spurious "APIC error on CPU0: 00(00)" plus also
>> make kexec more reliable.
>>
>> Thanks, Roger.
>>
>> Roger Pau Monne (5):
>>   x86/shutdown: offline APs with interrupts disabled on all CPUs
>>   x86/irq: drop fixup_irqs() parameters
>>   x86/smp: perform disabling on interrupts ahead of AP shutdown
>>   x86/pci: disable MSI(-X) on all devices at shutdown
>>   x86/iommu: disable interrupts at shutdown
>
> This is now fully reviewed, can I get your opinion (and
> release-acked-by) on which patches we should take for 4.20?

If my understanding is correct, merging only the first patch is enough to
unblock the shutdown process, correct? So the first patch should be merged.

As the second patch has no functional changes, IMO it could be merged too,
despite the fact that we are in the hard code freeze period.

For all the other patches I would like to ask for an additional opinion (as
I am not an expert in x86). At first glance, it looks like the absence of
these patches from the staging branch will only lead to triggering "Receive
accept error", which I believe won't block the shutdown process, so these
patches could be postponed until 4.21. On the other hand, if they are
low-risk fixes, then we could consider merging them now.

~ Oleksii