
Re: [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors at shutdown



On Mon, Feb 10, 2025 at 07:29:35PM +0100, Oleksii Kurochko wrote:
> Hello Roger,
> 
> On 2/10/25 11:02 AM, Roger Pau Monné wrote:
> > Hello,
> > 
> > This should have had a 'for-4.20?' tag in the subject line, as
> > otherwise we will need to add an erratum to the release notes to note
> > that reboot can sometimes fail on AMD boxes.
> > 
> > Also adding Oleksii.
> > 
> > Thanks, Roger.
> > 
> > On Thu, Feb 06, 2025 at 04:06:10PM +0100, Roger Pau Monne wrote:
> > > Hello,
> > > 
> > > The following series aims to prevent local APIC errors from stalling the
> > > shutdown process.  In XenServer testing we have seen reports of AMD
> > > boxes sporadically getting stuck in a spam of:
> 
> How often does this issue happen?

Hard to tell.  We have certainly hit it more than once on XenRT, but
I don't have figures about its probability.  I have at least 14
reports from XenRT from the last 6 months, but there are possibly many
more that could have been classified as a different kind of issue.

> > > 
> > > APIC error on CPU0: 00(08), Receive accept error
> > > 
> > > Messages during shutdown, as a result of device interrupts targeting
> > > CPUs that are offline (and have the local APIC disabled).
> > > 
> > > The first patch strictly solves the issue of shutdown getting stuck;
> > > further patches aim to quiesce interrupts from all devices (known by
> > > Xen) in an attempt to prevent a spurious "APIC error on CPU0: 00(00)",
> > > and also to make kexec more reliable.
> 
> If the first patch solves the issue, does it make sense to consider
> merging at least that one?

First one sure, the rest I think are also worth considering.  They get
rid of the resulting innocuous "APIC error on CPU0: 00(00)" message.
Also, the remaining patches are likely to provide the kexec kernel with
a better context, as they quiesce interrupts from devices.
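
For anyone reading the two messages side by side, here is a minimal
sketch of how the hex value in parentheses relates to the reason text.
It is not the Xen code: the table and the esr_first_reason() helper are
hypothetical names, and the bit layout is assumed to be the usual x86
local APIC Error Status Register layout (some bits may be reserved on a
given vendor's parts).

```c
#include <stddef.h>

/*
 * Assumed names for the low 8 bits of the local APIC Error Status
 * Register, lowest bit first.  Illustrative only, not taken from Xen.
 */
static const char *const esr_reasons[8] = {
    "Send CS error",
    "Receive CS error",
    "Send accept error",
    "Receive accept error",
    "Redirectable IPI",
    "Send illegal vector",
    "Received illegal vector",
    "Illegal register address",
};

/* Return the name of the lowest set error bit, or NULL if no bit is set. */
static const char *esr_first_reason(unsigned int esr)
{
    for (unsigned int bit = 0; bit < 8; bit++)
        if (esr & (1u << bit))
            return esr_reasons[bit];

    return NULL;
}
```

Under that assumption 0x08 has only bit 3 set, which lines up with the
"Receive accept error" text in the report, while 0x00 has no bit set,
so there is no reason string to print alongside the "00(00)" message.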

I will send a new version soon, hopefully we can use that one to
discuss which patches we want to pick.  With my XenServer hat on I plan
to backport the whole series into our patch queue.

Thanks, Roger.



 

