Xen project Mailing List

Re: [PATCH] x86/intel: workaround several MONITOR/MWAIT errata

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Wed, 23 Apr 2025 10:45:24 +0200

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Wed, 23 Apr 2025 08:45:29 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Apr 22, 2025 at 10:26:37AM +0200, Jan Beulich wrote: > On 17.04.2025 18:19, Roger Pau Monne wrote: > > @@ -380,6 +380,43 @@ static void probe_c3_errata(const struct cpuinfo_x86 > > *c) > > } > > } > > > > +/* > > + * APL30: One use of the MONITOR/MWAIT instruction pair is to allow a > > logical > > + * processor to wait in a sleep state until a store to the armed address > > range > > + * occurs. Due to this erratum, stores to the armed address range may not > > + * trigger MWAIT to resume execution. > > + * > > + * ICX143: Under complex microarchitectural conditions, a monitor that is > > armed > > + * with the MWAIT instruction may not be triggered, leading to a processor > > + * hang. > > + * > > + * LNL030: Problem P-cores may not exit power state Core C6 on monitor hit. > > I didn't manage to spot all three spec updates; none of these have a ucode > fix, > hence permitting the workaround to be avoided? > > Since CPX is 3rd Gen Xeon Scalable just like ICX is, I'm surprised that one's > unaffected. The most recent spec update there is a year old than ICX'es, so > may simply be too old to include the erratum? > > Sunny Cove is used by further Icelake models - they're known to be unaffected? FWIW, client IceLake also seems to be unaffected, so I don't really know. So far I've only found this issue in the ICX spec update. The fix for Linux limits this further to model 106 only. > > + * Force the sending of an IPI in those cases. > > + */ > > +static void probe_mwait_errata(void) > > +{ > > + static const struct x86_cpu_id models[] = { > > + /* Apollo Lake */ > > + INTEL_FAM6_MODEL(0x5C), > > + /* Ice Lake */ > > + INTEL_FAM6_MODEL(0x6A), > > + INTEL_FAM6_MODEL(0x6C), > > + /* Lunar Lake */ > > + INTEL_FAM6_MODEL(0xBD), > > Use identifiers from intel-family.h here? Sure. > > + { } > > + }; > > +#undef INTEL_FAM6_MODEL > > + > > + if ( boot_cpu_has(X86_FEATURE_MONITOR) && !force_mwait_ipi_wakeup && > > + x86_match_cpu(models) ) > > + { > > + printk(XENLOG_WARNING > > + "Forcing IPI MWAIT wakeup due to CPU erratum\n"); > > + force_mwait_ipi_wakeup = true; > > + } > > +} > > Do we really need to cater for asymmetric systems? IOW can't we do this once > on the BSP? Otherwise - why the use of boot_cpu_has() here? Oh, wait ... > > > @@ -406,6 +443,8 @@ static void Intel_errata_workarounds(struct cpuinfo_x86 > > *c) > > __set_bit(X86_FEATURE_CLFLUSH_MONITOR, c->x86_capability); > > > > probe_c3_errata(c); > > + if (c == &boot_cpu_data) > > + probe_mwait_errata(); > > } > > ..., you do this for the BSP only. Then why's the function not __init and > the global variable not __ro_after_init (and models[] __initconst)? > > (Later) Except that this path is also taken for S3 resume, from > recheck_cpu_features(). This shouldn't alter the variable value anymore, > though. A disagreement ought to result in recheck_cpu_features() to > report failure. (Imo perhaps better to avoid the call above during resume.) I did consider to limit the call based on system_state, wasn't sure whether that would be more churn than help. LNL030 has a reference to: "It may be possible for the BIOS to contain a workaround for this erratum." so wasn't fully sure we wouldn't need to check for this in all cores if there's some firmware fix for it that Xen could identify. Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.