|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/intel: workaround several MONITOR/MWAIT errata
On 23.04.2025 10:45, Roger Pau Monné wrote:
> On Tue, Apr 22, 2025 at 10:26:37AM +0200, Jan Beulich wrote:
>> On 17.04.2025 18:19, Roger Pau Monne wrote:
>>> @@ -380,6 +380,43 @@ static void probe_c3_errata(const struct cpuinfo_x86
>>> *c)
>>> }
>>> }
>>>
>>> +/*
>>> + * APL30: One use of the MONITOR/MWAIT instruction pair is to allow a
>>> logical
>>> + * processor to wait in a sleep state until a store to the armed address
>>> range
>>> + * occurs. Due to this erratum, stores to the armed address range may not
>>> + * trigger MWAIT to resume execution.
>>> + *
>>> + * ICX143: Under complex microarchitectural conditions, a monitor that is
>>> armed
>>> + * with the MWAIT instruction may not be triggered, leading to a processor
>>> + * hang.
>>> + *
>>> + * LNL030: Problem P-cores may not exit power state Core C6 on monitor hit.
>>
>> I didn't manage to spot all three spec updates; none of these have a ucode
>> fix,
>> hence permitting the workaround to be avoided?
>>
>> Since CPX is 3rd Gen Xeon Scalable just like ICX is, I'm surprised that one's
>> unaffected. The most recent spec update there is a year old than ICX'es, so
>> may simply be too old to include the erratum?
>>
>> Sunny Cove is used by further Icelake models - they're known to be
>> unaffected?
>
> FWIW, client IceLake also seems to be unaffected, so I don't really
> know. So far I've only found this issue in the ICX spec update. The
> fix for Linux limits this further to model 106 only.
>
>>> + * Force the sending of an IPI in those cases.
>>> + */
>>> +static void probe_mwait_errata(void)
>>> +{
>>> + static const struct x86_cpu_id models[] = {
>>> + /* Apollo Lake */
>>> + INTEL_FAM6_MODEL(0x5C),
>>> + /* Ice Lake */
>>> + INTEL_FAM6_MODEL(0x6A),
>>> + INTEL_FAM6_MODEL(0x6C),
>>> + /* Lunar Lake */
>>> + INTEL_FAM6_MODEL(0xBD),
>>
>> Use identifiers from intel-family.h here?
>
> Sure.
>
>>> + { }
>>> + };
>>> +#undef INTEL_FAM6_MODEL
>>> +
>>> + if ( boot_cpu_has(X86_FEATURE_MONITOR) && !force_mwait_ipi_wakeup &&
>>> + x86_match_cpu(models) )
>>> + {
>>> + printk(XENLOG_WARNING
>>> + "Forcing IPI MWAIT wakeup due to CPU erratum\n");
>>> + force_mwait_ipi_wakeup = true;
>>> + }
>>> +}
>>
>> Do we really need to cater for asymmetric systems? IOW can't we do this once
>> on the BSP? Otherwise - why the use of boot_cpu_has() here? Oh, wait ...
>>
>>> @@ -406,6 +443,8 @@ static void Intel_errata_workarounds(struct cpuinfo_x86
>>> *c)
>>> __set_bit(X86_FEATURE_CLFLUSH_MONITOR, c->x86_capability);
>>>
>>> probe_c3_errata(c);
>>> + if (c == &boot_cpu_data)
>>> + probe_mwait_errata();
>>> }
>>
>> ..., you do this for the BSP only. Then why's the function not __init and
>> the global variable not __ro_after_init (and models[] __initconst)?
>>
>> (Later) Except that this path is also taken for S3 resume, from
>> recheck_cpu_features(). This shouldn't alter the variable value anymore,
>> though. A disagreement ought to result in recheck_cpu_features() to
>> report failure. (Imo perhaps better to avoid the call above during resume.)
>
> I did consider to limit the call based on system_state, wasn't sure
> whether that would be more churn than help.
The fundamental expectation is that such aspects don't changes across
S3. I'd suggest to add the extra check, but I wouldn't insist; what I'd
like to see is the variable becoming __ro_after_init, though (whichever
way this is being arranged for).
> LNL030 has a reference to: "It may be possible for the BIOS to contain
> a workaround for this erratum." so wasn't fully sure we wouldn't need
> to check for this in all cores if there's some firmware fix for it
> that Xen could identify.
Usually this means a ucode update (occasionally it may also be a chipset
config that firmware can fiddle with), in which case we wouldn't
normally (i.e. unless in critical situations) add a workaround at all.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |