
Re: HWP and ACPI workarounds



On Wed, Feb 15, 2023 at 4:50 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 14.02.2023 20:04, Jason Andryuk wrote:
> > Qubes recently incorporated my HWP patches, but there was a report of
> > a laptop, Thinkpad X1 Carbon Gen 4 with a Skylake processor, locking
> > up during boot when HWP is enabled.  A user found a kernel bug that
> > seems to be the same issue:
> > https://bugzilla.kernel.org/show_bug.cgi?id=110941.
> >
> > That bug was fixed by Linux commit a21211672c9a ("ACPI / processor:
> > Request native thermal interrupt handling via _OSC").  The commit
> > message has a good summary of the issue and is included at the end of
> > this message.  The tl;dr is SMM crashes when it receives thermal
> > interrupts, so Linux calls the ACPI _OSC method to take over interrupt
> > handling.
> >
> > Today, Linux calls the _OSC method when boot_cpu_has(X86_FEATURE_HWP),
> > but that is not exposed to the PV Dom0.  As a test, the Qubes user was
> > able to boot with the check expanded to `boot_cpu_has(X86_FEATURE_HWP)
> > || xen_initial_domain()`.
> >
> > We need some way for Xen to indicate the presence and/or use of HWP to
> > Dom0, and Dom0 needs to use that to call _OSC.
> >
> > My first idea is that Dom0 could query Xen's cpufreq driver.  However,
> > Xen exposes the cpufreq driver through the unstable sysctl ops, and
> > using an unstable hypercall seems wrong for the kernel.
> >
> > Can we add something to an existing hypercall - maybe platform_op?  Or
> > do we need a new stable hypercall?
> >
> > Linux will perform the _OSC calls unilaterally upon seeing FEATURE_HWP
> > and independent of actually using HWP via the intel_pstate driver.
> > However, not using HWP may be an untested configuration in practice.
> > The intel_pstate.c driver will not use HWP when FEATURE_HWP_EPP is not
> > found.  So we could potentially cheat and expose only HWP to Dom0.
> > That should trigger the _OSC calls without letting Dom0 think it can
> > use HWP.  This is rather fragile though, so a more explicit method
> > seems better.
>
> I agree with the "fragile" aspect, but I'd also like to point out that
> no matter what features we expose in CPUID the driver should never try
> to take control when running under Xen (or perhaps more generally when
> running virtualized).

The intel_pstate driver doesn't have any check for running virtualized.
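
To make the two conditions discussed above concrete, here is a minimal
sketch.  It is a standalone model, not kernel code: the struct and the
function names are made up for illustration, and the "uses HWP" rule
is simplified to "HWP and HWP_EPP are both visible", following the
description earlier in this thread.

```c
#include <stdbool.h>

/* Hypothetical view of what a kernel sees at boot. */
struct cpu_view {
    bool hwp;                /* X86_FEATURE_HWP visible in CPUID */
    bool hwp_epp;            /* X86_FEATURE_HWP_EPP visible in CPUID */
    bool xen_initial_domain; /* running as Xen Dom0 */
};

/* The expanded test the Qubes user booted with: call _OSC when HWP is
 * visible, or when running as Dom0 where Xen hides HWP. */
static inline bool should_request_native_thermal_lvt(const struct cpu_view *v)
{
    return v->hwp || v->xen_initial_domain;
}

/* Simplified model of the driver decision: no check for running
 * virtualized, only for the feature bits. */
static inline bool driver_would_use_hwp(const struct cpu_view *v)
{
    return v->hwp && v->hwp_epp;
}
```

With hwp set and hwp_epp clear (the "cheat" case), the first function
returns true and the second returns false.  With both clear and
xen_initial_domain set (PV Dom0 today), only the expanded test
triggers the _OSC call.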

> > Roger's ACPI Processor patches that add xen_sanitize_pdc calls could
> > be leveraged.  On the Xen side, arch_acpi_set_pdc_bits() could be
> > extended to set bit 12, which would then be passed to the evaluate
> > _PDC call. _PDC is the older interface superseded by _OSC, but they
> > can be wrappers around the same implementation.  But if linux is just
> > using _OSC, it seems more compatible to follow that implementation.
>
> Using the _PDC bit would look quite reasonable to me. Yet what's
> unclear to me is whether by the last sentence you actually mean to
> indicate that you're not in favor of doing so (in which case more work
> in Xen would likely be needed to actually support enough of _OSC).

I was trying to make a statement about mimicking others' behaviour.  I
haven't tested using _PDC vs. _OSC.  The Intel ACPI Processor guidance
defines the bits the same for the two methods, so either should work.
It was more of a concern that while it "should" work, we don't know if
we'll hit corner cases.  Since Linux is running fine calling _OSC for
this purpose, we know it's been tested.  Having dom0 use _PDC is
untested.

I mentioned that _PDC and _OSC can be wrappers around the same
implementation, but that isn't necessarily true for any given
firmware.  The bit definitions match, so either call should work on
paper.  But, again, the _PDC path is untested.
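
As a rough illustration of "the same bit in both methods", here is a
standalone sketch.  The buffer layouts are my assumption of how the
two calls are usually fed (revision, count, capabilities for _PDC;
a status DWORD followed by capabilities for _OSC), and the macro,
struct and function names are invented for this example.  It is not
the Xen or Linux code.

```c
#include <stdint.h>

/* Bit 12 of the processor capabilities DWORD, the bit discussed in
 * this thread.  The macro name is hypothetical. */
#define CAP_HWP_NATIVE_THERMAL_LVT (UINT32_C(1) << 12)

/* Assumed _PDC-style buffer: revision, number of capability DWORDs,
 * then the capabilities themselves. */
struct pdc_buf {
    uint32_t revision;
    uint32_t count;
    uint32_t caps;
};

/* Assumed _OSC-style capabilities buffer: status DWORD, then the
 * capabilities DWORD. */
struct osc_caps {
    uint32_t status;
    uint32_t caps;
};

/* Build the _PDC argument, adding bit 12 when HWP is in use. */
static inline struct pdc_buf build_pdc(uint32_t base_caps, int hwp_in_use)
{
    struct pdc_buf b = { .revision = 1, .count = 1, .caps = base_caps };

    if (hwp_in_use)
        b.caps |= CAP_HWP_NATIVE_THERMAL_LVT;
    return b;
}

/* Build the _OSC capabilities with the same bit definition. */
static inline struct osc_caps build_osc(uint32_t base_caps, int hwp_in_use)
{
    struct osc_caps c = { .status = 0, .caps = base_caps };

    if (hwp_in_use)
        c.caps |= CAP_HWP_NATIVE_THERMAL_LVT;
    return c;
}
```

Both builders end up with the same capabilities value; whether the
firmware reacts identically to the two methods is the untested part.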

In the ACPI Processor thread, Rafael J. Wysocki wrote: "Sorry for
joining late, but first off _PDC has been deprecated since ACPI 3.0
(2004) and it is not even present in ACPI 6.5 any more."  Maybe _PDC
is hanging around because operating systems are still using it.  But
that was another point making me question using _PDC.

> What you don't touch at all is how you mean to surface the LVT based
> interrupt to Dom0; the cited commit message looks to describe uses
> beyond the HWP driver, and it uses that as part of the justification
> to override the firmware choice. The LAPIC is hidden (PV) or properly
> disconnected from the physical one (PVH), plus Xen's MCE code (however
> broken it may be) makes use of it. Or is the plan to ignore all of
> that (at least for now) and limit things to the HWP driver's needs?

I didn't intend to surface any interrupts, and I explicitly disable
the HWP interrupt in the Xen driver.  The processor uses it to signal
changes in values such as the guaranteed performance, which isn't
something I wanted to support.  The thermal interrupt is a separate
thing, though, and I haven't figured out how enabling HWP triggered
whichever interrupt (HWP or thermal) caused the original issue.  I'll
look at this some more.

After I sent my message, I was wondering about how Linux commit
a21211672c9a adds code to clear the HWP status on thermal interrupt
via an MSR write, which wouldn't work from dom0.  But if those
interrupts aren't making their way to dom0, that is fine and we can
just have Xen handle it.

Thanks for sharing your input.

-Jason



 

