Re: RFC: disable HPET legacy mode after timer check
On 12.04.2023 13:25, Roger Pau Monné wrote:
> On Tue, Apr 11, 2023 at 12:20:13PM +0100, Andrew Cooper wrote:
>> On 11/04/2023 11:30 am, Simon Gaiser wrote:
>>> Hi,
>>>
>>> I have been recently looking into getting S0ix working on Xen [1].
>>>
>>> Thanks to a tip from Andrew I found that the HPET legacy mode was
>>> preventing my test system from reaching a package C-state lower than PC7
>>> and thereby also preventing S0ix residency.
>>>
>>> For testing I simply modified check_timer() to disable it again after it
>>> checked the timer irq:
>>>
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -1966,6 +1969,8 @@ static void __init check_timer(void)
>>>
>>>      if ( timer_irq_works() )
>>>      {
>>> +        hpet_disable_legacy_replacement_mode();
>>>          local_irq_restore(flags);
>>>          return;
>>>      }
>>>
>>>
>>> With this [2] I'm able to reach S0ix residency for some time and for short
>>> periods the systems power consumption goes down to the same level as with
>>> native Linux!
>>
>> Excellent progress!
>>
>>> It reaches low power states only for a fraction of the suspend to idle
>>> time, so something still makes the CPU/chipset think it should leave the
>>> low power mode, but that's another topic.
>>
>> Do you have any further info here?  There are a range of possibilities,
>> from excess timers in Xen (e.g. PV guests default to a 100Hz timer even
>> though no guests actually want it AFAICT), or the 1s TSC rendezvous
>> (which isn't actually needed on modern systems), all the way to the
>> platform devices not entering d3hot.
>>
>>>
>>> I tried to understand how all the timer code interacts with disabling
>>> the legacy mode. I think it only would break cpuidle if X86_FEATURE_ARAT
>>> is not available (Which is available on my test system and indeed I
>>> didn't run into obvious breakage).
>>>
>>> Is this (disabled PIT && !ARAT) a configuration that exists (and needs
>>> to be supported)?
>>>
>>> Did I miss something else? (Very much possible, given that this is way
>>> above my existing experience with X86 and Xen internals.)
>>
>> Xen's code is a mess and needs an overhaul.
>>
>> Right now, we're using the timer as "a source of interrupts" to try and
>> check that we've got things set up suitably.  But this doesn't need to
>> be the PIT, or a timer at all - it just needs to be "an interrupt coming
>> in from the platform".
>
> I would even question whether that testing is useful overall.  We test
> a single IO-APIC pin, which still leaves room for the rest of them to
> not be properly configured, and Xen might not be using the PIT timer at
> the end.

Testing one pin is sufficient for the intended purpose (proving that
the delivery route platform -> IO-APIC -> LAPIC works), leaving aside
firmware possibly configuring multiple IO-APICs inconsistently. Yet if
there are multiple IO-APICs, I'm afraid we have no way of knowing how
to trigger any of the pins of secondary ones. Even if we went to figure
out what devices are connected to it, we'd then still have no
(rudimentary) device drivers knowing how to interact with the devices.

Jan
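
For readers following the thread, here is a minimal C sketch of the two things being discussed. It is not Xen's code. The bit positions are those of the HPET General Configuration register (offset 0x10) as described in the HPET specification; the helper names hpet_cfg_clear_legacy() and cpuidle_lacks_wakeup_source() are hypothetical and exist only for illustration. The second helper merely restates Simon's tentative "(disabled PIT && !ARAT)" condition, which the thread itself leaves unconfirmed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the HPET General Configuration register (offset 0x10). */
#define HPET_CFG_ENABLE  (UINT32_C(1) << 0)  /* ENABLE_CNF: main counter runs */
#define HPET_CFG_LEGACY  (UINT32_C(1) << 1)  /* LEG_RT_CNF: legacy replacement */

/*
 * Hypothetical helper: given the current register value, return the value
 * to write back so that legacy replacement routing is switched off while
 * every other bit (including the enable bit) is left untouched.
 */
static inline uint32_t hpet_cfg_clear_legacy(uint32_t cfg)
{
    return cfg & ~HPET_CFG_LEGACY;
}

/*
 * Hypothetical predicate modelling the open question from the thread:
 * the suspected problem case for cpuidle is a platform where the PIT is
 * not usable and the local APIC timer is not always-running (no ARAT).
 * This is an assumption taken from the question, not an established fact.
 */
static inline bool cpuidle_lacks_wakeup_source(bool pit_usable, bool has_arat)
{
    return !pit_usable && !has_arat;
}
```

With a register value of 0x3 (counter enabled, legacy replacement on), hpet_cfg_clear_legacy() yields 0x1, so the counter keeps running and only the legacy routing is dropped. Real code would read and write the memory-mapped register instead of working on a plain integer.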