Re: [PATCH v3 07/14 RESEND] cpufreq: Export HWP parameters to userspace



On Thu, May 11, 2023 at 2:21 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 10.05.2023 19:49, Jason Andryuk wrote:
> > On Mon, May 8, 2023 at 6:26 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >>
> >> On 01.05.2023 21:30, Jason Andryuk wrote:
> >>> Extend xen_get_cpufreq_para to return hwp parameters.  These match the
> >>> hardware rather closely.
> >>>
> >>> We need the features bitmask to indicate the fields supported by
> >>> the actual hardware.
> >>>
> >>> The use of uint8_t parameters matches the hardware size.  uint32_t
> >>> entries grow the sysctl_t past the build assertion in setup.c.  The
> >>> uint8_t ranges are supported across multiple generations, so hopefully
> >>> they won't change.
> >>
> >> Still it feels a little odd for values to be this narrow. Aiui the
> >> scaling_governor[] and scaling_{max,min}_freq fields aren't (really)
> >> used by HWP. So you could widen the union in struct
> >> xen_get_cpufreq_para (in a binary but not necessarily source compatible
> >> manner), gaining you 6 more uint32_t slots. Possibly the somewhat oddly
> >> placed scaling_cur_freq could be included as well ...
> >
> > The values are narrow, but they match the hardware.  It works for HWP,
> > so there is no need to change at this time AFAICT.
> >
> > Do you want me to make this change?
>
> Well, much depends on what these 8-bit values actually express (I did
> raise this question in one of the replies to your patches, as I wasn't
> able to find anything in the SDM). That'll then hopefully allow making
> some educated prediction on how likely it is that a future variant of
> HWP would want to widen them.

Sorry for not providing a reference earlier.  In the SDM's
HARDWARE-CONTROLLED PERFORMANCE STATES (HWP) section, the second
paragraph reads:
"""
In contrast, HWP is an implementation of the ACPI-defined
Collaborative Processor Performance Control (CPPC), which specifies
that the platform enumerates a continuous, abstract unit-less,
performance value scale that is not tied to a specific performance
state / frequency by definition. While the enumerated scale is roughly
linear in terms of a delivered integer workload performance result,
the OS is required to characterize the performance value range to
comprehend the delivered performance for an applied workload.
"""

The numbers are points on a "continuous, abstract unit-less,
performance value scale."  So there isn't much to go on there, but
generally, smaller numbers mean slower and bigger numbers mean faster.

Cross referencing the ACPI spec here:
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#collaborative-processor-performance-control

Scrolling down, you can find register entries such as:

Highest Performance
Register or DWORD Attribute:  Read
Size:                         8-32 bits
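
For reference, the SDM documents these CPPC values as 8-bit fields of
IA32_HWP_CAPABILITIES (MSR 0x771).  A minimal C sketch of decoding them
(rdmsr() here stands in for whatever MSR accessor the code base
provides):

    #include <stdint.h>

    uint64_t rdmsr(uint32_t msr);        /* platform-provided accessor */

    #define MSR_IA32_HWP_CAPABILITIES 0x771

    struct hwp_caps {
        uint8_t highest;          /* bits  7:0  - highest performance */
        uint8_t guaranteed;       /* bits 15:8  - guaranteed performance */
        uint8_t most_efficient;   /* bits 23:16 - most efficient perf. */
        uint8_t lowest;           /* bits 31:24 - lowest performance */
    };

    static struct hwp_caps read_hwp_caps(void)
    {
        uint64_t val = rdmsr(MSR_IA32_HWP_CAPABILITIES);

        return (struct hwp_caps){
            .highest        = (val >>  0) & 0xff,
            .guaranteed     = (val >>  8) & 0xff,
            .most_efficient = (val >> 16) & 0xff,
            .lowest         = (val >> 24) & 0xff,
        };
    }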

AMD has its own pstate implementation that is similar to HWP.  Looking
at the Linux support, the AMD hardware also uses 8-bit values for the
comparable fields:
https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/msr-index.h#L612

So Intel and AMD are 8-bit for now, at least.  Something could use 32
bits according to the ACPI spec.

8 bits of granularity from slow to fast seems like plenty to me.  I'm
not sure what one would gain from 16 or 32 bits, but I'm not designing
the hardware.  From the earlier xenpm output, "highest" was 49, so
there is still a decent amount of room in an 8-bit range.
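
To illustrate the sizing argument, eight uint8_t fields along these
lines (names are illustrative, not necessarily the ones in the patch)
occupy 8 bytes of the sysctl payload, where uint32_t fields would take
32:

    #include <stdint.h>

    /* Illustrative only -- not the actual xen_get_cpufreq_para layout. */
    struct hwp_para_example {
        uint8_t lowest;           /* hardware capabilities, unit-less */
        uint8_t most_efficient;
        uint8_t guaranteed;
        uint8_t highest;
        uint8_t minimum;          /* requested operating range */
        uint8_t maximum;
        uint8_t desired;
        uint8_t energy_perf;      /* 0 = performance ... 255 = energy */
    };                            /* 8 bytes vs. 32 with uint32_t */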

> (Was it energy_perf that went
> from 4 to 8 bits at some point, which you even comment upon in the
> public header?)

energy_perf (Energy_Performance_Preference) had a fallback: "If
CPUID.06H:EAX[bit 10] indicates that this field is not supported, HWP
uses the value of the IA32_ENERGY_PERF_BIAS MSR to determine the
energy efficiency / performance preference."  So it had a different
range, but that was because the value was being put into an older
register.

However, I've removed that fallback code in v4.
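
For the record, the removed fallback amounted to roughly the following
sketch (the CPUID bit and MSR index are from the SDM; the helper names
are made up):

    #include <stdint.h>

    uint64_t rdmsr(uint32_t msr);        /* platform-provided accessors */
    uint32_t cpuid_eax(uint32_t leaf);

    #define CPUID_HWP_LEAF              0x6
    #define CPUID_HWP_EPP               (1u << 10) /* CPUID.06H:EAX[10] */
    #define MSR_IA32_ENERGY_PERF_BIAS   0x1b0

    /*
     * With EPP support, the 8-bit EPP field of IA32_HWP_REQUEST
     * (bits 31:24, range 0-255) is used; without it, hardware falls
     * back to the 4-bit IA32_ENERGY_PERF_BIAS value (bits 3:0, range
     * 0-15), hence the different range.
     */
    static uint8_t get_energy_perf(uint64_t hwp_request)
    {
        if ( cpuid_eax(CPUID_HWP_LEAF) & CPUID_HWP_EPP )
            return (hwp_request >> 24) & 0xff;

        return rdmsr(MSR_IA32_ENERGY_PERF_BIAS) & 0xf;
    }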

Regards,
Jason