[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v2 05/11] xen/x86: introduce a new amd cppc driver for cpufreq scaling


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Penny, Zheng" <penny.zheng@xxxxxxx>
  • Date: Tue, 18 Feb 2025 07:40:56 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QerWoJqWHl/KDNPJOz0zslJIIOAghwdEsf9jvDw9aQU=; b=bAA1KP5kNKHAxATbd87n94jTOXMgjOfJullty+WzILIMiMe/guIGlXwpXn29rfaj646//X0eq7HmnGFNiCXK0P146eWtqWz4//hC1C/99oL/zRoxqxRGvzkeQZOgUneIMLgXha42m3xAyKKR2dJw6iWJEYOn4GC2biE1OdFcTLB5RlyqC+VgQjAhjhM80716zRrH/g9Qn8bpuVrUimizY8IY/naD5iNwXDchHSQF6oDxNFfAOn3+0Ru6KWLVpe+8u6hCR0vgiQ2IDipdGXdko3wAmliSGZhQnk8wTJksv44PsRL/E40I/OmKVa7QrJXi8IOztMuew37CRhFlux7Hng==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=drg2gDGwvKRjiiEgY5jNW48hzjgzVlfJdtG5x4di5ZE2V4I3giowTJoAfLCj/YfDd7wV0xfq8p13bOxjYmPZQSE60/i+VutKgQ4GOYmKTozlcVHFn0m9xugCjIUTwtD2ZsAc5y+jhhoh8oyW5WJArA4QOGom57n5IuND+3/8eMcVSE0aWy7lhvtMva3UE3PYHc3gZIccUVDD/EE6sbGLUI1SnW6Hin+zJmEcXZzbGDqwUxYohYM2EzIpWkqM12riubV3F5n01iWKPEgBug9MDbwk16ePm+MiGTeaX4Fc4oilzCSqjB8wiaqNWC2yK2sn2CTwqZ+jrOiKjr6qHyHc8Q==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: "Huang, Ray" <Ray.Huang@xxxxxxx>, "Andryuk, Jason" <Jason.Andryuk@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 18 Feb 2025 07:41:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Msip_labels: MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_ActionId=13c6ff0c-6a65-470a-91d6-5c134621b6cc;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_ContentBits=0;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_Enabled=true;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_Method=Standard;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_Name=AMD Internal Distribution Only;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_SetDate=2025-02-18T06:32:26Z;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_SiteId=3dd8961f-e488-4e60-8e11-a82d994e183d;MSIP_Label_dce362fe-1558-4fb5-9f64-8a6240d76441_Tag=10, 3, 0, 1;
  • Thread-index: AQHbeHHP6s77OhtOyESwm9UCfT4CR7NCV/0AgApU3yA=
  • Thread-topic: [PATCH v2 05/11] xen/x86: introduce a new amd cppc driver for cpufreq scaling

[AMD Official Use Only - AMD Internal Distribution Only]

Hi

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Wednesday, February 12, 2025 12:46 AM
> To: Penny, Zheng <penny.zheng@xxxxxxx>
> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andryuk, Jason
> <Jason.Andryuk@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>;
> Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 05/11] xen/x86: introduce a new amd cppc driver for
> cpufreq scaling
>
> On 06.02.2025 09:32, Penny Zheng wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> >
> > +/*
> > + * If CPPC lowest_freq and nominal_freq registers are exposed then we
> > +can
> > + * use them to convert perf to freq and vice versa. The conversion is
> > + * extrapolated as an affine function passing by the 2 points:
>
> Having studied maths, "affine function" isn't a term I know. There are affine
> transformations, but don't you simply mean "linear function" here? If so, is 
> it said
> anywhere in the spec that perf values grow linearly with freq ones?
>

Yes, "linear mapping" is better. And the spec reference is as follows:
```
The OS should use Lowest Frequency/Performance and Nominal Frequency/Performance
as anchor points to create a linear mapping of CPPC abstract performance to CPU 
frequency
```
See 8.4.6.1.1.7. Lowest Frequency and Nominal Frequency
 
(https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#lowest-frequency-and-nominal-frequency
 )

> > + *  - (Low perf, Low freq)
> > + *  - (Nominal perf, Nominal freq)
> > + */
> > +static int amd_cppc_khz_to_perf(const struct amd_cppc_drv_data *data,
> > +unsigned int freq, uint8_t *perf)
>
> Overlong line again. Please sort throughout the series.
>
> > +{
> > +    const struct xen_processor_cppc *cppc_data = data->cppc_data;
> > +    uint64_t mul, div, offset = 0, res;
> > +
> > +    if ( freq == (cppc_data->nominal_freq * 1000) )
>
> There's no comment anywhere what the units of the values are. Therefore the
> multiplication by 1000 here leaves me wondering why consistent units aren't 
> used in
> the first place. (From the name of the function I might guess that "freq" is 
> in kHz,
> and then perhaps ->{min,max,nominal}_freq are in MHz.
> Then for the foreseeable future we're hopefully safe here wrt overflow.)
>

These conversion functions are designed in the first place for *ondemand* 
governor, which
reports performance as CPU frequencies. In generic governor->target() 
functions, we are always
take freq in KHz, but in CPPC ACPI spec, the frequency is read in Mhz from 
register...

> > +    {
> > +        *perf = data->caps.nominal_perf;
> > +        return 0;
> > +    }
> > +
> > +    if ( freq == (cppc_data->lowest_freq * 1000) )
> > +    {
> > +        *perf = data->caps.lowest_perf;
> > +        return 0;
> > +    }
> > +
> > +    if ( (cppc_data->lowest_freq) && (cppc_data->nominal_freq) )
>
> Why the inner parentheses?
>
> > +    {
> > +        mul = data->caps.nominal_perf - data->caps.lowest_perf;
> > +        div = cppc_data->nominal_freq - cppc_data->lowest_freq;
> > +        /*
> > +         * We don't need to convert to kHz for computing offset and can
> > +         * directly use nominal_freq and lowest_freq as the division
> > +         * will remove the frequency unit.
> > +         */
> > +        div = div ?: 1;
> > +        offset = data->caps.nominal_perf - (mul *
> > + cppc_data->nominal_freq) / div;
>
> I fear I can't convince myself that the subtraction can't ever underflow.
> With
>
> O = offset
> Pn = data->caps.nominal_perf
> Pl = data->caps.lowest_perf
> Fn = cppc_data->nominal_freq
> Fl = cppc_data->lowest_freq
>
> the above becomes
>
> O = Pn - ((Pn - Pl) * Fn) / (Fn - Fl)
>
> and your assumption is O >= 0 (and for inputs: Fn >= Fl and Pn >= Pl). That 
> for me
> transforms to
>
> (Pn - Pl) * Fn <= Pn * (Fn - Fl)
>
> and further
>
> -(Pl * Fn) <= -(Pn * Fl)
>
> or
>
> Pn * Fl <= Pl * Fn
>
> and I don't see why this would always hold. Yet if there can be underflow, I 
> wonder
> whether the calculation is actually correct. Or, ...
>

Because we are assuming that in normal circumstances, when F==0, P is the 
offset value, and
It shall be an non-smaller-than-zero value, tbh, ==0 is more logical fwit
So if it is underflow, I might think the hardware itself is malfunctional.

> > +    }
> > +    else
> > +    {
> > +        /* Read Processor Max Speed(mhz) as anchor point */
> > +        mul = data->caps.highest_perf;
> > +        div = this_cpu(max_freq_mhz);
> > +        if ( !div )
> > +            return -EINVAL;
> > +    }
> > +
> > +    res = offset + (mul * freq) / (div * 1000);
>
> ... considering that a negative offset here isn't really an issue, as long as 
> the rhs of
> the addition is large enough, is offset perhaps meant to be a signed quantity 
> (and
> considering it's in principle an [abstract] perf value, it doesn't even need 
> to be a 64-
> bit one, i.e. perhaps one of the cases where plain int is appropriate to use)?
>
> > +    if ( res > UINT8_MAX )
> > +    {
> > +        printk(XENLOG_ERR "Perf value exceeds maximum value 255:
> > + %lu\n", res);
>
> If this was to ever trigger, it would then likely trigger often. Imo such log 
> messages
> want to be printk_once(), if they're needed at all.
>
> > +        return -EINVAL;
>
> Can't we instead clip to 0xff?
>

True, clip it to 0xff maybe better

>
> Jan

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.