|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v4 09/15] xen/x86: introduce a new amd cppc driver for cpufreq scaling
[Public]
Hi,
> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Tuesday, April 29, 2025 10:29 PM
> To: Penny, Zheng <penny.zheng@xxxxxxx>
> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v4 09/15] xen/x86: introduce a new amd cppc driver for
> cpufreq scaling
>
> On 14.04.2025 09:40, Penny Zheng wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +/*
> > + * If CPPC lowest_freq and nominal_freq registers are exposed then we
> > +can
> > + * use them to convert perf to freq and vice versa. The conversion is
> > + * extrapolated as an linear function passing by the 2 points:
> > + * - (Low perf, Low freq)
> > + * - (Nominal perf, Nominal freq)
> > + */
> > +static int amd_cppc_khz_to_perf(const struct amd_cppc_drv_data *data,
> > + unsigned int freq, uint8_t *perf) {
> > + const struct xen_processor_cppc *cppc_data = data->cppc_data;
> > + uint64_t mul, div;
> > + int offset = 0, res;
> > +
> > + if ( freq == (cppc_data->cpc.nominal_mhz * 1000) )
>
> I'm pretty sure I commented on this before: The expression here _suggests_
> that
> "freq" is in kHz, but that's not being made explicit anywhere.
>
Sorry, I may overlook, and I'll be more careful.
I have clarified it in the function title, and maybe it's not enough. I'll
change the parameter
name from "freq" to "freq_khz" to be more explicit.
> > + {
> > + *perf = data->caps.nominal_perf;
> > + return 0;
> > + }
> > +
> > + if ( freq == (cppc_data->cpc.lowest_mhz * 1000) )
> > + {
> > + *perf = data->caps.lowest_perf;
> > + return 0;
> > + }
>
> How likely is it that these two early return paths are taken, when the
> incoming
> "freq" is 25 or 5 MHz granular? IOW - is it relevant to shortcut these two
> cases?
>
Sorry, I may not understand what you mean here.
Answering " How likely is it that these two early return paths are taken "
It's rare ig.... maybe *ondemand* governor will frequently give frequency
around nominal frequency,
but the exact value is rare ig.
I'm confused about " when the incoming "freq" is 25 or 5 MHz granular ".
Are we talking about is it worthy to have these two early return paths
considering it is rarely taken
> > + if ( cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz &&
> > + cppc_data->cpc.lowest_mhz < cppc_data->cpc.nominal_mhz )
>
> Along the lines of a comment on an earlier patch, the middle part of the
> condition
> here is redundant with the 3rd one. Also, don't you check this relation
> already
> during init? IOW isn't it the 3rd part which can be dropped?
>
Yes, you're right. I've checked it in set_cppc_pminfo() already and only gave
warnings there.
I shall delete the check here, and besides giving warning message during init,
if values are
invalid, instead of storing invalid values, we shall set
cppc_data->cpc.lowest_mhz / cppc_data->cpc.nominal_mhz them
zero... Then wherever we are trying to use them, like here, non-zero values
ensures valid values.
> > +static int amd_get_max_freq(const struct amd_cppc_drv_data *data,
> > + unsigned int *max_freq) {
> > + unsigned int nom_freq = 0, boost_ratio;
> > + int res;
> > +
> > + res = amd_get_lowest_or_nominal_freq(data,
> > + data->cppc_data->cpc.nominal_mhz,
> > + data->caps.nominal_perf,
> > + &nom_freq);
> > + if ( res )
> > + return res;
> > +
> > + boost_ratio = (unsigned int)(data->caps.highest_perf /
> > + data->caps.nominal_perf);
>
> I may have seen logic ensuring nominal_perf isn't 0, so that part may be
> fine. What
> guarantees this division to yield a positive value, though?
> If it yields zero (say 0xff / 0x80), ...
>
I think maybe you were saying 0x80/0xff to yield zero value. For that, we
checked that highest_perf
must not be smaller than nominal_perf during init, see ...
> > + *max_freq = nom_freq * boost_ratio;
>
> ... zero will be reported back here. I think you want to scale the
> calculations here to
> avoid such.
>
> > +static void cf_check amd_cppc_init_msrs(void *info) {
> > + struct cpufreq_policy *policy = info;
> > + struct amd_cppc_drv_data *data = this_cpu(amd_cppc_drv_data);
> > + uint64_t val;
> > + unsigned int min_freq = 0, nominal_freq = 0, max_freq;
> > +
> > + /* Package level MSR */
> > + rdmsrl(MSR_AMD_CPPC_ENABLE, val);
> > + /*
> > + * Only when Enable bit is on, the hardware will calculate the
> > processor’s
> > + * performance capabilities and initialize the performance level
> > fields in
> > + * the CPPC capability registers.
> > + */
> > + if ( !(val & AMD_CPPC_ENABLE) )
> > + {
> > + val |= AMD_CPPC_ENABLE;
> > + wrmsrl(MSR_AMD_CPPC_ENABLE, val);
> > + }
> > +
> > + rdmsrl(MSR_AMD_CPPC_CAP1, data->caps.raw);
> > +
> > + if ( data->caps.highest_perf == 0 || data->caps.lowest_perf == 0 ||
> > + data->caps.nominal_perf == 0 || data->caps.lowest_nonlinear_perf
> > == 0 ||
> > + data->caps.lowest_perf > data->caps.lowest_nonlinear_perf ||
>
> Same question as asked elsewhere - where is this relation specified?
>
> > + data->caps.lowest_nonlinear_perf > data->caps.nominal_perf ||
> > + data->caps.nominal_perf > data->caps.highest_perf )
here ...
>
> Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |