|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver for cpufreq scaling
[Public]
> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Wednesday, July 2, 2025 6:48 PM
> To: Penny, Zheng <penny.zheng@xxxxxxx>
> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver for
> cpufreq scaling
>
> On 02.07.2025 11:49, Penny, Zheng wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@xxxxxxxx>
> >> Sent: Tuesday, June 17, 2025 12:00 AM
> >> To: Penny, Zheng <penny.zheng@xxxxxxx>
> >>
> >> On 27.05.2025 10:48, Penny Zheng wrote:
> >>> --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> >>> +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> >>> + /*
> >>> + * We don't need to convert to kHz for computing offset and can
> >>> + * directly use nominal_mhz and lowest_mhz as the division
> >>> + * will remove the frequency unit.
> >>> + */
> >>> + offset = data->caps.nominal_perf -
> >>> + (mul * cppc_data->cpc.nominal_mhz) / div;
> >>> + }
> >>> + else
> >>> + {
> >>> + /* Read Processor Max Speed(MHz) as anchor point */
> >>> + mul = data->caps.highest_perf;
> >>> + div = this_cpu(pxfreq_mhz);
> >>> + if ( !div )
> >>> + return -EINVAL;
> >>
> >> What's wrong about the function arguments in this case? (Same
> >> question again on further uses of EINVAL below.)
> >
> > If we could not get processor max frequency, the whole function is
> > useless...
> > Maybe -EOPNOTSUPP is more suitable than -EINVAL;
>
> I don't like EOPNOTSUPP very much either for the purpose, but it's surely
> better
> than EINVAL.
>
> >>> +static int cf_check amd_cppc_cpufreq_target(struct cpufreq_policy
> >>> *policy,
> >>> + unsigned int target_freq,
> >>> + unsigned int relation) {
> >>> + unsigned int cpu = policy->cpu;
> >>> + const struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data,
> cpu);
> >>> + uint8_t des_perf;
> >>> + int res;
> >>> +
> >>> + if ( unlikely(!target_freq) )
> >>> + return 0;
> >>> +
> >>> + res = amd_cppc_khz_to_perf(data, target_freq, &des_perf);
> >>> + if ( res )
> >>> + return res;
> >>> +
> >>> + /*
> >>> + * Setting with "lowest_nonlinear_perf" to ensure governoring
> >>> + * performance in P-state range.
> >>> + */
> >>> + amd_cppc_write_request(policy->cpu, data->caps.lowest_nonlinear_perf,
> >>> + des_perf, data->caps.highest_perf);
> >>
> >> I fear I don't understand the comment, and hence it remains unclear
> >> to me why lowest_nonlinear_perf is being used here.
> >
> > How about
> > ```
> > Choose lowest nonlinear performance as the minimum performance level at
> > which
> the platform may run.
> > Lowest nonlinear performance is the lowest performance level at which
> > nonlinear power savings are achieved, Above this threshold, lower
> > performance
> levels should be generally more energy efficient than higher performance
> levels.
> > ```
>
> I finally had to go to the ACPI spec to understand what this is about. There
> looks to
> be an implication that lowest <= lowest_nonlinear, and states in that range
> would
> correspond more to T-states than to P-states. With that I think I agree with
> the use
Yes, It doesn't have definitive conclusion about relation between lowest and
lowest_nonlinear
In our internal FW designed spec, it always shows lowest_nonlinear corresponds
to P2
> of lowest_nonlinear here. The comment, however, could do with moving farther
> away from merely quoting the pretty abstract text in the ACPI spec, as such
> quoting doesn't help in clarifying terminology used, when that terminology
> also isn't
> explained anywhere else in the code base.
How about we add detailed explanations about each terminology in the beginning
declaration , see:
```
+/*
+ * Field highest_perf, nominal_perf, lowest_nonlinear_perf, and lowest_perf
+ * contain the values read from CPPC capability MSR.
+ * Field highest_perf represents highest performance, which is the absolute
+ * maximum performance an individual processor may reach, assuming ideal
+ * conditions
+ * Field nominal_perf represents maximum sustained performance level of the
+ * processor, assuming ideal operating conditions.
+ * Field lowest_nonlinear_perf represents Lowest Nonlinear Performance, which
+ * is the lowest performance level at which nonlinear power savings are
+ * achieved. Above this threshold, lower performance levels should be
+ * generally more energy efficient than higher performance levels.
+ * Field lowest_perf represents the absolute lowest performance level of the
+ * platform.
+ *
+ * Field max_perf, min_perf, des_perf store the values for CPPC request MSR.
+ * Field max_perf conveys the maximum performance level at which the platform
+ * may run. And it may be set to any performance value in the range
+ * [lowest_perf, highest_perf], inclusive.
+ * Field min_perf conveys the minimum performance level at which the platform
+ * may run. And it may be set to any performance value in the range
+ * [lowest_perf, highest_perf], inclusive but must be less than or equal to
+ * max_perf.
+ * Field des_perf conveys performance level Xen is requesting. And it may be
+ * set to any performance value in the range [min_perf, max_perf], inclusive.
+ */
+struct amd_cppc_drv_data
+{
+ const struct xen_processor_cppc *cppc_data;
+ union {
+ uint64_t raw;
+ struct {
+ unsigned int lowest_perf:8;
+ unsigned int lowest_nonlinear_perf:8;
+ unsigned int nominal_perf:8;
+ unsigned int highest_perf:8;
+ unsigned int :32;
+ };
+ } caps;
+ union {
+ uint64_t raw;
+ struct {
+ unsigned int max_perf:8;
+ unsigned int min_perf:8;
+ unsigned int des_perf:8;
+ unsigned int epp:8;
+ unsigned int :32;
+ };
+ } req;
+
+ int err;
+};
``
Then here, we could elaborate the reason why we choose lowest_nonlinear_perf
over lowest_perf:
```
+ /*
+ * Having a performance level lower than the lowest nonlinear
+ * performance level, such as, lowest_perf <= perf <= lowest_nonliner_perf,
+ * may actually cause an efficiency penalty, So when deciding the min_perf
+ * value, we prefer lowest nonlinear performance over lowest performance
+ */
+ amd_cppc_write_request(policy->cpu, data->caps.lowest_nonlinear_perf,
+ des_perf, data->caps.highest_perf);
```
>
> Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |