|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver for cpufreq scaling
On 04.07.2025 09:23, Penny, Zheng wrote:
> [Public]
>
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: Friday, July 4, 2025 2:21 PM
>> To: Penny, Zheng <penny.zheng@xxxxxxx>
>> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper
>> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-
>> devel@xxxxxxxxxxxxxxxxxxxx
>> Subject: Re: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver
>> for
>> cpufreq scaling
>>
>> On 04.07.2025 05:40, Penny, Zheng wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Sent: Wednesday, July 2, 2025 6:48 PM
>>>>
>>>> On 02.07.2025 11:49, Penny, Zheng wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>> Sent: Tuesday, June 17, 2025 12:00 AM
>>>>>> To: Penny, Zheng <penny.zheng@xxxxxxx>
>>>>>>
>>>>>> On 27.05.2025 10:48, Penny Zheng wrote:
>>>>>>> +static int cf_check amd_cppc_cpufreq_target(struct cpufreq_policy
>>>>>>> *policy,
>>>>>>> + unsigned int target_freq,
>>>>>>> + unsigned int relation) {
>>>>>>> + unsigned int cpu = policy->cpu;
>>>>>>> + const struct amd_cppc_drv_data *data =
>>>>>>> +per_cpu(amd_cppc_drv_data,
>>>> cpu);
>>>>>>> + uint8_t des_perf;
>>>>>>> + int res;
>>>>>>> +
>>>>>>> + if ( unlikely(!target_freq) )
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + res = amd_cppc_khz_to_perf(data, target_freq, &des_perf);
>>>>>>> + if ( res )
>>>>>>> + return res;
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Setting with "lowest_nonlinear_perf" to ensure governoring
>>>>>>> + * performance in P-state range.
>>>>>>> + */
>>>>>>> + amd_cppc_write_request(policy->cpu, data-
>>> caps.lowest_nonlinear_perf,
>>>>>>> + des_perf, data->caps.highest_perf);
>>>>>>
>>>>>> I fear I don't understand the comment, and hence it remains unclear
>>>>>> to me why lowest_nonlinear_perf is being used here.
>>>>>
>>>>> How about
>>>>> ```
>>>>> Choose lowest nonlinear performance as the minimum performance level
>>>>> at which
>>>> the platform may run.
>>>>> Lowest nonlinear performance is the lowest performance level at
>>>>> which nonlinear power savings are achieved, Above this threshold,
>>>>> lower performance
>>>> levels should be generally more energy efficient than higher performance
>>>> levels.
>>>>> ```
>>>>
>>>> I finally had to go to the ACPI spec to understand what this is
>>>> about. There looks to be an implication that lowest <=
>>>> lowest_nonlinear, and states in that range would correspond more to
>>>> T-states than to P-states. With that I think I agree with the use
>>>
>>> Yes, It doesn't have definitive conclusion about relation between
>>> lowest and lowest_nonlinear In our internal FW designed spec, it
>>> always shows lowest_nonlinear corresponds to P2
>>>
>>>> of lowest_nonlinear here. The comment, however, could do with moving
>>>> farther away from merely quoting the pretty abstract text in the ACPI
>>>> spec, as such quoting doesn't help in clarifying terminology used,
>>>> when that terminology also isn't explained anywhere else in the code base.
>>>
>>>
>>> How about we add detailed explanations about each terminology in the
>>> beginning declaration , see:
>>> ```
>>> +/*
>>> + * Field highest_perf, nominal_perf, lowest_nonlinear_perf, and
>>> +lowest_perf
>>> + * contain the values read from CPPC capability MSR.
>>> + * Field highest_perf represents highest performance, which is the
>>> +absolute
>>> + * maximum performance an individual processor may reach, assuming
>>> +ideal
>>> + * conditions
>>> + * Field nominal_perf represents maximum sustained performance level
>>> +of the
>>> + * processor, assuming ideal operating conditions.
>>> + * Field lowest_nonlinear_perf represents Lowest Nonlinear
>>> +Performance, which
>>> + * is the lowest performance level at which nonlinear power savings
>>> +are
>>> + * achieved. Above this threshold, lower performance levels should be
>>> + * generally more energy efficient than higher performance levels.
>>
>> Which is still only the vague statement also found in the spec. This says
>> nothing
>> about what happens below that level, or what the relationship to other
>> fields is.
>>
>>> + * Field lowest_perf represents the absolute lowest performance level
>>> +of the
>>> + * platform.
>>> + *
>>> + * Field max_perf, min_perf, des_perf store the values for CPPC request
>>> MSR.
>>> + * Field max_perf conveys the maximum performance level at which the
>>> +platform
>>> + * may run. And it may be set to any performance value in the range
>>> + * [lowest_perf, highest_perf], inclusive.
>>> + * Field min_perf conveys the minimum performance level at which the
>>> +platform
>>> + * may run. And it may be set to any performance value in the range
>>> + * [lowest_perf, highest_perf], inclusive but must be less than or
>>> +equal to
>>> + * max_perf.
>>> + * Field des_perf conveys performance level Xen is requesting. And it
>>> +may be
>>> + * set to any performance value in the range [min_perf, max_perf],
>>> inclusive.
>>> + */
>>> +struct amd_cppc_drv_data
>>> +{
>>> + const struct xen_processor_cppc *cppc_data;
>>> + union {
>>> + uint64_t raw;
>>> + struct {
>>> + unsigned int lowest_perf:8;
>>> + unsigned int lowest_nonlinear_perf:8;
>>> + unsigned int nominal_perf:8;
>>> + unsigned int highest_perf:8;
>>> + unsigned int :32;
>>> + };
>>> + } caps;
>>> + union {
>>> + uint64_t raw;
>>> + struct {
>>> + unsigned int max_perf:8;
>>> + unsigned int min_perf:8;
>>> + unsigned int des_perf:8;
>>> + unsigned int epp:8;
>>> + unsigned int :32;
>>> + };
>>> + } req;
>>> +
>>> + int err;
>>> +};
>>> ``
>>> Then here, we could elaborate the reason why we choose lowest_nonlinear_perf
>> over lowest_perf:
>>> ```
>>> + /*
>>> + * Having a performance level lower than the lowest nonlinear
>>> + * performance level, such as, lowest_perf <= perf <=
>>> lowest_nonliner_perf,
>>> + * may actually cause an efficiency penalty, So when deciding the
>>> min_perf
>>> + * value, we prefer lowest nonlinear performance over lowest
>>> performance
>>> + */
>>> + amd_cppc_write_request(policy->cpu, data->caps.lowest_nonlinear_perf,
>>> + des_perf, data->caps.highest_perf);
>>> ```
>>
>> This reads fine to me.
>>
>> Question then is though: Is setting lowest_perf as the low boundary a good
>> idea in
>> any of the places? (Iirc it is used in one or two places. Or am I
>> misremembering?)
>
> Yes, in active mode, I choose lowest_perf as min_perf to try to extend the
> limitation for active(autonomous) mode
> Maybe it is not a good choice. Maybe cpufreq driver is limited to do
> performance tuning in P-states range.
Indeed I think we should limit ourselves to P-state management; management of
T-states
was never introduced into Xen, so far. But please be sure to make the
connection to P-
and T-states in the commentary you add.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |