[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 15/15] xen/xenpm: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
On 22.05.2025 07:59, Penny, Zheng wrote: > [Public] > >> -----Original Message----- >> From: Jan Beulich <jbeulich@xxxxxxxx> >> Sent: Wednesday, April 30, 2025 11:02 PM >> To: Penny, Zheng <penny.zheng@xxxxxxx> >> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper >> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen- >> devel@xxxxxxxxxxxxxxxxxxxx >> Subject: Re: [PATCH v4 15/15] xen/xenpm: Adapt SET/GET_CPUFREQ_CPPC >> xen_sysctl_pm_op for amd-cppc driver >> >> On 14.04.2025 09:40, Penny Zheng wrote: >>> + /* Only allow values if params bit is set. */ >>> + if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) && >>> + set_cppc->desired) || >>> + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) && >>> + set_cppc->minimum) || >>> + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) && >>> + set_cppc->maximum) || >>> + (!(set_cppc->set_params & >> XEN_SYSCTL_CPPC_SET_ENERGY_PERF) && >>> + set_cppc->energy_perf) ) >>> + return -EINVAL; >>> + >>> + /* Activity window not supported in MSR */ >>> + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ACT_WINDOW ) >>> + return -EOPNOTSUPP; >>> + >>> + /* Return if there is nothing to do. */ >>> + if ( set_cppc->set_params == 0 ) >>> + return 0; >>> + >>> + epp = per_cpu(epp_init, cpu); >>> + /* >>> + * Apply presets: >>> + * XEN_SYSCTL_CPPC_SET_DESIRED reflects whether desired perf is set, >> which >>> + * is also the flag to distiguish between passive mode and active mode. >>> + * When it is set, CPPC in passive mode, only >>> + * XEN_SYSCTL_CPPC_SET_PRESET_NONE could be chosen, where >> min_perf = >>> + * lowest_nonlinear_perf to ensures performance in P-state range. >>> + * when it is not set, CPPC in active mode, three different profile >>> + * >> XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE/PERFORMANCE/BALANC >> E are provided, >>> + * where min_perf = lowest_perf having T-state range of performance. >>> + */ >> >> I fear I'm struggling to parse some of this, making it difficult to suggest >> possible >> adjustments (as I can't derive what is meant to be said). Plus where's the >> term T- >> state coming from all of the sudden? >> > > Pasting description on "lowest_perf" and "nonlinear_lowest_perf": > " > Lowest Nonlinear Performance is the lowest performance level at which > nonlinear power savings are achieved, for example, due to the combined > effects of voltage and frequency scaling. Above this threshold, lower > performance levels should be generally more energy efficient than higher > performance levels. In traditional terms, this represents the P-state range > of performance levels. > > Lowest Performance is the absolute lowest performance level of the platform. > Selecting a performance level lower than the lowest nonlinear performance > level may actually cause an efficiency penalty, but should reduce the > instantaneous power consumption of the processor. In traditional terms, this > represents the T-state range of performance levels. > " And again "T-state". The only T-ish thing I'm aware of is something that was never implemented in Xen's power management. Hence I fear I continue to be confused. (And btw, I searched PM vol 2 for the term, and it doesn't occur there at all.) As a result ... > IMO, in passive mode, we rely on Xen governor to do performance tuning. And > Xen governor is thinking based on P-states. So I take non-linear lowest perf > as min_perf > While in active mode, hardware itself is calculating suitable > performance/frequency automatically based on workload, thermal, voltage, and > etc. So maybe setting min_perf with lowest perf is a better choice? ...answering this question isn't possible for me (yet). >>> + ret = get_amd_cppc_para(policy->cpu, >>> + &op->u.get_para.u.cppc_para); >>> + >>> + if ( strncmp(op->u.get_para.scaling_driver, XEN_HWP_DRIVER_NAME, >>> + CPUFREQ_NAME_LEN) && >>> + strncmp(op->u.get_para.scaling_driver, >> XEN_AMD_CPPC_EPP_DRIVER_NAME, >>> + CPUFREQ_NAME_LEN) ) >>> { >>> if ( !(scaling_available_governors = >>> xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) ) >> >> Isn't it the non-EPP driver which is governor-independent? >> > > EPP driver is governor-independent, indicating active mode. I'm confused here as well then: Early in the series, before the EPP driver is introduced (i.e. as preparation for the non-EPP one) you suppress use and reporting of the governor. Whereas for the EPP driver while you (also) don't use the governor as such, you use the choice of governor to determine the EPP setting. In that sense to me the EPP driver is less governor- independent than the non-EPP one. That's what I tried to express in my earlier reply. Jan > Hardware itself will automatically calculate and set frequency. User only > shall set min_perf, max_perf, and EPP at the beginning. > EPP is a preference ratio value towards performance over powersave, on the > scale of 0-255, 0 as 100% percentage favoring performance, while 255 as 100% > percentage favoring powersave > Non-EPP driver is still relying on Xen governor to do the tuning. > >> Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |