|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
>>> On 29.05.15 at 10:19, <wei.w.wang@xxxxxxxxx> wrote:
> On 26/05/2015 21:58, Jan Beulich wrote
>> >>> On 13.05.16 at 09:50, <wei.w.wang@xxxxxxxxx> wrote:
>> > +static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
>> > +{
>> > + cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
>> > + policy->cpuinfo.max_freq);
>> > +
>> > + if ( policy->policy != CPUFREQ_POLICY_POWERSAVE &&
>> > + policy->policy != CPUFREQ_POLICY_PERFORMANCE &&
>> > + policy->policy != CPUFREQ_POLICY_USERSPACE &&
>> > + policy->policy != CPUFREQ_POLICY_ONDEMAND )
>>
>> switch()
>
> How would we use switch() here?
switch ( policy->policy )
{
case CPUFREQ_POLICY_POWERSAVE:
etc. But I thought that to be obvious, so I'm not sure I understand
what you don't understand.
>> > +static int intel_pstate_cpu_init(struct cpufreq_policy *policy) {
>> > + struct cpudata *cpu;
>> > + int rc;
>> > +
>> > + rc = intel_pstate_init_cpu(policy->cpu);
>>
>> Having both intel_pstate_cpu_init() and intel_pstate_init_cpu() is at least
>> odd, the more that the latter is being called from only here.
>
> Are you suggesting to change the function name? If so, how about changing
> intel_pstate_cpu_init() to intel_pstate_setup()?
Either that, or fold the called function into the caller.
>> > + if (rc)
>> > + return rc;
>> > +
>> > + cpu = all_cpu_data[policy->cpu];
>> > + if (limits.min_perf_pct == 100 && limits.max_perf_pct == 100)
>> > + policy->policy = CPUFREQ_POLICY_PERFORMANCE;
>> > + else
>> > + policy->policy = CPUFREQ_POLICY_ONDEMAND;
>> > +
>> > + policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling;
>> > + policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling;
>> > + policy->min_perf_pct = 0;
>> > + policy->max_perf_pct = 100;
>> > + policy->turbo_pct = get_turbo_pct();
>> > +
>> > + /* cpuinfo and default policy values */
>> > + policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu-
>> >pstate.scaling;
>> > + policy->cpuinfo.max_freq =
>> > + cpu->pstate.turbo_pstate * cpu->pstate.scaling;
>> > + policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
>> > + cpumask_set_cpu(policy->cpu, policy->cpus);
>> > +
>> > + /* the first cpu do the init work for limits.min/max_policy_pct */
>> > + if (cpu->cpu == 0) {
>>
>> cpu->cpu == 0 doesn't mean this is the first CPU to come here.
>
> How about this one:
>
> If (limits.min_policy_pct == 0) {
> limits.min_policy_pct = ....
> limits.xx = ....
> }
>
> limits.min_policy_pct won't be 0 if it is initialized.
If that's guaranteed - fine with me.
>> > +static int intel_pstate_msrs_not_valid(void) {
>> > + /* Check that all the msr's we are using are valid. */
>> > + u64 aperf, mperf, tmp;
>> > +
>> > + rdmsrl(MSR_IA32_APERF, aperf);
>> > + rdmsrl(MSR_IA32_MPERF, mperf);
>> > +
>> > + if (!pstate_funcs.get_max() ||
>> > + !pstate_funcs.get_min() ||
>> > + !pstate_funcs.get_turbo())
>> > + return -ENODEV;
>> > +
>> > + rdmsrl(MSR_IA32_APERF, tmp);
>> > + if (!(tmp - aperf))
>>
>> Why not "if(tmp == aperf)"? And - is it guaranteed that APERF (and MPERF
>> below) incremented in the meantime?
>
> Will change it to "if (tmp == aperf)".
> According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to a
> fixed frequency, and IA32_APERF MSR increments in proportion to actual
> performance. So, as long as the two MSRs are valid, their values will be
> increased.
The "in proportion" is what makes me nervous: What if the proportion
is 1 in every 1000 cycles?
>> > +int __init intel_pstate_init(void)
>> > +{
>> > + int cpu, rc = 0;
>> > + const struct x86_cpu_id *id;
>> > + struct cpu_defaults *cpu_info;
>> > +
>> > + if (cpuid_ecx(6) & 0x1)
>> > + set_bit(X86_FEATURE_APERFMPERF,
>> > + &boot_cpu_data.x86_capability);
>>
>> This should be consolidated out of the other cpufreq drivers into
>> cpu/common.c.
>
> How about moving it to the bottom of init_intel() in cpu/intel.c ?
It's not Intel specific, so it belongs in cpu/common.c.
>> > --- a/xen/include/asm-x86/acpi.h
>> > +++ b/xen/include/asm-x86/acpi.h
>> > @@ -32,6 +32,8 @@
>> > #define COMPILER_DEPENDENT_INT64 long long
>> > #define COMPILER_DEPENDENT_UINT64 unsigned long long
>> >
>> > +extern int intel_pstate_init(void);
>>
>> Why in acpi.h?
>
> I was thinking that p-state is part of ACPI. Do you prefer to create a new .h
> called cpufreq.h, just like the cpuidle.h there?
ACPI is a way to convey information about P- (and C- and T-) states,
but the latter are imo not tied to ACPI. In fact your patch series here
proves the opposite: You add code dealing with P-states without
using information coming from ACPI. I think this should go into what
currently is acpi/cpufreq/cpufreq.h, which is expected to be moved
out of acpi/ anyway (I seem to even recall an ARM side series aiming
at doing that).
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |