[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver



On 26/05/2015 21:58, Jan Beulich wrote
> >>> On 13.05.16 at 09:50, <wei.w.wang@xxxxxxxxx> wrote:
> > +static inline int ceiling_fp(int32_t x) {
> > +    int mask, ret;
> 
> Please here and below, consider whether types really need to be signed.
> One exception: If you intend to import the Linux source file with minimal
> modifications, and if that indeed results in only very few differences, then
> keeping the types as they are is probably fine. But in that case you should
> extend the patch description to say that the goal is to have minimal changes.
> All comments below are to be taken with the possible minimal change goal in
> mind.
> 
> > +static int byt_get_min_pstate(void)
> > +{
> > +    u64 value;
> > +
> > +    rdmsrl(BYT_RATIOS, value);
> > +    return (value >> 8) & 0x7F;
> > +}
> > +
> > +static int byt_get_max_pstate(void)
> > +{
> > +    u64 value;
> > +
> > +    rdmsrl(BYT_RATIOS, value);
> > +    return (value >> 16) & 0x7F;
> > +}
> > +
> > +static int byt_get_turbo_pstate(void) {
> > +    u64 value;
> > +
> > +    rdmsrl(BYT_TURBO_RATIOS, value);
> > +    return value & 0x7F;
> > +}
> > +
> > +static void byt_set_pstate(struct cpudata *cpudata, int pstate) {
> > +    u64 val;
> > +    int32_t vid_fp;
> > +    u32 vid;
> > +
> > +    val = pstate << 8;
> > +    if (limits.no_turbo && !limits.turbo_disabled)
> > +        val |= (u64)1 << 32;
> 
> All of the literal numbers above (and there are more further down) would
> better become self-documenting manifest constants.
> 
> > +#define BYT_BCLK_FREQS 5
> > +static int byt_freq_table[BYT_BCLK_FREQS] = { 833, 1000, 1333, 1167,
> > +800};
> 
> This can be both const and local to the only function it's being used by.
> 
> > +static struct cpu_defaults core_params = {
> 
> const? __initconst?
> 
> > +static struct cpu_defaults byt_params = {
> 
> Same here.
> 
> > +static void intel_pstate_timer_func(void *__data) {
> > +    struct cpudata *cpu = (struct cpudata *) __data;
> 
> Double underscores are completely unnecessary here.
> 
> > +#define ICPU(model, policy) \
> > +    { X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF,\
> > +            (unsigned long)&policy }
> > +
> > +static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
> 
> __initconst
> 
> > +    ICPU(0x2a, core_params),
> > +    ICPU(0x2d, core_params),
> > +    ICPU(0x37, byt_params),
> > +    ICPU(0x3a, core_params),
> > +    ICPU(0x3c, core_params),
> > +    ICPU(0x3d, core_params),
> > +    ICPU(0x3e, core_params),
> > +    ICPU(0x3f, core_params),
> > +    ICPU(0x45, core_params),
> > +    ICPU(0x46, core_params),
> > +    ICPU(0x47, core_params),
> > +    ICPU(0x4c, byt_params),
> > +    ICPU(0x4e, core_params),
> > +    ICPU(0x4f, core_params),
> > +    ICPU(0x56, core_params),
> 
> Please handle the _params name tag inside ICPU().
> 
> > +static int intel_pstate_set_policy(struct cpufreq_policy *policy) {
> > +    if (!policy->cpuinfo.max_freq)
> > +        return -ENODEV;
> > +
> > +    if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
> 
> switch(policy->policy) please.
> 
> > +        limits.no_turbo = 0;
> > +        limits.max_perf_pct = 100;
> > +        limits.max_perf = int_tofp(1);
> > +        limits.min_perf_pct = 100;
> > +        limits.min_perf = int_tofp(1);
> > +        policy->max_perf_pct = 100;
> > +        policy->min_perf_pct = 100;
> > +        return 0;
> 
> And no need for identical return statement in all four branches.
> 
> > +    } else if (policy->policy == CPUFREQ_POLICY_USERSPACE) {
> > +        limits.max_perf_pct = max(limits.min_policy_pct, policy-
> >max_perf_pct);
> > +        limits.max_perf_pct = min(limits.max_policy_pct,
> > + limits.max_perf_pct);
> 
> Why are you not using clamp() here?
> 
> > +    } else {
> 
> This should at least have a comment saying CPUFREQ_POLICY_ONDEMAND.
> 
> > +static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
> > +{
> > +    cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> > +                                     policy->cpuinfo.max_freq);
> > +
> > +    if ( policy->policy != CPUFREQ_POLICY_POWERSAVE   &&
> > +         policy->policy != CPUFREQ_POLICY_PERFORMANCE &&
> > +         policy->policy != CPUFREQ_POLICY_USERSPACE   &&
> > +         policy->policy != CPUFREQ_POLICY_ONDEMAND )
> 
> switch()

How would we use switch() here? 

> 
> > +static int intel_pstate_cpu_init(struct cpufreq_policy *policy) {
> > +    struct cpudata *cpu;
> > +    int rc;
> > +
> > +    rc = intel_pstate_init_cpu(policy->cpu);
> 
> Having both intel_pstate_cpu_init() and intel_pstate_init_cpu() is at least
> odd, the more that the latter is being called from only here.

Are you suggesting to change the function name? If so, how about changing 
intel_pstate_cpu_init() to intel_pstate_setup()?

> 
> > +    if (rc)
> > +        return rc;
> > +
> > +    cpu = all_cpu_data[policy->cpu];
> > +    if (limits.min_perf_pct == 100 && limits.max_perf_pct == 100)
> > +        policy->policy = CPUFREQ_POLICY_PERFORMANCE;
> > +    else
> > +        policy->policy = CPUFREQ_POLICY_ONDEMAND;
> > +
> > +    policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling;
> > +    policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling;
> > +    policy->min_perf_pct = 0;
> > +    policy->max_perf_pct = 100;
> > +    policy->turbo_pct = get_turbo_pct();
> > +
> > +    /* cpuinfo and default policy values */
> > +    policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu-
> >pstate.scaling;
> > +    policy->cpuinfo.max_freq =
> > +        cpu->pstate.turbo_pstate * cpu->pstate.scaling;
> > +    policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
> > +    cpumask_set_cpu(policy->cpu, policy->cpus);
> > +
> > +    /* the first cpu do the init work for limits.min/max_policy_pct */
> > +    if (cpu->cpu == 0) {
> 
> cpu->cpu == 0 doesn't mean this is the first CPU to come here.

How about this one:

If (limits.min_policy_pct == 0) {
        limits.min_policy_pct = ....
        limits.xx = ....
}

limits.min_policy_pct won't be 0 if it is initialized.


> > +static int intel_pstate_msrs_not_valid(void) {
> > +    /* Check that all the msr's we are using are valid. */
> > +    u64 aperf, mperf, tmp;
> > +
> > +    rdmsrl(MSR_IA32_APERF, aperf);
> > +    rdmsrl(MSR_IA32_MPERF, mperf);
> > +
> > +    if (!pstate_funcs.get_max() ||
> > +        !pstate_funcs.get_min() ||
> > +        !pstate_funcs.get_turbo())
> > +        return -ENODEV;
> > +
> > +    rdmsrl(MSR_IA32_APERF, tmp);
> > +    if (!(tmp - aperf))
> 
> Why not "if(tmp == aperf)"? And - is it guaranteed that APERF (and MPERF
> below) incremented in the meantime?

Will change it to "if (tmp == aperf)".
According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to a fixed 
frequency, and IA32_APERF MSR increments in proportion to actual performance. 
So, as long as the two MSRs are valid, their values will be increased.


> 
> > +static void copy_pid_params(struct pstate_adjust_policy *policy)
> 
> __init
> 
> > +static void copy_cpu_funcs(struct pstate_funcs *funcs)
> 
> Again.
> 
> > +int __init intel_pstate_init(void)
> > +{
> > +    int cpu, rc = 0;
> > +    const struct x86_cpu_id *id;
> > +    struct cpu_defaults *cpu_info;
> > +
> > +    if (cpuid_ecx(6) & 0x1)
> > +        set_bit(X86_FEATURE_APERFMPERF,
> > + &boot_cpu_data.x86_capability);
> 
> This should be consolidated out of the other cpufreq drivers into
> cpu/common.c.

How about moving it to the bottom of init_intel() in cpu/intel.c ?

> 
> > +
> > +    id = x86_match_cpu(intel_pstate_cpu_ids);
> > +    if (!id)
> > +        return -ENODEV;
> > +
> > +    cpu_info = (struct cpu_defaults *)id->driver_data;
> > +
> > +    copy_pid_params(&cpu_info->pid_policy);
> > +    copy_cpu_funcs(&cpu_info->funcs);
> > +    if (intel_pstate_msrs_not_valid())
> > +        return -ENODEV;
> > +
> > +    all_cpu_data = xzalloc_array(struct cpudata *,
> > + num_online_cpus());
> 
> This looks suspicious considering CPU hotplug, the more that this is the only
> place it gets allocated.


Will change it to " all_cpu_data = xzalloc_array(struct cpudata *, 
num_possible_cpus())"


> 
> Also please get this file formatted properly - either in Linux style or in Xen
> style, but not an arbitrary mixture of both.
> 
> > --- a/xen/include/asm-x86/acpi.h
> > +++ b/xen/include/asm-x86/acpi.h
> > @@ -32,6 +32,8 @@
> >  #define COMPILER_DEPENDENT_INT64   long long
> >  #define COMPILER_DEPENDENT_UINT64  unsigned long long
> >
> > +extern int intel_pstate_init(void);
> 
> Why in acpi.h?

I was thinking that p-state is part of ACPI. Do you prefer to create a new .h 
called cpufreq.h, just like the cpuidle.h there?

Best,
Wei

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.