|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 3/4] mwait-idle: add 'preferred_cstates' module argument
On Tue, Apr 26, 2022 at 12:05:28PM +0200, Jan Beulich wrote:
> From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>
> On Sapphire Rapids Xeon (SPR) the C1 and C1E states are basically mutually
> exclusive - only one of them can be enabled. By default, 'intel_idle' driver
> enables C1 and disables C1E. However, some users prefer to use C1E instead of
> C1, because it saves more energy.
>
> This patch adds a new module parameter ('preferred_cstates') for enabling C1E
> and disabling C1. Here is the idea behind it.
>
> 1. This option has effect only for "mutually exclusive" C-states like C1 and
> C1E on SPR.
> 2. It does not have any effect on independent C-states, which do not require
> other C-states to be disabled (most states on most platforms as of today).
> 3. For mutually exclusive C-states, the 'intel_idle' driver always has a
> reasonable default, such as enabling C1 on SPR by default. On other
> platforms, the default may be different.
> 4. Users can override the default using the 'preferred_cstates' parameter.
> 5. The parameter accepts the preferred C-states bit-mask, similarly to the
> existing 'states_off' parameter.
> 6. This parameter is not limited to C1/C1E, and leaves room for supporting
> other mutually exclusive C-states, if they come in the future.
>
> Today 'intel_idle' can only be compiled-in, which means that on SPR, in order
> to disable C1 and enable C1E, users should boot with the following kernel
> argument: intel_idle.preferred_cstates=4
>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> da0e58c038e6
>
> Enable C1E (if requested) not only on the BSP's socket / package.
Maybe we should also add a note here that the command line option for
Xen is preferred-cstates instead of intel_idle.preferred_cstates?
I think this is a bad interface however, we should have a more generic
option (ie: cstate-mode = 'performance | powersave') so that users
don't have to fiddle with model specific C state masks.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>
> --- unstable.orig/docs/misc/xen-command-line.pandoc 2022-04-25
> 17:59:42.123387258 +0200
> +++ unstable/docs/misc/xen-command-line.pandoc 2022-04-25
> 17:36:00.000000000 +0200
> @@ -1884,6 +1884,12 @@ paging controls access to usermode addre
> ### ple_window (Intel)
> > `= <integer>`
>
> +### preferred-cstates (x86)
> +> `= <integer>`
> +
> +This is a mask of C-states which are to be use preferably. This option is
> +applicable only oh hardware were certain C-states are exlusive of one
> another.
> +
> ### psr (Intel)
> > `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> |
> cos_max:<integer> | cdp:<boolean> )`
>
> --- unstable.orig/xen/arch/x86/cpu/mwait-idle.c 2022-04-25
> 17:17:05.000000000 +0200
> +++ unstable/xen/arch/x86/cpu/mwait-idle.c 2022-04-25 17:33:47.000000000
> +0200
> @@ -82,6 +82,18 @@ boolean_param("mwait-idle", opt_mwait_id
>
> static unsigned int mwait_substates;
>
> +/*
> + * Some platforms come with mutually exclusive C-states, so that if one is
> + * enabled, the other C-states must not be used. Example: C1 and C1E on
> + * Sapphire Rapids platform. This parameter allows for selecting the
> + * preferred C-states among the groups of mutually exclusive C-states - the
> + * selected C-states will be registered, the other C-states from the mutually
> + * exclusive group won't be registered. If the platform has no mutually
> + * exclusive C-states, this parameter has no effect.
> + */
> +static unsigned int __ro_after_init preferred_states_mask;
> +integer_param("preferred-cstates", preferred_states_mask);
> +
> #define LAPIC_TIMER_ALWAYS_RELIABLE 0xFFFFFFFF
> /* Reliable LAPIC Timer States, bit 1 for C1 etc. Default to only C1. */
> static unsigned int lapic_timer_reliable_states = (1 << 1);
> @@ -96,6 +108,7 @@ struct idle_cpu {
> unsigned long auto_demotion_disable_flags;
> bool byt_auto_demotion_disable_flag;
> bool disable_promotion_to_c1e;
> + bool enable_promotion_to_c1e;
I'm confused by those fields, shouldn't we just have:
promotion_to_c1e = true | false?
As one field is the negation of the other:
enable_promotion_to_c1e = !disable_promotion_to_c1e
I know this is code from Linux, but would like to understand why two
fields are needed.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |