[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/4] mwait-idle: add 'preferred_cstates' module argument


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 27 Apr 2022 15:41:24 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=hDt1QrWH1q4K1WrV0YJ+3InjJdZzI0wFqbANhOyodeI=; b=KeIToGRepSL+11LQ7vEBPOl5E+GUp/Lv7ZSXB16ORCHtkKgCVtVXYKJMIz9gtHHfLSURjCLH8wyrlRWngNbGZhIJCtYBFFfBea+Zt75N3KekaiW2k3n2TiPfGO9H9YptwCEpz746yVBbcWJy6bM8KHZ1VHZGN9j8IRicNLASpLk2bz3o/eKLzqFQtn2jy0Bkh5DEPtsJlHFiUdflAMub/25opI/vWmPYsAdUwOxN7hv1yDCjXdvmuOZdhKcRhRa5EpnI4U5WwN5KSG6NNM4I39Y7/hHKY16PVtBpnNuS9wmzqJVNtquNsLzj15I4dEU7lTIwlDM5RoBB4JadzXwZ2g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=hWciRfbw9ivg5I9llN29LMA25ZQM8GPB7LcnBoEjZGEKp7eYMlr75j3H2netaFIjU0Ap1q4LZ/5rE+cBkQeJIlpZ+/wo6mLsYkBWJ2yKIqmWX5UaqMXdWOnfhvwnjJSX3Y1uJK7aQbOqCe73J0/4fWW6XQogPPt6GQai7rCvtJKCc8NHqx3HDcCcfDsS2rC2pdOaEtnprCofZnB3+jHDmgkAHZZtxruC63Opn3ws2fRkJLc8Dx/3BGMBVeIs2Peyn+jSb5i6oGo5rLshR0F5/IYwtmvIyZF4kX9PdFSJftUSemZU4DS7hLRJOjolyK8nwQ4G4bfiF87vZPwZwit4hg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 27 Apr 2022 13:41:34 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.04.2022 14:45, Roger Pau Monné wrote:
> On Tue, Apr 26, 2022 at 12:05:28PM +0200, Jan Beulich wrote:
>> From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>>
>> On Sapphire Rapids Xeon (SPR) the C1 and C1E states are basically mutually
>> exclusive - only one of them can be enabled. By default, 'intel_idle' driver
>> enables C1 and disables C1E. However, some users prefer to use C1E instead of
>> C1, because it saves more energy.
>>
>> This patch adds a new module parameter ('preferred_cstates') for enabling C1E
>> and disabling C1. Here is the idea behind it.
>>
>> 1. This option has effect only for "mutually exclusive" C-states like C1 and
>>    C1E on SPR.
>> 2. It does not have any effect on independent C-states, which do not require
>>    other C-states to be disabled (most states on most platforms as of today).
>> 3. For mutually exclusive C-states, the 'intel_idle' driver always has a
>>    reasonable default, such as enabling C1 on SPR by default. On other
>>    platforms, the default may be different.
>> 4. Users can override the default using the 'preferred_cstates' parameter.
>> 5. The parameter accepts the preferred C-states bit-mask, similarly to the
>>    existing 'states_off' parameter.
>> 6. This parameter is not limited to C1/C1E, and leaves room for supporting
>>    other mutually exclusive C-states, if they come in the future.
>>
>> Today 'intel_idle' can only be compiled-in, which means that on SPR, in order
>> to disable C1 and enable C1E, users should boot with the following kernel
>> argument: intel_idle.preferred_cstates=4
>>
>> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
>> da0e58c038e6
>>
>> Enable C1E (if requested) not only on the BSP's socket / package.
> 
> Maybe we should also add a note here that the command line option for
> Xen is preferred-cstates instead of intel_idle.preferred_cstates?
> 
> I think this is a bad interface however, we should have a more generic
> option (ie: cstate-mode = 'performance | powersave') so that users
> don't have to fiddle with model specific C state masks.

Performance vs powersave doesn't cover it imo, especially if down
the road more states would appear which can be controlled this way.
I don't think there's a way around providing _some_ way to control
things one a per-state level. When porting this over, I too didn't
like this interface very much, but I had no good replacement idea.

>> --- unstable.orig/xen/arch/x86/cpu/mwait-idle.c
>> +++ unstable/xen/arch/x86/cpu/mwait-idle.c
>> @@ -82,6 +82,18 @@ boolean_param("mwait-idle", opt_mwait_id
>>  
>>  static unsigned int mwait_substates;
>>  
>> +/*
>> + * Some platforms come with mutually exclusive C-states, so that if one is
>> + * enabled, the other C-states must not be used. Example: C1 and C1E on
>> + * Sapphire Rapids platform. This parameter allows for selecting the
>> + * preferred C-states among the groups of mutually exclusive C-states - the
>> + * selected C-states will be registered, the other C-states from the 
>> mutually
>> + * exclusive group won't be registered. If the platform has no mutually
>> + * exclusive C-states, this parameter has no effect.
>> + */
>> +static unsigned int __ro_after_init preferred_states_mask;
>> +integer_param("preferred-cstates", preferred_states_mask);
>> +
>>  #define LAPIC_TIMER_ALWAYS_RELIABLE 0xFFFFFFFF
>>  /* Reliable LAPIC Timer States, bit 1 for C1 etc. Default to only C1. */
>>  static unsigned int lapic_timer_reliable_states = (1 << 1);
>> @@ -96,6 +108,7 @@ struct idle_cpu {
>>      unsigned long auto_demotion_disable_flags;
>>      bool byt_auto_demotion_disable_flag;
>>      bool disable_promotion_to_c1e;
>> +    bool enable_promotion_to_c1e;
> 
> I'm confused by those fields, shouldn't we just have:
> promotion_to_c1e = true | false?
> 
> As one field is the negation of the other:
> enable_promotion_to_c1e = !disable_promotion_to_c1e
> 
> I know this is code from Linux, but would like to understand why two
> fields are needed.

This really is a tristate; Linux is now changing their global variable
to an enum, but we don't have an equivalent of that global variable.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.