[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/5] x86/mwait-idle: adjust the SKX C6 parameters if PC6 is disabled


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 18 Jan 2022 11:48:52 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=TxbWUtXZ3qbijCMTcJyNHiBKsRg8gWYdceS9My5WQXw=; b=JojArRsLImT8Qxx/9BbCzdfudEBOD+JPGNuzz1NpjY3Fg79UufqZm27prxmyGHvntzOUw+g3wcRO3qjJ/bUfHPzkR9SSiCyayS3Fpcgjf/5q63T+dpeM98o+rZnSEH3biQw9nTqmdRuffzLy/pcNpjalG0MbW2uVqiQOsAo8boi2oOjcfs/XUjG07b5eA6+IZO4+0NZpQOlk7bA9UTAFDzrymxIiFSWa28WzQ/EOga6JusZh/a0K8Yf40+gUjb4pyDAFg1k6fq31Zwx5w7Z1mPNLUdb+qK3PCvzDAdbCVOt4jQxe317cfhqNRkMg5cWdPpQTobNSkhmYxSEadzY7Bw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TGtXCQQE16ga5/DvAPjomhtrHq1Gj0wZAHhe/xaRrn/2Fmxtri8Yx07Bmxyryxodqs5nG+5BlQOb/PI05fOjfk9Y8NgCZMpY0yqOclf9n0FAUkyd0l7zoUyzLLZSzElQ3eDrrPoMyTa4OxpZqFBydRr4k9b60d6vThFmwWfGz3TaJRf5QJs/Yogd8Xn3FObXKMuWo+MPXsvHtZnH1i3nhtQnR2cNNo3ECrgwNY+U4N2KSOZH755lLSVwY31fvjVSClAVrQ5PerLnSqGUjP87ElHlb/OLw1vMMoJJZJa/I1MCnn8gGRKJY1ETALttTs5KVu8Fg51dK7ruBgYYuv7qMA==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 18 Jan 2022 10:49:39 +0000
  • Ironport-data: A9a23:krLj1K5so0sg7J3etCOOPgxRtNnAchMFZxGqfqrLsTDasY5as4F+v jQeWD2EPqrfMGHxLtF+Ptni/UgBsZLcytU2SlBqrikzHi5G8cbLO4+Ufxz6V8+wwmwvb67FA +E2MISowBUcFyeEzvuV3zyIQUBUjclkfJKlYAL/En03FV8MpBsJ00o5wbZg294w2LBVPivW0 T/Mi5yHULOa82Yc3lI8s8pvfzs24ZweEBtB1rAPTagjUG32zhH5P7pGTU2FFFPqQ5E8IwKPb 72rIIdVXI/u10xF5tuNyt4Xe6CRK1LYFVDmZnF+A8BOjvXez8CbP2lS2Pc0MC9qZzu1c99Z1 ttQn6awaioTJ4advsoaV0JUMCpMMvgTkFPHCSDXXc27ykTHdz3nwul0DVFwNoodkgp1KTgQr 7pCcmlLN03dwbLtqF64YrAEasALNs7kMZlZonh95TrYEewnUdbIRKCiCdpwgm5u1p4fRam2i 8wxRGt/cy7yM0R2fQ0eVZgFu+vztmv+fGgNwL6SjfVuuDWCpOBr65DyNPLFd9rMQt9a9m6Iq 2SD82nnDxUyMN2E1SHD4n+qnvXIny7wRMQVDrLQ3vxgjUCXx2cTIAYLTlb9qv684nNSQPoGd RZSoHB36/Fvqgr7FbERQiFUvlaasAM2YdpzPtcjw1Cr5K3QuAq6DVoLG2sphMMdiOc6Qjkj1 1msltzvBCByvLD9dU9x5ot4vhvpZ3FLcDZqiTssCFJcvoK9+N1bYgfnF447SMaIYsvJ9SYcK txghAw3nP0tgMECzM1XFniX0mv39vAlouPYjzg7v15JDCskPOZJhKTysDA3CMqsyq7DETFtW 1BeyqCjABgmV83lqcB0aLxl8EuVz/iEKibAplVkAoMs8T+gk1b6I9wKvG0idRk1aZxZEdMMX KM1kVkOjHO0FCH7BZKbnqrrU5h6pUQePYmNug/ogipmPcEqKV7vENBGbk+MxWH9+HXAYolkU ap3hf2EVC5AYYw+lWLeb75EjdcDm35irUuOG8GT50n3gNK2OS/OIZ9YYQTmUwzMxP7eyOkj2 4wBZ5LiJtQ2eLCWXxQ7BqZIfA9adiZqVMmmwyGVH8baSjdb9KgaI6a56ZsqepB/nrQTkeHN/ 3qnXVRfxka5jnrCQThmoFg6AF82dZog/389IwI2OlOkhyoqbYq1tf9NfJorZ7g3sudkyKcsH fUCfsyBBNVJSyjGpGtBPcWs8tQ6eUT5nx+KMgqkfCM7I8xqSTvW94K2ZQDo7iQPUHa67JNsv 7262wrHapMfXAA+Xt3OYfeiwgrp73gQke5/RWXSJdxXdBm++YRmMXWp3PQ2P9sNOVPIwT7Dj 1SaBhIRpO/spY4p8YaW2fDY/tnxS+YnRxhUBWjW67qyJBL2xGv7zN8SSvuMcBDcSHjwpPeoa 9JKwqyuK/YAhltL7dZxSu450aIk6tLzjLZG1QA4Tm7TZlGmB748cHmL2c5D6v9EyrND4FbkX 0uO/p9ROKmTOdOjG1kUfVJ3YuOG3PASuz/T8fVqfxmquH4ppOKKARdIIh2BqC1BN78kYooqz NAotNMS9wHi2AEhNcyLj3wM+mmBRpDav37Lan3O7FfXtzcW
  • Ironport-hdrordr: A9a23:9N8fV6DGfp7lLgnlHeg0sceALOsnbusQ8zAXPh9KJiC9I/b1qy nxppkmPH/P6Qr4WBkb6Le90Y27MAnhHPlOkPQs1NaZLXLbUQ6TQr2KgrGSoQEIdxeOk9K1kJ 0QD5SWa+eAfGSS7/yKmTVQeuxIqLLskNHKuQ6d9QYUcegDUdAf0+4TMHf8LqQZfngjOXJvf6 Dsmfav6gDQMkg/X4CePD0oTuLDr9rEmNbPZgMHPQcu7E2rgSmz4LD3PhCE1lNGOgk/jIsKwC zgqUjU96+ju/a0xlv10HLS1Y1fnJ/ExsFYDMKBp8AJInHHixquZq5mR7qe1QpF6t2H2RIPqp 3hsh0gN8N85zf4eXy0mwLk303a3DMn+xbZuCmlqEqmhfa8aCMxCsJHi44cWADe8VAcsNZ117 8O936FtrJMZCmw0hjV1pztbVVHh0C0qX0tnao4lHpES7YTb7dXsMg24F5VKpEdByj3gbpXX9 WGNPuspMq+TGnqLEww5gJUsZ6RtzUIb1u7q3E5y42oO2M8pgE986MarPZv6UvouqhND6Ws3N 60QZiAoos+OvP+XZgNdNvpfvHHeFAlYSi8eV56cm6XXJ3uBRr22uvKCfMOlaaXRKA=
  • Ironport-sdr: BxNoaZOB672/btLLRLCzdxljOBN52+BjBKXu24QRB274o12yHwrsqrG1quVW5UtZxkram6LaQZ K/VlS8gbOdwLujs+I7OD4wCRdwXtjMn26eia536uN3DoyTEbjAS1GVCQczcYh27E+X0B/N7/YR NOIMFss+V9UEe7Im884FAiNQ6c0GLK32n03gHqv67DV2Cd3D+ftNxdkHR3IgkxsO26q/Vz3elo TaTnmsOAta1r2E36t+l6h/9Z5zWKiaquOTSBFp4YNLDK62SW2mxlzfjoyjkceJ+vEimkcH4EyN YQ0Z3q2JTYhTFxNBQBpt3GrR
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Sep 06, 2021 at 03:02:12PM +0200, Jan Beulich wrote:
> From: Chen Yu <yu.c.chen@xxxxxxxxx>
> 
> Because cpuidle assumes worst-case C-state parameters, PC6 parameters
> are used for describing C6, which is worst-case for requesting CC6.
> When PC6 is enabled, this is appropriate. But if PC6 is disabled
> in the BIOS, the exit latency and target residency should be adjusted
> accordingly.
> 
> Exit latency:
> Previously the C6 exit latency was measured as the PC6 exit latency.
> With PC6 disabled, the C6 exit latency should be the one of CC6.
> 
> Target residency:
> With PC6 disabled, the idle duration within [CC6, PC6) would make the
> idle governor choose C1E over C6. This would cause low energy-efficiency.
> We should lower the bar to request C6 when PC6 is disabled.
> 
> To fill this gap, check if PC6 is disabled in the BIOS in the
> MSR_PKG_CST_CONFIG_CONTROL(0xe2) register. If so, use the CC6 exit latency
> for C6 and set target_residency to 3 times of the new exit latency. [This
> is consistent with how intel_idle driver uses _CST to calculate the
> target_residency.] As a result, the OS would be more likely to choose C6
> over C1E when PC6 is disabled, which is reasonable, because if C6 is
> enabled, it implies that the user cares about energy, so choosing C6 more
> frequently makes sense.
> 
> The new CC6 exit latency of 92us was measured with wult[1] on SKX via NIC
> wakeup as the 99.99th percentile. Also CLX and CPX both have the same CPU
> model number as SkX, but their CC6 exit latencies are similar to the SKX
> one, 96us and 89us respectively, so reuse the SKX value for them.
> 
> There is a concern that it might be better to use a more generic approach
> instead of optimizing every platform. However, if the required code
> complexity and different PC6 bit interpretation on different platforms
> are taken into account, tuning the code per platform seems to be an
> acceptable tradeoff.
> 
> Link: 
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fintel.github.io%2Fwult%2F&amp;data=04%7C01%7Croger.pau%40citrix.com%7Ccdf115e71eb14429868408d97136a902%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637665301851513469%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=D9sIoe%2BFAvia3OuRsT19VAxkTqrlnPHnKqTSKiW6pRM%3D&amp;reserved=0
>  # [1]
> Suggested-by: Len Brown <len.brown@xxxxxxxxx>
> Signed-off-by: Chen Yu <yu.c.chen@xxxxxxxxx>
> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
> [ rjw: Subject and changelog edits ]
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> [Linux commit: 64233338499126c5c31e07165735ab5441c7e45a]
> 
> Pull in Linux'es MSR_PKG_CST_CONFIG_CONTROL. Alongside the dropping of
> "const" from skx_cstates[] add __read_mostly, and extend that to other
> similar non-const tables.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> 
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -484,7 +484,7 @@ static const struct cpuidle_state bdw_cs
>       {}
>  };
>  
> -static struct cpuidle_state skl_cstates[] = {
> +static struct cpuidle_state __read_mostly skl_cstates[] = {
>       {
>               .name = "C1-SKL",
>               .flags = MWAIT2flg(0x00),
> @@ -536,7 +536,7 @@ static struct cpuidle_state skl_cstates[
>       {}
>  };
>  
> -static const struct cpuidle_state skx_cstates[] = {
> +static struct cpuidle_state __read_mostly skx_cstates[] = {
>       {
>               .name = "C1-SKX",
>               .flags = MWAIT2flg(0x00),
> @@ -674,7 +674,7 @@ static const struct cpuidle_state knl_cs
>       {}
>  };
>  
> -static struct cpuidle_state bxt_cstates[] = {
> +static struct cpuidle_state __read_mostly bxt_cstates[] = {
>       {
>               .name = "C1-BXT",
>               .flags = MWAIT2flg(0x00),
> @@ -870,9 +870,9 @@ static void auto_demotion_disable(void *
>  {
>       u64 msr_bits;
>  
> -     rdmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits);
> +     rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits);
>       msr_bits &= ~(icpu->auto_demotion_disable_flags);
> -     wrmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits);
> +     wrmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits);
>  }
>  
>  static void byt_auto_demotion_disable(void *dummy)
> @@ -1141,7 +1141,7 @@ static void __init sklh_idle_state_table
>       if ((mwait_substates & (MWAIT_CSTATE_MASK << 28)) == 0)
>               return;
>  
> -     rdmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr);
> +     rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr);
>  
>       /* PC10 is not enabled in PKG C-state limit */
>       if ((msr & 0xF) != 8)
> @@ -1161,6 +1161,36 @@ static void __init sklh_idle_state_table
>  }
>  
>  /*
> + * skx_idle_state_table_update - Adjust the Sky Lake/Cascade Lake
> + * idle states table.
> + */
> +static void __init skx_idle_state_table_update(void)
> +{
> +     unsigned long long msr;
> +
> +     rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr);
> +
> +     /*
> +      * 000b: C0/C1 (no package C-state support)
> +      * 001b: C2
> +      * 010b: C6 (non-retention)
> +      * 011b: C6 (retention)
> +      * 111b: No Package C state limits.
> +      */
> +     if ((msr & 0x7) < 2) {

I wouldn't mind adding this mask field to msr-index.h.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.