[PATCH linux-next v2] x86/xen/time: prefer tsc as clocksource when it is invariant


  • To: Juergen Gross <jgross@xxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • From: Krister Johansen <kjlx@xxxxxxxxxxxxxxxxxx>
  • Date: Mon, 12 Dec 2022 08:05:24 -0800
  • Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, x86@xxxxxxxxxx, "H. Peter Anvin" <hpa@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, Marcelo Tosatti <mtosatti@xxxxxxxxxx>, Anthony Liguori <aliguori@xxxxxxxxxx>, David Reaver <me@xxxxxxxxxxxxxxx>, Brendan Gregg <brendan@xxxxxxxxx>
  • Delivery-date: Mon, 12 Dec 2022 16:05:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

KVM elects to use tsc instead of kvm-clock when it can detect that the
TSC is invariant (as of commit 7539b174aef4 ("x86: kvmguest: use TSC
clocksource if invariant TSC is exposed")).

Notable cloud vendors[1] and performance engineers[2] recommend that Xen
users preferentially select tsc over xen-clocksource due to the
performance penalty incurred by the latter.  These articles are
persuasive and tailored to specific use cases.  In order to understand
the tradeoffs around this choice more fully, this author had to
reference the documented[3] complexities around the Xen configuration,
as well as the kernel's clocksource selection algorithm.  Many users may
not attempt this research, and so may never configure the right clock
source in their guest.
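
For illustration only, the manual check that users are left to do today can
be sketched from userspace.  This is a minimal sketch, not part of the patch:
it assumes the standard clocksource sysfs directory, and the helper names
(clocksource_listed, read_sysfs_line, report_clocksource) are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Standard sysfs location of the clocksource controls. */
#define CS_DIR "/sys/devices/system/clocksource/clocksource0"

/*
 * Return 1 if `name` appears as a whole word in the space-separated
 * clocksource list `list`, 0 otherwise.
 */
int clocksource_listed(const char *list, const char *name)
{
	size_t len = strlen(name);
	const char *p = list;

	if (len == 0)
		return 0;

	while ((p = strstr(p, name)) != NULL) {
		int starts_word = (p == list) || (p[-1] == ' ');
		char end = p[len];

		if (starts_word && (end == '\0' || end == ' ' || end == '\n'))
			return 1;
		p++;
	}
	return 0;
}

/* Read the first line of a file into buf; 0 on success, -1 on failure. */
int read_sysfs_line(const char *path, char *buf, size_t size)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Print the clocksource in use and whether tsc is offered at all. */
int report_clocksource(void)
{
	char avail[256], cur[64];

	if (read_sysfs_line(CS_DIR "/available_clocksource", avail, sizeof(avail)) ||
	    read_sysfs_line(CS_DIR "/current_clocksource", cur, sizeof(cur)))
		return -1;

	printf("current: %s, tsc offered: %s\n", cur,
	       clocksource_listed(avail, "tsc") ? "yes" : "no");
	return 0;
}
```

Calling report_clocksource() from main() in a guest shows which clocksource
is active.  Switching still requires root to write "tsc" to
current_clocksource, which is the per-guest tuning step this patch aims to
make unnecessary.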

The approach taken in the kvm-clock module spares users this confusion,
where possible.

Both the Intel SDM[4] and the Xen tsc documentation explain that marking
a tsc as invariant means that it should be considered stable by the OS
and is eligible to be used as a wall clock source.  The Xen documentation
further clarifies that this is only reliable on HVM and PVH because PV
cannot intercept a cpuid instruction.

In order to obtain better out-of-the-box performance, and reduce the
need for user tuning, follow kvm's approach and decrease the xen clock
rating so that tsc is preferable when it is invariant and stable, the
guest is an HVM or PVH domain, and the tsc is not emulated.
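
The effect of lowering the rating can be modelled in a few lines.  This is a
userspace sketch rather than kernel code.  The values 275 and 299 come from
the diff in this mail; 300 for tsc and 400 as the default xen rating are
assumptions taken from a reading of the kernel sources, and the function
names are hypothetical.

```c
#include <stdbool.h>

/* Assumed from the kernel sources, not stated in this patch. */
#define TSC_RATING          300  /* rating of the tsc clocksource */
#define XEN_RATING_DEFAULT  400  /* xen_clocksource before adjustment */

/* Taken from the diff in this mail. */
#define XEN_RATING_DOM0     275  /* initial domain */
#define XEN_RATING_TSC_SAFE 299  /* guest whose tsc passed the checks */

/* Mirror the rating choice made in xen_time_init(). */
int xen_rating(bool initial_domain, bool tsc_safe)
{
	if (initial_domain)
		return XEN_RATING_DOM0;
	if (tsc_safe)
		return XEN_RATING_TSC_SAFE;
	return XEN_RATING_DEFAULT;
}

/*
 * The clocksource core prefers the highest-rated source, so tsc wins
 * only when the xen rating sits below it.
 */
bool tsc_preferred(int xen_rating_value)
{
	return TSC_RATING > xen_rating_value;
}
```

Under these assumptions a rating of 299 is just low enough for tsc to win,
while xen remains the next choice if the tsc is later marked unstable.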

[1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
[2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
[3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
[4] Intel 64 and IA-32 Architectures Software Developer's Manual Volume
    3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC

Signed-off-by: Krister Johansen <kjlx@xxxxxxxxxxxxxxxxxx>
Code-reviewed-by: David Reaver <me@xxxxxxxxxxxxxxx>
---
v2:
  - Use cpuid information to determine if tsc is emulated.  Do not use tsc as
    clocksource if it is. (feedback from Boris Ostrovsky)
  - Move tsc checks into their own helper function
  - Add defines for tsc cpuid flags needed by new helper function.
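
As a rough illustration of the cpuid-based check described in the v2 notes,
the decision on the returned register values can be isolated into a pure
function.  The two defines are copied from the diff below; the function name
and the idea of passing eax and ebx in as parameters are assumptions made so
the logic can be exercised outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the cpuid.h hunk of this patch. */
#define XEN_CPUID_TSC_EMULATED       (1u << 0)
#define XEN_CPUID_TSC_MODE_NOEMULATE (2u)

/*
 * eax and ebx are the values returned by the Xen tsc cpuid leaf
 * (xen_cpuid_base() + 3): eax carries the flag bits, ebx the tsc mode.
 * Return true only when the tsc is not emulated and the configured mode
 * rules out emulation.
 */
bool xen_tsc_leaf_allows_tsc(uint32_t eax, uint32_t ebx)
{
	if (eax & XEN_CPUID_TSC_EMULATED)
		return false;

	return ebx == XEN_CPUID_TSC_MODE_NOEMULATE;
}
```

Note that the default mode (0) is rejected even when the emulated bit is
clear, which matches the conservative check in the helper added by this
patch.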
---
 arch/x86/include/asm/xen/cpuid.h |  6 +++++
 arch/x86/xen/time.c              | 43 +++++++++++++++++++++++++++++++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/xen/cpuid.h b/arch/x86/include/asm/xen/cpuid.h
index 6daa9b0c8d11..d9d7432481e9 100644
--- a/arch/x86/include/asm/xen/cpuid.h
+++ b/arch/x86/include/asm/xen/cpuid.h
@@ -88,6 +88,12 @@
  *             EDX: shift amount for tsc->ns conversion
  * Sub-leaf 2: EAX: host tsc frequency in kHz
  */
+#define XEN_CPUID_TSC_EMULATED       (1u << 0)
+#define XEN_CPUID_HOST_TSC_RELIABLE  (1u << 1)
+#define XEN_CPUID_RDTSCP_INSTR_AVAIL (1u << 2)
+#define XEN_CPUID_TSC_MODE_DEFAULT   (0)
+#define XEN_CPUID_TSC_MODE_EMULATE   (1u)
+#define XEN_CPUID_TSC_MODE_NOEMULATE (2u)
 
 /*
  * Leaf 5 (0x40000x04)
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 9ef0a5cca96e..4100b1c3f38d 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -20,6 +20,7 @@
 #include <asm/pvclock.h>
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
+#include <asm/xen/cpuid.h>
 
 #include <xen/events.h>
 #include <xen/features.h>
@@ -474,15 +475,55 @@ static void xen_setup_vsyscall_time_info(void)
        xen_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK;
 }
 
+/*
+ * Check if it is possible to safely use the tsc as a clocksource.  This is only
+ * true if the domain is HVM or PVH, the hypervisor notifies the guest that its
+ * tsc is invariant, and the tsc instruction is not going to be emulated.
+ */
+static int __init xen_tsc_safe_clocksource(void)
+{
+       u32 eax, ebx, ecx, edx;
+
+       if (!(xen_hvm_domain() || xen_pvh_domain()))
+               return 0;
+
+       if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
+               return 0;
+
+       if (!(boot_cpu_has(X86_FEATURE_NONSTOP_TSC)))
+               return 0;
+
+       if (check_tsc_unstable())
+               return 0;
+
+       cpuid(xen_cpuid_base() + 3, &eax, &ebx, &ecx, &edx);
+
+       if (eax & XEN_CPUID_TSC_EMULATED)
+               return 0;
+
+       if (ebx != XEN_CPUID_TSC_MODE_NOEMULATE)
+               return 0;
+
+       return 1;
+}
+
 static void __init xen_time_init(void)
 {
        struct pvclock_vcpu_time_info *pvti;
        int cpu = smp_processor_id();
        struct timespec64 tp;
 
-       /* As Dom0 is never moved, no penalty on using TSC there */
+       /*
+        * As Dom0 is never moved, no penalty on using TSC there.
+        *
+        * If it is possible for the guest to determine that the tsc is a safe
+        * clocksource, then set xen_clocksource rating below that of the tsc so
+        * that the system prefers tsc instead.
+        */
        if (xen_initial_domain())
                xen_clocksource.rating = 275;
+       else if (xen_tsc_safe_clocksource())
+               xen_clocksource.rating = 299;
 
        clocksource_register_hz(&xen_clocksource, NSEC_PER_SEC);
 
-- 
2.25.1