|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v2 3/3] x86/vmx: implement Notify VM Exit
+Chenyi/Xiaoyao who worked on the KVM support. Presumably
similar opens have been discussed in KVM hence they have the
right background to comment here.
> From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Sent: Thursday, May 26, 2022 7:12 PM
>
> Under certain conditions guests can get the CPU stuck in an unbounded
> loop without the possibility of an interrupt window to occur on
> instruction boundary. This was the case with the scenarios described
> in XSA-156.
>
> Make use of the Notify VM Exit mechanism, that will trigger a VM Exit
> if no interrupt window occurs for a specified amount of time. Note
> that using the Notify VM Exit avoids having to trap #AC and #DB
> exceptions, as Xen is guaranteed to get a VM Exit even if the guest
> puts the CPU in a loop without an interrupt window, as such disable
> the intercepts if the feature is available and enabled.
>
> Setting the notify VM exit window to 0 is safe because there's a
> threshold added by the hardware in order to have a sane window value.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Changes since v1:
> - Properly update debug state when using notify VM exit.
> - Reword commit message.
> ---
> This change enables the notify VM exit by default, KVM however doesn't
> seem to enable it by default, and there's the following note in the
> commit message:
>
> "- There's a possibility, however small, that a notify VM exit happens
> with VM_CONTEXT_INVALID set in exit qualification. In this case, the
> vcpu can no longer run. To avoid killing a well-behaved guest, set
> notify window as -1 to disable this feature by default."
>
> It's not obviously clear to me whether the comment was meant to be:
> "There's a possibility, however small, that a notify VM exit _wrongly_
> happens with VM_CONTEXT_INVALID".
>
> It's also not clear whether such wrong hardware behavior only affects
> a specific set of hardware, in a way that we could avoid enabling
> notify VM exit there.
>
> There's a discussion in one of the Linux patches that 128K might be
> the safer value in order to prevent false positives, but I have no
> formal confirmation about this. Maybe our Intel maintainers can
> provide some more feedback on a suitable notify VM exit window
> value.
>
> I've tested with 0 (the proposed default in the patch) and I don't
> seem to be able to trigger notify VM exits under normal guest
> operation. Note that even in that case the guest won't be destroyed
> unless the context is corrupt.
> ---
> docs/misc/xen-command-line.pandoc | 11 +++++++++
> xen/arch/x86/hvm/vmx/vmcs.c | 19 +++++++++++++++
> xen/arch/x86/hvm/vmx/vmx.c | 32 +++++++++++++++++++++++--
> xen/arch/x86/include/asm/hvm/vmx/vmcs.h | 4 ++++
> xen/arch/x86/include/asm/hvm/vmx/vmx.h | 6 +++++
> xen/arch/x86/include/asm/perfc_defn.h | 3 ++-
> 6 files changed, 72 insertions(+), 3 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-
> command-line.pandoc
> index 1dc7e1ca07..ccf8bf5806 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2544,6 +2544,17 @@ guest will notify Xen that it has failed to acquire a
> spinlock.
> <major>, <minor> and <build> must be integers. The values will be
> encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.
>
> +### vm-notify-window (Intel)
> +> `= <integer>`
> +
> +> Default: `0`
> +
> +Specify the value of the VM Notify window used to detect locked VMs. Set
> to -1
> +to disable the feature. Value is in units of crystal clock cycles.
> +
> +Note the hardware might add a threshold to the provided value in order to
> make
> +it safe, and hence using 0 is fine.
> +
> ### vpid (Intel)
> > `= <boolean>`
>
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index d388e6729c..6cb2c6c6b7 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -67,6 +67,9 @@ integer_param("ple_gap", ple_gap);
> static unsigned int __read_mostly ple_window = 4096;
> integer_param("ple_window", ple_window);
>
> +static unsigned int __ro_after_init vm_notify_window;
> +integer_param("vm-notify-window", vm_notify_window);
> +
> static bool __read_mostly opt_ept_pml = true;
> static s8 __read_mostly opt_ept_ad = -1;
> int8_t __read_mostly opt_ept_exec_sp = -1;
> @@ -210,6 +213,7 @@ static void __init vmx_display_features(void)
> P(cpu_has_vmx_pml, "Page Modification Logging");
> P(cpu_has_vmx_tsc_scaling, "TSC Scaling");
> P(cpu_has_vmx_bus_lock_detection, "Bus Lock Detection");
> + P(cpu_has_vmx_notify_vm_exiting, "Notify VM Exit");
> #undef P
>
> if ( !printed )
> @@ -329,6 +333,8 @@ static int vmx_init_vmcs_config(bool bsp)
> opt |= SECONDARY_EXEC_UNRESTRICTED_GUEST;
> if ( opt_ept_pml )
> opt |= SECONDARY_EXEC_ENABLE_PML;
> + if ( vm_notify_window != ~0u )
> + opt |= SECONDARY_EXEC_NOTIFY_VM_EXITING;
>
> /*
> * "APIC Register Virtualization" and "Virtual Interrupt Delivery"
> @@ -1333,6 +1339,19 @@ static int construct_vmcs(struct vcpu *v)
> rc = vmx_add_msr(v, MSR_FLUSH_CMD, FLUSH_CMD_L1D,
> VMX_MSR_GUEST_LOADONLY);
>
> + if ( cpu_has_vmx_notify_vm_exiting )
> + {
> + __vmwrite(NOTIFY_WINDOW, vm_notify_window);
> + /*
> + * Disable #AC and #DB interception: by using VM Notify Xen is
> + * guaranteed to get a VM exit even if the guest manages to lock the
> + * CPU.
> + */
> + v->arch.hvm.vmx.exception_bitmap &= ~((1U << TRAP_debug) |
> + (1U << TRAP_alignment_check));
> + vmx_update_exception_bitmap(v);
> + }
> +
> out:
> vmx_vmcs_exit(v);
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 69980c8e31..d3c1597b3e 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1419,10 +1419,19 @@ static void cf_check vmx_update_host_cr3(struct
> vcpu *v)
>
> void vmx_update_debug_state(struct vcpu *v)
> {
> + unsigned int mask = 1u << TRAP_int3;
> +
> + if ( !cpu_has_monitor_trap_flag && cpu_has_vmx_notify_vm_exiting )
> + /*
> + * Only allow toggling TRAP_debug if notify VM exit is enabled, as
> + * unconditionally setting TRAP_debug is part of the XSA-156 fix.
> + */
> + mask |= 1u << TRAP_debug;
> +
> if ( v->arch.hvm.debug_state_latch )
> - v->arch.hvm.vmx.exception_bitmap |= 1U << TRAP_int3;
> + v->arch.hvm.vmx.exception_bitmap |= mask;
> else
> - v->arch.hvm.vmx.exception_bitmap &= ~(1U << TRAP_int3);
> + v->arch.hvm.vmx.exception_bitmap &= ~mask;
>
> vmx_vmcs_enter(v);
> vmx_update_exception_bitmap(v);
> @@ -4155,6 +4164,9 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
> switch ( vector )
> {
> case TRAP_debug:
> + if ( cpu_has_monitor_trap_flag && cpu_has_vmx_notify_vm_exiting )
> + goto exit_and_crash;
> +
> /*
> * Updates DR6 where debugger can peek (See 3B 23.2.1,
> * Table 23-1, "Exit Qualification for Debug Exceptions").
> @@ -4593,6 +4605,22 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
> */
> break;
>
> + case EXIT_REASON_NOTIFY:
> + __vmread(EXIT_QUALIFICATION, &exit_qualification);
> +
> + if ( exit_qualification & NOTIFY_VM_CONTEXT_INVALID )
> + {
> + perfc_incr(vmnotify_crash);
> + gprintk(XENLOG_ERR, "invalid VM context after notify vmexit\n");
> + domain_crash(v->domain);
> + break;
> + }
> +
> + if ( unlikely(exit_qualification &
> INTR_INFO_NMI_UNBLOCKED_BY_IRET) )
> + undo_nmis_unblocked_by_iret();
> +
> + break;
> +
> case EXIT_REASON_VMX_PREEMPTION_TIMER_EXPIRED:
> case EXIT_REASON_INVPCID:
> /* fall through */
> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> index 5d3edc1642..0961eabf3f 100644
> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> @@ -267,6 +267,7 @@ extern u32 vmx_vmentry_control;
> #define SECONDARY_EXEC_XSAVES 0x00100000
> #define SECONDARY_EXEC_TSC_SCALING 0x02000000
> #define SECONDARY_EXEC_BUS_LOCK_DETECTION 0x40000000
> +#define SECONDARY_EXEC_NOTIFY_VM_EXITING 0x80000000
> extern u32 vmx_secondary_exec_control;
>
> #define VMX_EPT_EXEC_ONLY_SUPPORTED 0x00000001
> @@ -348,6 +349,8 @@ extern u64 vmx_ept_vpid_cap;
> (vmx_secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
> #define cpu_has_vmx_bus_lock_detection \
> (vmx_secondary_exec_control &
> SECONDARY_EXEC_BUS_LOCK_DETECTION)
> +#define cpu_has_vmx_notify_vm_exiting \
> + (vmx_secondary_exec_control &
> SECONDARY_EXEC_NOTIFY_VM_EXITING)
>
> #define VMCS_RID_TYPE_MASK 0x80000000
>
> @@ -455,6 +458,7 @@ enum vmcs_field {
> SECONDARY_VM_EXEC_CONTROL = 0x0000401e,
> PLE_GAP = 0x00004020,
> PLE_WINDOW = 0x00004022,
> + NOTIFY_WINDOW = 0x00004024,
> VM_INSTRUCTION_ERROR = 0x00004400,
> VM_EXIT_REASON = 0x00004402,
> VM_EXIT_INTR_INFO = 0x00004404,
> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> index bc0caad6fb..e429de8541 100644
> --- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> @@ -221,6 +221,7 @@ static inline void pi_clear_sn(struct pi_desc *pi_desc)
> #define EXIT_REASON_XSAVES 63
> #define EXIT_REASON_XRSTORS 64
> #define EXIT_REASON_BUS_LOCK 74
> +#define EXIT_REASON_NOTIFY 75
> /* Remember to also update VMX_PERF_EXIT_REASON_SIZE! */
>
> /*
> @@ -236,6 +237,11 @@ static inline void pi_clear_sn(struct pi_desc *pi_desc)
> #define INTR_INFO_VALID_MASK 0x80000000 /* 31 */
> #define INTR_INFO_RESVD_BITS_MASK 0x7ffff000
>
> +/*
> + * Exit Qualifications for NOTIFY VM EXIT
> + */
> +#define NOTIFY_VM_CONTEXT_INVALID 1u
> +
> /*
> * Exit Qualifications for MOV for Control Register Access
> */
> diff --git a/xen/arch/x86/include/asm/perfc_defn.h
> b/xen/arch/x86/include/asm/perfc_defn.h
> index d6eb661940..c6b601b729 100644
> --- a/xen/arch/x86/include/asm/perfc_defn.h
> +++ b/xen/arch/x86/include/asm/perfc_defn.h
> @@ -6,7 +6,7 @@ PERFCOUNTER_ARRAY(exceptions, "exceptions", 32)
>
> #ifdef CONFIG_HVM
>
> -#define VMX_PERF_EXIT_REASON_SIZE 75
> +#define VMX_PERF_EXIT_REASON_SIZE 76
> #define VMEXIT_NPF_PERFC 143
> #define SVM_PERF_EXIT_REASON_SIZE (VMEXIT_NPF_PERFC + 1)
> PERFCOUNTER_ARRAY(vmexits, "vmexits",
> @@ -126,5 +126,6 @@ PERFCOUNTER(realmode_exits, "vmexits from
> realmode")
> PERFCOUNTER(pauseloop_exits, "vmexits from Pause-Loop Detection")
>
> PERFCOUNTER(buslock, "Bus Locks Detected")
> +PERFCOUNTER(vmnotify_crash, "domains crashed by Notify VM Exit")
>
> /*#endif*/ /* __XEN_PERFC_DEFN_H__ */
> --
> 2.36.0
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |