[Xen-devel] Alternate p2m design specification

This document describes a new capability for VM introspection, security and privacy in Xen. The new capability, called "altp2m" (short for Alternate p2m), allows Xen to host alternate guest physical memory domains for a specific guest domain. This document describes the overall design specific to Xen for your review and feedback.

Background
==========

Intel VT-x2 CPUs support Extended Page Tables (EPTs). Extended Page Tables allow the VMM to restrict permissions for guest physical pages accessed by software operating in the guest (VMX non-root). The p2m capability in Xen abstracts the architecture-specific details of EPTs. Typically, Xen manages a single p2m for a specific guest domain.

ALTP2M Introduction
===================

The altp2m capability enables management of multiple (alternate) p2ms per HVM guest domain, thus allowing for separate physical memory domains per guest. It allows a para-virtualized guest software agent, within or across domains, to enforce memory introspection policies efficiently. Altp2m also allows para-virtualized guest agent components to be isolated within an HVM (in terms of guest physical memory) for secure VM introspection as well as various other security and privacy usages that require efficient memory isolation.

Two related Intel CPU features are used as performance enhancements within the altp2m module when operating with an in-domain agent. The altp2m module opportunistically uses these assists when they are enumerated on the CPU. Operations that require frequent switching between p2m domains can incur a high overhead if done via legacy approaches such as a hypercall.

VM Functions (VMFUNC) is a new VT-x instruction on Intel's 4th gen Core (Haswell) and Atom (Silvermont) CPUs.
In general, VMFUNC is targeted to reduce the overhead of services provided by the CPU to an HVM guest (once configured by the VMM); one such leaf (0) is defined to provide a low-latency p2m switching capability (EPTP switching in Intel terminology). VMFUNC leaf 0 is enabled as part of the altp2m functionality to allow para-virtualized agents in an HVM to apply custom p2m domain switching policies without incurring overheads due to VM exits.

#VE (Virtualization Exception) is a feature introduced on Intel's 5th gen Core (Broadwell) and Atom (Goldmont) CPUs. #VE is a CPU assist defined to allow the VMM to convert EPT violations for specific guest physical page accesses to a guest-IDT-delivered exception (new vector 20), and thus reduce the latency for managing VM introspection policies for guest memory read, write and/or execute attempts. These are induced events configured by a para-virtualized security agent monitoring guest memory accesses based on its isolation/monitoring policies. In legacy (pre-#VE) CPUs, EPT violations require a VM exit, and frequent induced EPT violations can add high hypervisor overhead. #VE reduces the impact of this overhead, whilst reducing the amount of guest-specific policy context to be inserted into the VMM.

Both VMFUNC and #VE are designed such that a VMM can emulate them on legacy CPUs. The altp2m module includes full emulation of VMFUNC leaf 0 and #VE, so in-domain agents can be written to assume both capabilities are available on all hardware.

VMFUNC Introduction
===================

VMFUNC leaf 0 (EPTP switching) is an efficient, hardware-assisted way to switch between EPTs configured by the VMM. Software in a Xen guest domain may invoke a VM function with the VMFUNC instruction; the value of EAX selects the specific VM function being invoked. The VMM enables VM functions generally by setting the "enable VM functions" VM-execution control. A specific VM function is enabled by setting the corresponding VM-function control.
When software wants to enable EPTP switching (VM function 0), it must set the "activate secondary controls" VM-execution control (bit 31 of the primary processor-based VM-execution controls), the "enable VM functions" VM-execution control (bit 13 of the secondary processor-based VM-execution controls) and the "EPTP switching" VM-function control (bit 0 of the VM-function controls).

The VMFUNC instruction causes an invalid-opcode exception (#UD) if the "enable VM functions" VM-execution control is 0 or the value of EAX is greater than 63 (only VM functions 0-63 can be enabled). Otherwise, the instruction causes a VM exit if the bit at position EAX is 0 in the VM-function controls (the selected VM function is not enabled). If such a VM exit occurs, the basic exit reason used is 59 (3BH), indicating "VMFUNC", and the length of the VMFUNC instruction is saved into the VM-exit instruction-length field. If the instruction causes neither an invalid-opcode exception nor a VM exit due to a disabled VM function, it performs the functionality of the VM function specified by the value in EAX.

VMFUNC leaf 0 (EPTP switching) allows guest software to load a new value for the EPT pointer (EPTP), thereby establishing a different EPT paging-structure hierarchy. Guest software is limited to selecting from a list of potential EPTP values configured in advance by the VMM. Specifically, the value of ECX is used to select an entry from an EPTP list, a 4-KByte structure referenced by the EPTP-list address (a new control field in the VMCS). VMFUNC causes a VM exit for error conditions such as ECX ≥ 512. If the selected entry is a valid EPTP value (i.e. the EPTP would not cause VM entry to fail), it is stored in the EPTP field of the current VMCS and is used for subsequent accesses using guest-physical addresses.
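
As a rough illustration of the checks described above, the following C sketch models the outcome of a VMFUNC invocation. It is not Xen or processor code: the struct, enum and function names are hypothetical, only leaf 0 is modelled (any other enabled leaf is simply treated as a VM exit), and the EPTP validity test is a stand-in (a zero entry is treated as invalid) for the real "would not cause VM entry to fail" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Control bit positions quoted in the text above. */
#define ACTIVATE_SECONDARY_CONTROLS (1u << 31)  /* primary controls, bit 31 */
#define ENABLE_VM_FUNCTIONS         (1u << 13)  /* secondary controls, bit 13 */
#define EPTP_SWITCHING              (1ull << 0) /* VM-function controls, bit 0 */

#define EPTP_LIST_ENTRIES  512  /* 4-KByte list of 8-byte EPTP values */
#define EXIT_REASON_VMFUNC 59   /* 3BH, reported on a VMFUNC VM exit */

/* Hypothetical outcome codes for this model. */
enum vmfunc_result { VMFUNC_OK, VMFUNC_UD, VMFUNC_VMEXIT };

/* Hypothetical, simplified view of the VMCS fields involved. */
struct vmcs_model {
    uint32_t primary_ctls;
    uint32_t secondary_ctls;
    uint64_t vm_function_ctls;
    const uint64_t *eptp_list;  /* EPTP_LIST_ENTRIES entries */
    uint64_t eptp;              /* current EPT pointer */
};

/* Stand-in validity check (assumption): a zero entry is invalid. */
bool eptp_is_valid(uint64_t eptp)
{
    return eptp != 0;
}

/* Model of VMFUNC with the given EAX (leaf) and ECX (EPTP list index). */
enum vmfunc_result vmfunc_model(struct vmcs_model *v, uint32_t eax, uint32_t ecx)
{
    assert(v != NULL && v->eptp_list != NULL);

    bool vm_functions_on = (v->primary_ctls & ACTIVATE_SECONDARY_CONTROLS) &&
                           (v->secondary_ctls & ENABLE_VM_FUNCTIONS);

    /* #UD: VM functions disabled, or leaf number out of range. */
    if (!vm_functions_on || eax > 63)
        return VMFUNC_UD;

    /* VM exit: the selected VM function is not enabled. */
    if (!((v->vm_function_ctls >> eax) & 1))
        return VMFUNC_VMEXIT;

    /* Only leaf 0 is modelled here; treat anything else as a VM exit. */
    if (eax != 0)
        return VMFUNC_VMEXIT;

    /* VM exit: index beyond the list, or the entry is not a valid EPTP. */
    if (ecx >= EPTP_LIST_ENTRIES || !eptp_is_valid(v->eptp_list[ecx]))
        return VMFUNC_VMEXIT;

    /* Success: the selected entry becomes the current EPTP. */
    v->eptp = v->eptp_list[ecx];
    return VMFUNC_OK;
}
```

In this model, a call such as vmfunc_model(&v, 0, 3) with all three controls set and a non-zero entry at index 3 switches v.eptp to that entry, while an index of 512 or more leaves the EPTP unchanged and reports a VM exit.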

The complete spec of VMFUNC can be found in chapter 25.5.5 of the Intel SDM at:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

#VE Introduction
================

A virtualization exception is a new processor exception. It uses vector 20 and is abbreviated #VE. A virtualization exception can occur only in VMX non-root operation. The 1-setting of the "EPT-violation #VE" VM-execution control causes some EPT violations to generate virtualization exceptions instead of VM exits. The VMM manages how the processor determines whether an EPT violation causes a virtualization exception or a VM exit.

When the processor encounters a virtualization exception, it saves information about the exception to the virtualization-exception information area (hosted in a 4-KByte page referenced by a new field in the VMCS). After saving virtualization-exception information, the processor delivers a virtualization exception as it would any other exception.

The values of certain EPT paging-structure entries determine which EPT violations are convertible. Specifically, bit 63 of certain EPT paging-structure entries is defined to suppress #VE; effectively, an EPT violation is convertible to #VE if and only if bit 63 of the EPT entry that caused the EPT violation is 0. Note that EPT misconfiguration behavior does not change: EPT misconfigurations always cause VM exits.

The complete spec of #VE can be found in chapter 25.5.6 of the Intel SDM at:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

With VMFUNC and #VE, the Xen hypervisor does not have to be involved in handling guest VM-introspection policies, which reduces hypervisor overhead and complexity (TCB) and should work well with VM migration. For a guest domain using VMFUNC and #VE, more CPU cycles can be allocated to the guest, so benchmarks in a guest domain using VM introspection with VMFUNC and #VE enabled are expected to perform better than on a CPU without VMFUNC/#VE.
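
As a small illustration of the #VE conversion rule described in this section, the C sketch below classifies an EPT fault as either a guest-delivered #VE or a VM exit. The enum and function names are hypothetical, and the model deliberately covers only the conditions stated in this document (the "EPT-violation #VE" control, bit 63 of the faulting EPT entry, and the misconfiguration case); any further conditions the processor applies are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPT_SUPPRESS_VE (1ull << 63) /* bit 63 of the EPT entry */
#define VE_VECTOR       20           /* exception vector used by #VE */

/* Hypothetical names for the two kinds of EPT fault. */
enum ept_fault_kind { EPT_VIOLATION, EPT_MISCONFIG };

/* Hypothetical names for how the fault is reported. */
enum ept_fault_outcome { DELIVER_VE, DELIVER_VMEXIT };

/*
 * Decide how an EPT fault is reported, given the state of the
 * "EPT-violation #VE" control and the EPT entry that caused the fault.
 */
enum ept_fault_outcome classify_ept_fault(bool ve_control_enabled,
                                          enum ept_fault_kind kind,
                                          uint64_t ept_entry)
{
    assert(kind == EPT_VIOLATION || kind == EPT_MISCONFIG);

    /* EPT misconfigurations always cause VM exits. */
    if (kind == EPT_MISCONFIG)
        return DELIVER_VMEXIT;

    /* Without the control set, every EPT violation is a VM exit. */
    if (!ve_control_enabled)
        return DELIVER_VMEXIT;

    /* Convertible to #VE if and only if suppress-#VE (bit 63) is 0. */
    return (ept_entry & EPT_SUPPRESS_VE) ? DELIVER_VMEXIT : DELIVER_VE;
}
```

With the control enabled, a violation on an entry whose bit 63 is clear is reported to the guest on vector 20, while the same violation on an entry with bit 63 set still causes a VM exit.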

Design
======

- Altp2m feature enabled via opt-in parameter

A new Xen boot parameter, 'altp2m', is introduced to control altp2m on a global basis; this parameter defaults to 0 (disabled).

- Altp2m enable/disable for particular domain

Additionally, a new domain parameter, 'altp2mhvm', is introduced to control altp2m for an individual HVM domain; this parameter also defaults to 0 (disabled). Both parameters must be set to 1 (enabled) before altp2m functionality is available in a given domain. At any point in time, altp2m is either enabled for all vcpus of a domain or disabled for all vcpus of that domain. Alternate EPT tables created for the alternate p2m are shared by all vcpus assigned to a domain. Altp2m mode may be dynamically enabled/disabled for a domain.

- Hypercalls for altp2m

Altp2m mode introduces a new set of hypercalls for altp2m management from software agents operating in Xen HVM guests. The hypercalls are as follows:

    Enable or disable altp2m mode for a domain
    Create a new alternate p2m
    Edit permissions for a specific GPA within an alternate p2m
    Destroy an existing alternate p2m

- Core altp2m functionality

A new altp2m type is added to the p2m types (in addition to the previous hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode and switched over into altp2m mode via a hypercall. Once an HVM domain is in altp2m mode, a fixed-size set of altp2m objects (currently 10) is managed by Xen. Altp2m updates are performed in a lazy manner: in effect, the altp2m reflects the same EPT attributes for accessed mappings as the hostp2m unless the permissions for a GPA are modified by the guest agent (for a specific altp2m). Currently, only page permissions and mappings for memory of type ram_rw can be modified via the altp2m hypercall.

By default, all GPA mappings are set to suppress #VE (resulting in legacy behavior for Xen); #VE is un-suppressed for a GPA when the in-domain guest agent invokes an altp2m hypercall to modify the permission of a GPA.
A subsequent guest access to the GPA that violates the agent-specified EPT permissions will cause a #VE (instead of an EPT violation) that is expected to be handled by the guest software. One of the valid responses to a #VE event in the guest is to switch altp2ms to activate a different set of GPA permissions and mappings. Using VMFUNC, this switch can be achieved efficiently for the single vcpu on which the permissions violation occurs. There is also a hypercall to switch altp2ms for every vcpu in a domain, as is typically required during agent initialisation.

The list of altp2ms is protected by a separate list lock, which must be held during any operation that could change the state of an altp2m from valid to invalid or vice versa, or when performing any modification to an altp2m which is not the current p2m for the current vcpu. Many operations that must acquire the altp2m list lock occur in code paths where the hostp2m lock has already been acquired. To avoid locking order violations, the p2m lock has been split into two types: altp2ms have a lock type which is lower in the order than other p2ms, and the altp2m list lock is placed between the two.

- VMExit handler for VMFUNC

When altp2m is enabled on a CPU with VMFUNC enumerated, an erroneous VMFUNC may cause a VM exit with exit reason "VMFUNC". A new exit handler is added for this exit reason, which injects #UD into the guest.

- Support for intra-domain and inter-domain VM introspection (and XSM)

The altp2m functionality can be used either via an agent operating in an HVM guest or via an agent operating in a separate privileged domain. For cross-domain operation, an XSM hook is defined so that the administrator can define a policy for inter-domain VM introspection.

The way in which permissions violations are reported to an in-domain agent and the expected agent response have been described above.
Restrictions imposed by an out-of-domain agent do not have suppress-#VE removed, so they always result in a VM exit. The violation is reported through the existing VM Event mechanism, modified to indicate that the event is an altp2m event and to include the current altp2m index; the response can force a change to a different altp2m for the relevant vcpu before VM entry.

If an in-domain agent places an altp2m restriction and a violation of that restriction occurs on a vcpu that cannot receive #VE, this will cause a VM exit that is treated as if the restriction had been imposed by an out-of-domain agent.

---------------- END --------------------------