
Re: [Xen-devel] Cache Allocation Technology(CAT) design for XEN

On 12/12/14 12:27, Chao Peng wrote:
> Hi all, we plan to bring Intel CAT into XEN. This is the initial
> design for that. Comments/suggestions are welcome.
> Background
> ==========
> Traditionally, all Virtual Machines ("VMs") share the same set of system
> cache resources. There is no hardware support to control the allocation
> or availability of cache resources to individual VMs. The lack of such a
> partitioning mechanism makes cache utilization inefficient across
> different types of VMs, even as more and more cache becomes available on
> modern server platforms.
> With the introduction of Intel Cache Allocation Technology ("CAT"), the
> Virtual Machine Monitor ("VMM") now has the ability to partition cache
> allocation per VM, based on the priority of each VM.
> CAT Introduction
> ================
> Generally speaking, CAT introduces a mechanism for software to enable
> cache allocation based on application priority or Class of Service
> ("COS"). Cache allocation for the respective applications is then
> restricted based on the COS with which they are associated. Each COS can
> be configured using capacity bitmasks ("CBM") which represent cache
> capacity and indicate the degree of overlap and isolation between
> classes. For each logical processor, a register (the IA32_PQR_ASSOC MSR)
> is exposed to allow the OS/VMM to specify a COS when an application,
> thread or VM is scheduled. Cache allocation for the indicated
> application/thread/VM is then controlled automatically by the hardware,
> based on the COS and the CBM associated with that class. Hardware
> initializes the COS of each logical processor to 0 and sets the
> corresponding CBM to all ones, meaning every application can use the
> full system cache.
> For more information please refer to Section 17.15 in the Intel SDM [1].
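Not part of the original note, but to make the CBM semantics concrete: the SDM requires a capacity bitmask to be non-zero, to fit within the supported mask length, and to have its set bits contiguous. A minimal sketch of that check (the function and parameter names are illustrative, not from Xen):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helper: check that a CBM is valid for a given mask length.
 * Per the SDM, a CBM must be non-zero, fit in cbm_len bits, and its set
 * bits must be contiguous (no "holes"). */
static bool cbm_is_valid(uint64_t cbm, unsigned int cbm_len)
{
    uint64_t max_mask = (cbm_len < 64) ? ((1ULL << cbm_len) - 1) : ~0ULL;

    if (cbm == 0 || (cbm & ~max_mask))
        return false;

    /* Strip trailing zeros; the remainder must then be of the form
     * 2^n - 1, i.e. a single contiguous run of set bits. */
    while (!(cbm & 1))
        cbm >>= 1;

    return (cbm & (cbm + 1)) == 0;
}
```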
> Design Overview
> ===============
> - Domain COS/CBM association
> When enforcing cache allocation for VMs, the minimum granularity is
> defined as the domain. All Virtual CPUs ("VCPUs") of a domain have the
> same COS and therefore correspond to the same CBM. COS is used only
> within the hypervisor and is transparent to the tool stack and users.
> The system administrator can specify the initial CBM for each domain,
> or change it at runtime, via the tool stack. The hypervisor then
> chooses a free COS to associate with that CBM, or reuses an existing
> COS that already has the same CBM.
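As a sketch of the "reuse matching COS, else claim a free one" step above (the table layout, names, and refcounting are assumptions for illustration, not the proposed Xen data structures):

```c
#include <stdint.h>

#define INVALID_COS (-1)

/* Hypothetical per-socket COS table: cos_to_cbm[i] holds the CBM
 * programmed for COS i; ref[i] counts domains using COS i.  A COS with a
 * zero refcount is free.  COS 0 is reserved as the all-ones default. */
struct cos_table {
    unsigned int nr_cos;       /* max COS supported on this socket */
    uint64_t cos_to_cbm[64];
    unsigned int ref[64];
};

/* Reuse an existing COS whose CBM already matches, otherwise claim the
 * first free one; return INVALID_COS if the table is exhausted. */
static int pick_cos(struct cos_table *t, uint64_t cbm)
{
    int free_cos = INVALID_COS;

    for (unsigned int cos = 1; cos < t->nr_cos; cos++) {
        if (t->ref[cos] == 0) {
            if (free_cos == INVALID_COS)
                free_cos = cos;
            continue;
        }
        if (t->cos_to_cbm[cos] == cbm) {
            t->ref[cos]++;
            return cos;          /* share the existing COS */
        }
    }

    if (free_cos != INVALID_COS) {
        t->cos_to_cbm[free_cos] = cbm;
        t->ref[free_cos] = 1;    /* the CBM MSR would be programmed here */
    }
    return free_cos;
}
```

Sharing a COS between domains with identical CBMs matters because the number of COS values is a scarce hardware resource.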
> - VCPU Schedule
> When a VCPU is scheduled on a physical CPU ("PCPU"), its COS value is
> written to the PCPU's MSR (IA32_PQR_ASSOC) to notify the hardware to
> use the new COS. Cache allocation is then enforced by the hardware.
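To illustrate the MSR write above: per the SDM, IA32_PQR_ASSOC carries the RMID (used by CMT) in its low bits and the COS in bits 63:32, so the context-switch path composes a value along these lines (a sketch; `pqr_assoc_val` is an assumed helper name, not a Xen function):

```c
#include <stdint.h>

#define MSR_IA32_PQR_ASSOC  0x0C8F

/* IA32_PQR_ASSOC layout per the SDM: RMID in the low bits (used by CMT),
 * COS in bits 63:32.  On context switch the hypervisor would preserve the
 * RMID field and replace only the COS field. */
static uint64_t pqr_assoc_val(uint32_t rmid, uint32_t cos)
{
    return ((uint64_t)cos << 32) | rmid;
}
```

The real code path would then do the equivalent of `wrmsr(MSR_IA32_PQR_ASSOC, pqr_assoc_val(rmid, cos))`.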
> - Multi-Socket
> In a multi-socket environment, each VCPU may be scheduled on different
> sockets, and the hardware CAT capability (such as the maximum supported
> COS and the CBM length) may differ between sockets. For such systems, a
> per-socket COS/CBM configuration is specified for each domain, and the
> hypervisor uses this per-socket CBM information when scheduling VCPUs.
> Implementation Description
> ==========================
> One principle of this design is to implement only the cache enforcement
> mechanism in the hypervisor, leaving the cache allocation policy to the
> user-space tool stack. This way, complex governors can be implemented
> in the tool stack.
> In summary, hypervisor changes include:
> 1) A new field "cat_info" in the domain structure to indicate the CAT
>    information for each socket. It points to an array of structures:
>    struct domain_socket_cat_info {
>        unsigned int cbm; /* CBM specified by toolstack  */
>        unsigned int cos; /* COS allocated by Hypervisor */
>    };
> 2) A new SYSCTL to expose the CAT information to tool stack:
>      * Whether CAT is enabled;
>      * Max COS supported;
>      * Length of CBM;
>      * Other needed information from host cpuid;
> 3) A new DOMCTL to allow tool stack to set/get CBM for a specified domain
>    for each socket.
> 4) Context switch: write COS of domain to MSR (IA32_PQR_ASSOC) of PCPU.
> 5) XSM policy to restrict the functions visibility to control domain only.
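For the SYSCTL in item 2, one possible payload layout mirroring the fields enumerated above might look as follows (field names and layout are assumptions for illustration, not the final ABI):

```c
#include <stdint.h>

/* Illustrative only: a possible XEN_SYSCTL_PSR_CAT_INFO_GET payload,
 * mirroring the information the design says must be exposed to the tool
 * stack.  Names and layout are assumed, not the proposed interface. */
struct xen_sysctl_psr_cat_info {
    uint32_t enabled;     /* whether CAT is enabled */
    uint32_t cos_max;     /* maximum COS supported */
    uint32_t cbm_len;     /* length of the capacity bitmask, in bits */
    uint32_t socket;      /* socket this information applies to */
};
```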
> Hypervisor interfaces:
> 1) Boot line param: "psr=cat" to enable the feature.
> 2) SYSCTL: XEN_SYSCTL_psr_cat_op
>      - XEN_SYSCTL_PSR_CAT_INFO_GET: Get system CAT information;
> 3) DOMCTL: XEN_DOMCTL_psr_cat_op
>      - XEN_DOMCTL_PSR_CAT_OP_CBM_SET: Set CBM value for a domain.
>      - XEN_DOMCTL_PSR_CAT_OP_CBM_GET: Get CBM value for a domain.
> xl interfaces:
> 1) psr-cat-show: Show system/runtime CAT information.
> 2) psr-cat-cbm-set [dom] [cbm] [socket]: Set CBM for a domain.
> Hardware Limitation & Performance Improvement
> =============================================
> The COS of a PCPU, held in IA32_PQR_ASSOC, is changed on each VCPU
> context switch. If the change is frequent, hardware may fail to
> strictly enforce cache allocation based on the specified COS. As a
> result, the strict placement characteristic softens when a VCPU
> migrates between PCPUs frequently.
> For this reason, IA32_PQR_ASSOC will be updated lazily. This design
> also allows CAT to run in two modes:
> 1) Non-affinitized mode: Each VM can be freely scheduled on any PCPU,
> with its COS being applied as it is scheduled.
> 2) Affinitized mode: Each PCPU is assigned a fixed COS, and a VM can
> be scheduled only on PCPUs that have the same COS. This is less
> flexible, but can be an option for those who require strict COS
> placement, or for cases where the less strict nature of non-affinitized
> mode has caused problems.
> However, no additional code is needed to support these two modes: CAT
> runs in non-affinitized mode by default. If affinitized mode is
> desirable, the existing "xl vcpu-pin" command can be used to pin all
> VCPUs that share a COS to certain fixed PCPUs, so that those PCPUs
> always have the same COS set.
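The lazy IA32_PQR_ASSOC update mentioned above can be sketched as follows: cache the COS last written on each PCPU and skip the WRMSR when the incoming VCPU needs the same value (names are illustrative; `wrmsr_pqr` stands in for the real MSR write, with a counter so the effect is observable):

```c
#include <stdint.h>

#define NR_CPUS 8   /* illustrative; Xen has its own NR_CPUS */

static uint32_t pqr_cos_cache[NR_CPUS];   /* last COS written, per PCPU */
static unsigned int msr_writes;

/* Stand-in for the real MSR write; counts writes for observability. */
static void wrmsr_pqr(uint32_t cos)
{
    msr_writes++;   /* the real wrmsr(MSR_IA32_PQR_ASSOC, ...) goes here */
}

/* Lazy update: only touch the MSR when the COS actually changes. */
static void psr_ctxt_switch_to(unsigned int cpu, uint32_t cos)
{
    if (pqr_cos_cache[cpu] != cos) {
        wrmsr_pqr(cos);
        pqr_cos_cache[cpu] = cos;
    }
}
```

This keeps the common case (consecutive VCPUs of the same domain, or pinned VCPUs in affinitized mode) free of MSR writes on the context-switch path.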
> [1] 
> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
> Chao

Fantastic - this is a very clear and well-presented document.  In terms
of a plan of action, it looks fine.

From my understanding, CAT is largely orthogonal to CMT, but will
share some of the base PSR infrastructure in Xen?

