Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote:
> +* xl interfaces:
> +
> +  1. `psr-mba-show [domain-id]`:

Is this limited to domain-id, or can one also use the domain name? Most
of the xl commands accept either a domain-id or a domain-name.

> +
> +     Show memory bandwidth throttling for domain. Under different modes, it
> +     shows different type of data.
> +
> +     There are two modes:
> +     Linear mode: the response of throttling value is linear.
> +     Non-linear mode: the response of throttling value is non-linear.
> +
> +     For linear mode, it shows the decimal value. For non-linear mode, it shows
> +     hexadecimal value.
> +
> +  2. `psr-mba-set [OPTIONS] <domain-id> <throttling>`:
> +
> +     Set memory bandwidth throttling for domain.
> +
> +     Options:
> +     '-s': Specify the socket to process, otherwise all sockets are processed.
> +
> +     Throttling value set in register implies the approximate amount of delaying
> +     the traffic between core and memory. The higher throttling value results in
> +     lower bandwidth. The max throttling value (MBA_MAX) supported can be got

s/got/obtained/

> +     through CPUID.

How can one get this value empirically? Do I need to use an external
tool?

> +
> +     Linear mode: the input precision is defined as 100-(MBA_MAX). For instance,
> +     if the MBA_MAX value is 90, the input precision is 10%. Values not an even
> +     multiple of the precision (e.g., 12%) will be rounded down (e.g., to 10%
> +     delay applied) by HW automatically.
> +
> +     Non-linear mode: input delay values are powers-of-two from zero to the
> +     MBA_MAX value from CPUID. In this case any values not a power of two will
> +     be rounded down the next nearest power of two by HW automatically.

Both of the above descriptions should be moved to mba-show IMHO; the
description there is incomplete and not helpful.

> +
> +# Technical details
> +
> +MBA is a member of Intel PSR features, it shares the base PSR infrastructure
> +in Xen.
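
To make the linear and non-linear rounding rules from the psr-mba-set
description above concrete, here is a minimal C sketch. The function names
are made up for illustration and are not part of Xen or libxl. It only
mirrors the arithmetic in the quoted text (the hardware does the real
rounding), and it assumes the caller has already checked the value against
MBA_MAX.

```c
#include <assert.h>

/*
 * Linear mode: the precision is 100 - MBA_MAX, and values that are not a
 * multiple of the precision are rounded down.
 * Hypothetical helper; assumes 0 < mba_max < 100 and thrtl <= mba_max.
 */
unsigned int mba_round_linear(unsigned int thrtl, unsigned int mba_max)
{
    unsigned int step;

    assert(mba_max > 0 && mba_max < 100);
    assert(thrtl <= mba_max);

    step = 100 - mba_max;
    return thrtl - (thrtl % step);
}

/*
 * Non-linear mode: values that are not a power of two are rounded down
 * to the nearest power of two; zero stays zero.
 * Hypothetical helper; assumes the caller checked thrtl <= MBA_MAX.
 */
unsigned int mba_round_nonlinear(unsigned int thrtl)
{
    unsigned int p = 1;

    if (thrtl == 0)
        return 0;

    /* Grow p while the next power of two still fits in thrtl. */
    while (p <= thrtl / 2)
        p *= 2;

    return p;
}
```

With MBA_MAX = 90, mba_round_linear(12, 90) gives 10, which matches the
12% -> 10% example in the quoted text.
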
> +
> +## Hardware perspective
> +
> +  MBA defines a range of MSRs to support specifying a delay value (Thrtl) per
> +  COS, with details below.
> +
> +  ```
> +   +----------------------------+----------------+
> +   | MSR (per socket)           |    Address     |
> +   +----------------------------+----------------+
> +   | IA32_L2_QOS_Ext_BW_Thrtl_0 |     0xD50      |
> +   +----------------------------+----------------+
> +   |            ...             |      ...       |
> +   +----------------------------+----------------+
> +   | IA32_L2_QOS_Ext_BW_Thrtl_n |     0xD50+n    |
> +   +----------------------------+----------------+
> +  ```
> +
> +  When context switch happens, the COS ID of domain is written to per-thread MSR
> +  `IA32_PQR_ASSOC`, and then hardware enforces bandwidth allocation according

I think this is missing some context of the relation between a thread
and the MSR. I assume it's related to IA32_PQR_ASSOC, but I have no
idea what that constant means. What's more, Xen doesn't have threads,
so you should maybe speak about vCPUs instead?

> +  to the throttling value stored in the Thrtl MSR register.
> +
> +## The relationship between MBA and CAT/CDP
> +
> +  Generally speaking, MBA is completely independent of CAT/CDP, and any
> +  combination may be applied at any time, e.g. enabling MBA with CAT
> +  disabled.
> +
> +  But it needs to be noticed that MBA shares COS infrastructure with CAT,
> +  although MBA is enumerated by different CPUID leaf from CAT (which
> +  indicates that the max COS of MBA may be different from CAT). In some
> +  cases, a domain is permitted to have a COS that is beyond one (or more)
> +  of PSR features but within the others. For instance, let's assume the max
> +  COS of MBA is 8 but the max COS of L3 CAT is 16, when a domain is assigned
> +  9 as COS, the L3 CAT CBM associated to COS 9 would be enforced, but for MBA,
> +  the HW works as default value is set since COS 9 is beyond the max COS (8)
> +  of MBA.
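
The COS 9 example above can be sketched as follows. This is only an
illustration of the described behaviour under made-up names
(struct psr_feat, psr_effective_val); it is not how Xen's PSR code is
structured, and the fallback to the default value happens in hardware,
not in software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-feature state; names are illustrative, not Xen's. */
struct psr_feat {
    unsigned int cos_max;     /* highest COS ID the feature supports */
    uint32_t default_val;     /* value that applies beyond cos_max */
    const uint32_t *cos_val;  /* cos_val[i] is the value set for COS i */
};

/*
 * Return the value a feature effectively applies for a given COS ID:
 * the per-COS value if the ID is within the feature's range, otherwise
 * the feature's default.
 */
uint32_t psr_effective_val(const struct psr_feat *f, unsigned int cos)
{
    assert(f != NULL && f->cos_val != NULL);

    return cos <= f->cos_max ? f->cos_val[cos] : f->default_val;
}
```

With an MBA feature whose cos_max is 8 and an L3 CAT feature whose cos_max
is 16, looking up COS 9 returns the CBM stored for COS 9 on the CAT side
and the default Thrtl on the MBA side.
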
> +
> +## Design Overview
> +
> +* Core COS/Thrtl association
> +
> +  When enforcing Memory Bandwidth Allocation, all cores of domains have
> +  the same default Thrtl MSR (COS0) which stores the same Thrtl (0). The
> +  default Thrtl MSR is used only in hypervisor and is transparent to tool stack
> +  and user.
> +
> +  System administrators can change PSR allocation policy at runtime by
> +  using the tool stack. Since MBA shares COS ID with CAT/CDP, a COS ID
> +  corresponds to a 2-tuple, like [CBM, Thrtl] with only-CAT enabled, when CDP
> +  is enabled, the COS ID corresponds to a 3-tuple, like [Code_CBM, Data_CBM,
> +  Thrtl]. If neither CAT nor CDP is enabled, things are easier, since one COS
> +  ID corresponds to one Thrtl.
> +
> +* VCPU schedule
> +
> +  This part reuses CAT COS infrastructure.
> +
> +* Multi-sockets
> +
> +  Different sockets may have different MBA ability (like max COS)
> +  although it is consistent on the same socket. So the capability
> +  of per-socket MBA is specified.
> +
> +  This part reuses CAT COS infrastructure.
> +
> +## Implementation Description
> +
> +* Hypervisor interfaces:
> +
> +  1. Boot line param: "psr=mba" to enable the feature.
> +
> +  2. SYSCTL:
> +          - XEN_SYSCTL_PSR_MBA_get_info: Get system MBA information.

So this is likely how one gets the mentioned MBA_MAX?

> +
> +  3. DOMCTL:
> +          - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain.
> +          - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain.
> +
> +* xl interfaces:
> +
> +  1. psr-mba-show [domain-id]
> +          Show system/domain runtime MBA throttling value. For linear mode,
> +          it shows the decimal value. For non-linear mode, it shows hexadecimal
> +          value.
> +          => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
> +
> +  2. psr-mba-set [OPTIONS] <domain-id> <throttling>
> +          Set bandwidth throttling for a domain.
> +          => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL
> +
> +  3. psr-hwinfo
> +          Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA.
> +          => XEN_SYSCTL_PSR_MBA_get_info

'psr-hwinfo' seems to be completely missing from the 'xl interfaces:'
section above.

> +* Key data structure:
> +
> +  1. Feature HW info
> +
> +     ```
> +     struct {
> +         unsigned int thrtl_max;
> +         bool linear;
> +     } mba;
> +
> +     - Member `thrtl_max`
> +
> +       `thrtl_max` is the max throttling value to be set, i.e. MBA_MAX.
> +
> +     - Member `linear`
> +
> +       `linear` means the response of delay value is linear or not.
> +
> +     As mentioned above, MBA is a member of Intel PSR features, it would
> +     share the base PSR infrastructure in Xen. For example, the 'cos_max'
> +     is a common HW property for all features. So, for other data structure
> +     details, please refer 'intel_psr_cat_cdp.pandoc'.
                             ^ to

> +
> +# Limitations
> +
> +MBA can only work on HW which enables it (check by CPUID).
                                 ^ s/enables/supports/.

> +
> +# Testing
> +
> +We can execute these commands to verify MBA on different HWs supporting them.
> +
> +For example:
> +  1. User can get the MBA hardware info through 'psr-hwinfo' command. From
> +     result, user can know if this hardware works under linear mode or non-
> +     linear mode, the max throttling value (MBA_MAX) and so on.
> +
> +    root@:~$ xl psr-hwinfo --mba
> +    Memory Bandwidth Allocation (MBA):
> +    Socket ID               : 0
> +    Linear Mode             : Enabled
> +    Maximum COS             : 7
> +    Maximum Throttling Value: 90
> +    Default Throttling Value: 0
> +
> +  2. Then, user can set a throttling value to a domain. For example, set '0xa',
> +     i.e 10% delay.
> +
> +    root@:~$ xl psr-mba-set 1 0xa
> +
> +  3. User can check the current configuration of the domain through
> +     'psr-mab-show'. For linear mode, the decimal value is shown.
> +
> +    root@:~$ xl psr-mba-show 1
> +    Socket ID       : 0
> +    Default THRTL   : 0
> +       ID                     NAME            THRTL
> +        1                 ubuntu14            10

The example seems better now IMHO.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel