Re: [Xen-devel] [PATCH v4 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On Sat, Sep 23, 2017 at 09:48:10AM +0000, Yi Sun wrote:
> This patch creates MBA feature document in doc/features/. It describes
> key points to implement MBA which is described in details in Intel SDM
                                                    ^ detail
> "Introduction to Memory Bandwidth Allocation".
>
> Signed-off-by: Yi Sun <yi.y.sun@xxxxxxxxxxxxxxx>

Thanks, I think this is looking quite good IMHO. Just a couple of nits
below.

> ---
> CC: Jan Beulich <jbeulich@xxxxxxxx>
> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> CC: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> CC: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> CC: Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx>
> CC: Julien Grall <julien.grall@xxxxxxx>
>
> v4:
>     - add 'domain-name' as parameter of 'psr-mba-show/psr-mba-set'.
>       (suggested by Roger Pau Monné)
>     - fix some wordings.
>       (suggested by Roger Pau Monné)
>     - explain how user can know the MBA_MAX.
>       (suggested by Roger Pau Monné)
>     - move the description of 'Linear mode/Non-linear mode' into section
>       of 'psr-mba-show'.
>       (suggested by Roger Pau Monné)
>     - change 'per-thread' to 'per-hyper-thread' to make it clearer.
>       (suggested by Roger Pau Monné)
>     - upgrade revision number.
> v3:
>     - remove 'closed-loop' related description.
>       (suggested by Roger Pau Monné)
>     - explain 'linear' and 'non-linear' before mentioning them.
>       (suggested by Roger Pau Monné)
>     - adjust desription of 'psr-mba-set'.
>       (suggested by Roger Pau Monné)
>     - explain 'MBA_MAX'.
>       (suggested by Roger Pau Monné)
>     - remove 'n<64'.
>       (suggested by Roger Pau Monné)
>     - fix some wordings.
>       (suggested by Roger Pau Monné)
>     - add context in 'Testing' part to make things more clear.
>       (suggested by Roger Pau Monné)
> v2:
>     - declare 'HW' in Terminology.
>       (suggested by Chao Peng)
>     - replace 'COS ID of VCPU' to 'COS ID of domain'.
>       (suggested by Chao Peng)
>     - replace 'COS register' to 'Thrtl MSR'.
>       (suggested by Chao Peng)
>     - add description for 'psr-mba-show' to state that the decimal value is
>       shown for linear mode but hexadecimal value is shown for non-linear
>       mode.
>       (suggested by Chao Peng)
>     - remove content in 'Areas for improvement'.
>       (suggested by Chao Peng)
>     - use '<>' to specify mandatory argument to a command.
>       (suggested by Wei Liu)
> v1:
>     - remove a special character to avoid the error when building pandoc.
> ---
>  docs/features/intel_psr_mba.pandoc | 291 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 291 insertions(+)
>  create mode 100644 docs/features/intel_psr_mba.pandoc
>
> diff --git a/docs/features/intel_psr_mba.pandoc b/docs/features/intel_psr_mba.pandoc
> new file mode 100644
> index 0000000..7a6a588
> --- /dev/null
> +++ b/docs/features/intel_psr_mba.pandoc
> @@ -0,0 +1,291 @@
> +% Intel Memory Bandwidth Allocation (MBA) Feature
> +% Revision 1.6
> +
> +\clearpage
> +
> +# Basics
> +
> +---------------- ----------------------------------------------------
> +         Status: **Tech Preview**
> +
> +Architecture(s): Intel x86
> +
> +   Component(s): Hypervisor, toolstack
> +
> +       Hardware: MBA is supported on Skylake Server and beyond
> +---------------- ----------------------------------------------------
> +
> +# Terminology
> +
> +* CAT         Cache Allocation Technology
> +* CBM         Capacity BitMasks
> +* CDP         Code and Data Prioritization
> +* COS/CLOS    Class of Service
> +* HW          Hardware
> +* MBA         Memory Bandwidth Allocation
> +* MSRs        Machine Specific Registers
> +* PSR         Intel Platform Shared Resource
> +* THRTL       Throttle value or delay value
> +
> +# Overview
> +
> +The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate
> +control over memory bandwidth available per-core. This feature provides OS/
> +hypervisor the ability to slow misbehaving apps/domains by using a credit-based
> +throttling mechanism.
> +
> +# User details
> +
> +* Feature Enabling:
> +
> +  Add "psr=mba" to boot line parameter to enable MBA feature.
> +
> +* xl interfaces:
> +
> +  1. `psr-mba-show [domain-id|domain-name]`:
> +
> +     Show memory bandwidth throttling for domain. Under different modes, it
> +     shows different type of data.
> +
> +     There are two modes:
> +     Linear mode: the input precision is defined as 100-(MBA_MAX). For instance,
> +     if the MBA_MAX value is 90, the input precision is 10%. Values not an even
> +     multiple of the precision (e.g., 12%) will be rounded down (e.g., to 10%
> +     delay applied) by HW automatically. The response of throttling value is
> +     linear.
> +
> +     Non-linear mode: input delay values are powers-of-two from zero to the
> +     MBA_MAX value from CPUID. In this case any values not a power of two will
> +     be rounded down the next nearest power of two by HW automatically. The
> +     response of throttling value is non-linear.
> +
> +     For linear mode, it shows the decimal value. For non-linear mode, it shows
> +     hexadecimal value.
> +
> +  2. `psr-mba-set [OPTIONS] <domain-id|domain-name> <throttling>`:
> +
> +     Set memory bandwidth throttling for domain.
> +
> +     Options:
> +     '-s': Specify the socket to process, otherwise all sockets are processed.
> +
> +     Throttling value set in register implies the approximate amount of delaying
> +     the traffic between core and memory. The higher throttling value results in
                                             ^ remove 'The'              ^ result
> +     lower bandwidth. The max throttling value (MBA_MAX) supported can be
> +     obtained through CPUID inside hypervisor. User can know it through

"Users can fetch the MBA_MAX value using the `psr-hwinfo` xl command."

> +     `psr-hwinfo`.
> +
> +# Technical details
> +
> +MBA is a member of Intel PSR features, it shares the base PSR infrastructure
> +in Xen.
> +
> +## Hardware perspective
> +
> +  MBA defines a range of MSRs to support specifying a delay value (Thrtl) per
> +  COS, with details below.
> +
> +  ```
> +   +----------------------------+----------------+
> +   | MSR (per socket)           |    Address     |
> +   +----------------------------+----------------+
> +   | IA32_L2_QOS_Ext_BW_Thrtl_0 |     0xD50      |
> +   +----------------------------+----------------+
> +   | ...                        |  ...           |
> +   +----------------------------+----------------+
> +   | IA32_L2_QOS_Ext_BW_Thrtl_n |     0xD50+n    |
> +   +----------------------------+----------------+
> +  ```
> +
> +  When context switch happens, the COS ID of domain is written to per-hyper-
> +  thread MSR `IA32_PQR_ASSOC`, and then hardware enforces bandwidth allocation
> +  according to the throttling value stored in the Thrtl MSR register.
> +
> +## The relationship between MBA and CAT/CDP
> +
> +  Generally speaking, MBA is completely independent of CAT/CDP, and any
> +  combination may be applied at any time, e.g. enabling MBA with CAT
> +  disabled.
> +
> +  But it needs to be noticed that MBA shares COS infrastructure with CAT,
> +  although MBA is enumerated by different CPUID leaf from CAT (which
> +  indicates that the max COS of MBA may be different from CAT). In some
> +  cases, a domain is permitted to have a COS that is beyond one (or more)
> +  of PSR features but within the others. For instance, let's assume the max
> +  COS of MBA is 8 but the max COS of L3 CAT is 16, when a domain is assigned
> +  9 as COS, the L3 CAT CBM associated to COS 9 would be enforced, but for MBA,
> +  the HW works as default value is set since COS 9 is beyond the max COS (8)
> +  of MBA.
> +
> +## Design Overview
> +
> +* Core COS/Thrtl association
> +
> +  When enforcing Memory Bandwidth Allocation, all cores of domains have
> +  the same default Thrtl MSR (COS0) which stores the same Thrtl (0). The
> +  default Thrtl MSR is used only in hypervisor and is transparent to tool stack
> +  and user.
> +
> +  System administrators can change PSR allocation policy at runtime by
> +  using the tool stack.
> +  Since MBA shares COS ID with CAT/CDP, a COS ID
> +  corresponds to a 2-tuple, like [CBM, Thrtl] with only-CAT enabled, when CDP
> +  is enabled, the COS ID corresponds to a 3-tuple, like [Code_CBM, Data_CBM,
> +  Thrtl]. If neither CAT nor CDP is enabled, things are easier, since one COS
> +  ID corresponds to one Thrtl.

I find the above paragraph a little bit difficult to parse, although
I'm not going to force you to re-write it.

> +
> +* VCPU schedule
> +
> +  This part reuses CAT COS infrastructure.
> +
> +* Multi-sockets
> +
> +  Different sockets may have different MBA ability (like max COS)
                                              ^ capabilities?

[...]
> +# Testing
> +
> +We can execute these commands to verify MBA on different HWs supporting them.
> +
> +For example:
> +  1. User can get the MBA hardware info through 'psr-hwinfo' command. From
> +     result, user can know if this hardware works under linear mode or non-
> +     linear mode, the max throttling value (MBA_MAX) and so on.
> +
> +    root@:~$ xl psr-hwinfo --mba
> +    Memory Bandwidth Allocation (MBA):
> +    Socket ID               : 0
> +    Linear Mode             : Enabled
> +    Maximum COS             : 7
> +    Maximum Throttling Value: 90
> +    Default Throttling Value: 0
> +
> +  2. Then, user can set a throttling value to a domain. For example, set '0xa',
> +     i.e 10% delay.
> +
> +    root@:~$ xl psr-mba-set 1 0xa

I would write this as 10 instead of 0xa, ie:

$ xl psr-mba-set 1 10

I think it's clearer because MBA is in linear mode, so the values
returned from xl will be in decimal base rather than hexadecimal.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
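
The rounding rules from the quoted `psr-mba-show` section (linear mode rounds down to a multiple of 100-MBA_MAX, non-linear mode rounds down to a power of two) can be sketched in a few lines of Python. This only illustrates the behaviour as the document describes it; it is not Xen or hardware code. The function names are hypothetical, and rejecting values above MBA_MAX is an assumption made for the sketch.

```python
def linear_step(mba_max: int) -> int:
    """Input precision in linear mode, defined in the document as 100 - MBA_MAX."""
    if not 0 <= mba_max < 100:
        raise ValueError("MBA_MAX is expected to be in the range 0..99")
    return 100 - mba_max


def round_linear(requested: int, mba_max: int) -> int:
    """Round a requested delay down to a multiple of the linear-mode precision."""
    if not 0 <= requested <= mba_max:
        # Assumption: out-of-range requests are rejected rather than clamped.
        raise ValueError("requested value must be within 0..MBA_MAX")
    step = linear_step(mba_max)
    return (requested // step) * step


def round_nonlinear(requested: int, mba_max: int) -> int:
    """Round a requested delay down to a power of two (zero stays zero)."""
    if not 0 <= requested <= mba_max:
        # Assumption: out-of-range requests are rejected rather than clamped.
        raise ValueError("requested value must be within 0..MBA_MAX")
    if requested == 0:
        return 0
    # Keep only the highest set bit, i.e. the largest power of two <= requested.
    return 1 << (requested.bit_length() - 1)


if __name__ == "__main__":
    # The example from the document: MBA_MAX = 90 gives 10% precision,
    # so a request of 12% is applied as 10%.
    print(round_linear(12, 90))
    # Hypothetical non-linear platform with MBA_MAX = 64.
    print(round_nonlinear(12, 64))
```

With MBA_MAX = 90, as in the `psr-hwinfo` output quoted above, `round_linear(12, 90)` yields 10, matching the 12% to 10% example in the document.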