Xen project Mailing List

[Xen-devel] [PATCH 10 of 10 v2] Some automatic NUMA placement documentation

From: Dario Faggioli <raistlin@xxxxxxxx>

Date: Fri, 15 Jun 2012 19:04:38 +0200

Cc: Andre Przywara <andre.przywara@xxxxxxx>, Ian Campbell <Ian.Campbell@xxxxxxxxxx>, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>

Delivery-date: Fri, 15 Jun 2012 17:05:52 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

About rationale, usage and (some small bits of) API.

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Changes from v1:
 * API documentation moved close to the actual functions.

diff --git a/docs/misc/xl-numa-placement.markdown b/docs/misc/xl-numa-placement.markdown
new file mode 100644
--- /dev/null
+++ b/docs/misc/xl-numa-placement.markdown
@@ -0,0 +1,74 @@
+# Guest Automatic NUMA Placement in libxl and xl #
+
+## Rationale ##
+
+The Xen hypervisor deals with Non-Uniform Memory Access (NUMA)
+machines by assigning to each domain a "node affinity", i.e., a set of NUMA
+nodes of the host from which it gets its memory allocated.
+
+NUMA awareness becomes very important as soon as many domains start running
+memory-intensive workloads on a shared host. In fact, the cost of accessing
+non node-local memory locations is very high, and the performance degradation
+is likely to be noticeable.
+
+## Guest Placement in xl ##
+
+If using xl for creating and managing guests, it is very easy to ask
+for either manual or automatic placement of them across the host's NUMA
+nodes.
+
+Note that xm/xend does the very same thing, the only differences residing
+in the details of the heuristics adopted for the placement (see below).
+
+### Manual Guest Placement with xl ###
+
+Thanks to the "cpus=" option, it is possible to specify where a domain
+should be created and scheduled, directly in its config file. This
+affects NUMA placement and memory accesses as the hypervisor constructs
+the node affinity of a VM based directly on its CPU affinity when it is
+created.
+
+This is very simple and effective, but requires the user/system
+administrator to explicitly specify affinities for each and every domain,
+or Xen won't be able to guarantee the locality for their memory
+accesses.
+
+### Automatic Guest Placement with xl ###
+
+In case no "cpus=" option is specified in the config file, xl tries
+to figure out on its own on which node(s) the domain could fit best.
+
+First of all, it needs to find a node (or a set of nodes) that has
+enough free memory to accommodate the domain. After that, the actual
+decision on where to put the new guest is made by generating all the
+possible combinations of nodes that satisfy the above and choosing among
+them according to the following heuristics:
+
+  * candidates involving fewer nodes come first. In case two (or more)
+    candidates span the same number of nodes,
+  * candidates with a greater amount of free memory come first. In case
+    two (or more) candidates differ in their amount of free memory by
+    less than 10%,
+  * candidates with fewer domains already placed on them come first.
+
+Giving preference to small candidates ensures better performance for
+the guest, as it avoids spreading its memory among different nodes.
+Using the nodes that have the biggest amounts of free memory helps
+keep memory fragmentation small, from a system-wide perspective.
+Finally, in case several candidates fulfil these criteria to the same
+extent, choosing the candidate that is hosting fewer domains helps
+balance the load on the various nodes.
+
+The last step is figuring out whether the selected candidate contains
+at least as many CPUs as the number of VCPUs of the VM. The current
+solution for the case when this does not hold is just to add some
+more nodes, until the condition becomes true. When doing
+this, the nodes with the least possible distance from the ones
+already in the nodemap are considered.
+
+## Guest Placement within libxl ##
+
+xl achieves automatic NUMA placement by means of the following API
+calls, provided by libxl.
+
+
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
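
For readers who want to see the ranking rules from the "Automatic Guest Placement with xl" section in executable form, here is a minimal sketch in Python. It is not libxl code and does not reflect the actual C implementation: the `Node` class, its field names, and the functions `compare_candidates` and `rank_candidates` are all hypothetical and exist only for illustration. Two further assumptions are made where the text leaves room: the 10% threshold is measured relative to the larger of the two free-memory totals, and the number of domains on a candidate is approximated by summing per-node counts.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from itertools import combinations


@dataclass(frozen=True)
class Node:
    """Hypothetical per-node summary; not a libxl data structure."""
    node_id: int
    free_mem_mb: int
    nr_domains: int


def compare_candidates(a, b):
    """Return a negative number if candidate a should be preferred over b."""
    # Rule 1: candidates involving fewer nodes come first.
    if len(a) != len(b):
        return len(a) - len(b)

    free_a = sum(n.free_mem_mb for n in a)
    free_b = sum(n.free_mem_mb for n in b)
    # Rule 2: more free memory comes first, unless the two totals differ
    # by less than 10% (assumption: relative to the larger total).
    if abs(free_a - free_b) >= 0.10 * max(free_a, free_b):
        return free_b - free_a

    # Rule 3: fewer domains already placed on the candidate come first
    # (assumption: per-node counts are simply summed).
    dom_a = sum(n.nr_domains for n in a)
    dom_b = sum(n.nr_domains for n in b)
    return dom_a - dom_b


def rank_candidates(nodes, needed_mem_mb):
    """Generate every node combination with enough free memory, best first."""
    candidates = [
        combo
        for size in range(1, len(nodes) + 1)
        for combo in combinations(nodes, size)
        if sum(n.free_mem_mb for n in combo) >= needed_mem_mb
    ]
    return sorted(candidates, key=cmp_to_key(compare_candidates))
```

As a usage example, with `Node(0, 4096, 3)`, `Node(1, 4000, 1)` and `Node(2, 1024, 0)` and a guest needing 2048 MB, the single-node candidates are nodes 0 and 1. Their free memory differs by less than 10%, so the domain count decides and node 1 is ranked first. Note that a comparison with a tolerance band like this one is not transitive in general, so on some inputs the resulting order can depend on the order in which candidates are generated.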
