
[Xen-devel] [PATCH 10 of 10 v2] Some automatic NUMA placement documentation

About rationale, usage and (some small bits of) API.

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Changes from v1:
 * API documentation moved close to the actual functions.

diff --git a/docs/misc/xl-numa-placement.markdown b/docs/misc/xl-numa-placement.markdown
new file mode 100644
--- /dev/null
+++ b/docs/misc/xl-numa-placement.markdown
@@ -0,0 +1,74 @@
+# Guest Automatic NUMA Placement in libxl and xl #
+## Rationale ##
+The Xen hypervisor deals with Non-Uniform Memory Access (NUMA)
+machines by assigning to each domain a "node affinity", i.e., a set of NUMA
+nodes of the host from which it gets its memory allocated.
+NUMA awareness becomes very important as soon as many domains start running
+memory-intensive workloads on a shared host. In fact, the cost of accessing
+non node-local memory locations is very high, and the performance degradation
+is likely to be noticeable.
+## Guest Placement in xl ##
+If using xl for creating and managing guests, it is very easy to ask
+for either manual or automatic placement of them across the host's NUMA nodes.
+Note that xm/xend does the very same thing, the only differences residing
+in the details of the heuristics adopted for the placement (see below).
+### Manual Guest Placement with xl ###
+Thanks to the "cpus=" option, it is possible to specify where a domain
+should be created and scheduled, directly in its config file. This
+affects NUMA placement and memory accesses, as the hypervisor constructs
+the node affinity of a VM based on its CPU affinity when it is created.
+This is very simple and effective, but requires the user/system
+administrator to explicitly specify affinities for each and every domain,
+or Xen won't be able to guarantee the locality of their memory accesses.
+### Automatic Guest Placement with xl ###
+In case no "cpus=" option is specified in the config file, xl tries
+to figure out on its own on which node(s) the domain could fit best.
+First of all, it needs to find a node (or a set of nodes) that has
+enough free memory for accommodating the domain. After that, the actual
+decision on where to put the new guest happens by generating all the
+possible combinations of nodes that satisfy the above and choosing among
+them according to the following heuristics:
+  *  candidates involving fewer nodes come first. In case two (or more)
+     candidates span the same number of nodes,
+  *  candidates with a greater amount of free memory come first. In case
+     two (or more) candidates differ in their amount of free memory by
+     less than 10%,
+  *  candidates with fewer domains already placed on them come first.
+Giving preference to small candidates ensures better performance for
+the guest, as it avoids spreading its memory among different nodes.
+Using the nodes that have the biggest amounts of free memory helps
+keep memory fragmentation small, from a system-wide perspective.
+Finally, when several candidates fulfil these criteria to the same
+extent, choosing the candidate that is hosting fewer domains helps
+balance the load on the various nodes.
+The last step is figuring out whether the selected candidate contains
+at least as many CPUs as the number of VCPUs of the VM. The current
+solution for the case when it does not is just to add some
+more nodes, until the condition becomes true. When doing
+this, the nodes with the least possible distance from the ones
+already in the nodemap are considered.
+## Guest Placement within libxl ##
+xl achieves automatic NUMA placement by means of the following API
+calls, provided by libxl.
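
As an illustration only, the candidate-selection heuristics described in the "Automatic Guest Placement with xl" section of the patch can be sketched in Python as follows. This is not the libxl code: the `Node` record, the function names, the use of a summed per-node domain count, and the choice to measure the 10% threshold against the larger of the two free-memory amounts are all assumptions made for the example. The final step (adding nodes until there are enough CPUs for the VCPUs) is left out.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from itertools import combinations


@dataclass(frozen=True)
class Node:
    """Hypothetical per-node summary; not a libxl structure."""
    node_id: int
    free_kb: int      # free memory on the node, in KiB
    nr_domains: int   # domains already placed on the node


def candidates(nodes, needed_kb):
    """Yield every combination of nodes with enough total free memory."""
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            if sum(n.free_kb for n in combo) >= needed_kb:
                yield combo


def compare(c1, c2):
    """Return <0 if c1 is preferred, >0 if c2 is preferred, 0 if tied."""
    # 1. Candidates involving fewer nodes come first.
    if len(c1) != len(c2):
        return len(c1) - len(c2)
    free1 = sum(n.free_kb for n in c1)
    free2 = sum(n.free_kb for n in c2)
    # 3. If free memory differs by less than 10% (of the larger amount,
    #    an assumption), prefer the candidate hosting fewer domains.
    #    Summing per-node counts is a simplification.
    if abs(free1 - free2) * 10 < max(free1, free2):
        return sum(n.nr_domains for n in c1) - sum(n.nr_domains for n in c2)
    # 2. Otherwise the candidate with more free memory comes first.
    return free2 - free1


def best_candidate(nodes, needed_kb):
    """Return the preferred tuple of nodes, or None if nothing fits."""
    return min(candidates(nodes, needed_kb),
               key=cmp_to_key(compare), default=None)


# Made-up host: three nodes with different free memory and load.
host = [Node(0, 4000, 3), Node(1, 4100, 1), Node(2, 1000, 0)]
small_guest = best_candidate(host, 2000)
large_guest = best_candidate(host, 4500)
```

With this made-up host, a guest needing 2000 KiB fits on node 0 or node 1 alone; their free memory differs by less than 10%, so the less loaded node is preferred. A guest needing 4500 KiB does not fit on any single node, so two-node candidates are compared instead. Note that the "within 10%" rule is not a strict ordering, so the result of a pairwise comparison chain can depend on the order in which candidates are visited.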
