Xen project Mailing List

[Xen-devel] [PATCH v13 4/5] libxl/xl: make it possible to specify soft-affinity in domain config file

From: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Date: Tue, 29 Jul 2014 18:06:52 +0200

Cc: Ian.Jackson@xxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>, Ian Campbell <Ian.Campbell@xxxxxxxxxx>

Delivery-date: Tue, 29 Jul 2014 16:07:05 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

To do so, we add the vcpu_soft_affinity array to build_info, and
treat it much like vcpu_hard_affinity. The new config option is
called "cpus_soft".

Note that the vcpu_hard_affinity array, introduced in a previous
patch, and the vcpu_soft_affinity array, introduced here, share
the same LIBXL_HAVE_xxx macro, in libxl.h. That is called
LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS, and was introduced
together with vcpu_hard_affinity, but only inside a comment.
In this change, we uncomment it, and hence properly define it.

In order to avoid having to issue separate calls to
libxl_set_vcpuaffinity() (one for hard affinity and one for
soft affinity) in libxl__build_pre(), in case the caller uses
b_info->cpumap (for the former) and b_info->vcpu_soft_affinity
(for the latter), we also set (again!) a new default for
b_info->cpumap. From this change on, this allows libxl internals
to deal always and only with b_info->vcpu_hard_affinity.

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
---
Changes from v10:
 * default for b_info->cpumap changed again. Since we are not
   formally deprecating it, it is not just a corner case that
   one would want to specify hard affinity via cpumap, and
   soft affinity via the vcpu_soft_affinity array. That would
   require two distinct calls to libxl_set_vcpuaffinity(), which
   is something we would rather avoid. To do so, if cpumap is used,
   convert it to vcpu_hard_affinity in *_build_info_setdefault()
   and then forget about it! This solution came up already as
   a possibility during v9's review, but at the time I thought
   it was not worth it, as we were deprecating cpumap anyway. Since
   we're not, that is now the best solution IMO.

Changes from v9:
 * patch reworked again, due to changes in the preceding ones
   in the series.
   The structure is similar, it's still based on adding
   some indirection, so that the same code can be used to parse
   and enact both hard and soft affinity, but the code did change,
   I'm afraid.

Changes from v8:
 * fix a typo in the LIBXL_HAVE_xxx macro name.

Changes from v7:
 * WARNING: this patch underwent quite a fundamental rework,
   given it's now building on top of Wei's "push vcpu affinity
   to libxl" patch. That's why I think it should be re-reviewed
   almost from scratch (sorry! :-P), and that's why I did not
   add IanC's ack, although he provided it to the v7 version
   of it.

Changes from v6:
 * update and improve the changelog.

Changes from v4:
 * fix typos and rephrase docs, as suggested during review;
 * more refactoring, i.e., more factoring out of potentially
   common code, as requested during review.

Changes from v3:
 * fix typos and language issues in docs and comments, as
   suggested during review;
 * common code to soft and hard affinity parsing factored
   together, as requested during review.

Changes from v2:
 * use the new libxl API. Although the implementation changed
   only a little bit, I removed IanJ's Acked-by, although I am
   here saying that he did provide it, as requested.
---
 docs/man/xl.cfg.pod.5       |   23 ++++++++++--
 tools/libxl/libxl.h         |    3 +-
 tools/libxl/libxl_create.c  |   15 ++++++++
 tools/libxl/libxl_dom.c     |   33 ++++++++++++-----
 tools/libxl/libxl_types.idl |    1 +
 tools/libxl/xl_cmdimpl.c    |   84 ++++++++++++++++++++++++++-----------------
 6 files changed, 111 insertions(+), 48 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index ffd94a8..5833054 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -156,19 +156,36 @@ for each element of the list.
 =back
 
 If this option is not specified, no vcpu to cpu pinning is established,
-and the vcpus of the guest can run on all the cpus of the host.
+and the vcpus of the guest can run on all the cpus of the host. If this
+option is specified, the intersection of the vcpu pinning mask, provided
+here, and the soft affinity mask, provided via B<cpus\_soft=> (if any),
+is utilized to compute the domain node-affinity, for driving memory
+allocations.
 
 If we are on a NUMA machine (i.e., if the host has more than one NUMA
 node) and this option is not specified, libxl automatically tries to
 place the guest on the least possible number of nodes. That, however,
 will not affect vcpu pinning, so the guest will still be able to run on
-all the cpus, it will just prefer the ones from the node it has been
-placed on. A heuristic approach is used for choosing the best node (or
+all the cpus. A heuristic approach is used for choosing the best node (or
 set of nodes), with the goals of maximizing performance for the guest
 and, at the same time, achieving efficient utilization of host cpus
 and memory. See F<docs/misc/xl-numa-placement.markdown> for more
 details.
 
+=item B<cpus_soft="CPU-LIST">
+
+Exactly as B<cpus=>, but specifies soft affinity, rather than pinning
+(hard affinity). When using the credit scheduler, this means what cpus
+the vcpus of the domain prefer.
+
+A C<CPU-LIST> is specified exactly as above, for B<cpus=>.
+
+If this option is not specified, the vcpus of the guest will not have
+any preference regarding on what cpu to run. If this option is specified,
+the intersection of the soft affinity mask, provided here, and the vcpu
+pinning, provided via B<cpus=> (if any), is utilized to compute the
+domain node-affinity, for driving memory allocations.
+
 =back
 
 =head3 CPU Scheduling
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5ae6532..bfeb3bc 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -380,8 +380,7 @@ typedef struct libxl__ctx libxl_ctx;
  * Each bitmap should be big enough to accommodate the maximum number of
  * PCPUs of the host.
  */
-/* to be uncommented when soft array added */
-/* #define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1 */
+#define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1
 
 /*
  * LIBXL_HAVE_BUILDINFO_USBDEVICE_LIST
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 0686f96..13992b4 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -187,6 +187,21 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     } else if (b_info->avail_vcpus.size > HVM_MAX_VCPUS)
         return ERROR_FAIL;
 
+    /* In libxl internals, we want to deal with vcpu_hard_affinity only! */
+    if (b_info->cpumap.size && !b_info->num_vcpu_hard_affinity) {
+        int i;
+
+        b_info->vcpu_hard_affinity = libxl__calloc(gc, b_info->max_vcpus,
+                                                   sizeof(libxl_bitmap));
+        for (i = 0; i < b_info->max_vcpus; i++) {
+            if (libxl_cpu_bitmap_alloc(CTX, &b_info->vcpu_hard_affinity[i], 0))
+                return ERROR_FAIL;
+            libxl_bitmap_copy(CTX, &b_info->vcpu_hard_affinity[i],
+                              &b_info->cpumap);
+        }
+        b_info->num_vcpu_hard_affinity = b_info->max_vcpus;
+    }
+
     libxl_defbool_setdefault(&b_info->numa_placement, true);
 
     if (b_info->max_memkb == LIBXL_MEMKB_DEFAULT)
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 83eb29a..cfbd13d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -272,21 +272,36 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     if (info->nodemap.size)
         libxl_domain_set_nodeaffinity(ctx, domid, &info->nodemap);
     /* As mentioned in libxl.h, vcpu_hard_array takes precedence */
-    if (info->num_vcpu_hard_affinity) {
-        int i;
+    if (info->num_vcpu_hard_affinity || info->num_vcpu_soft_affinity) {
+        libxl_bitmap *hard_affinity, *soft_affinity;
+        int i, n_vcpus;
+
+        n_vcpus = info->num_vcpu_hard_affinity > info->num_vcpu_soft_affinity ?
+            info->num_vcpu_hard_affinity : info->num_vcpu_soft_affinity;
+
+        for (i = 0; i < n_vcpus; i++) {
+            /*
+             * Prepare hard and soft affinity pointers in a way that allows
+             * us to issue only one call to libxl_set_vcpuaffinity(), setting,
+             * for each vcpu, both hard and soft affinity "atomically".
+             */
+            hard_affinity = NULL;
+            if (info->num_vcpu_hard_affinity &&
+                i < info->num_vcpu_hard_affinity)
+                hard_affinity = &info->vcpu_hard_affinity[i];
+
+            soft_affinity = NULL;
+            if (info->num_vcpu_soft_affinity &&
+                i < info->num_vcpu_soft_affinity)
+                soft_affinity = &info->vcpu_soft_affinity[i];
 
-        for (i = 0; i < info->num_vcpu_hard_affinity; i++) {
             if (libxl_set_vcpuaffinity(ctx, domid, i,
-                                       &info->vcpu_hard_affinity[i],
-                                       NULL)) {
+                                       hard_affinity, soft_affinity)) {
                 LOG(ERROR, "setting affinity failed on vcpu `%d'", i);
                 return ERROR_FAIL;
             }
         }
-    } else if (info->cpumap.size)
-        libxl_set_vcpuaffinity_all(ctx, domid, info->max_vcpus,
-                                   &info->cpumap, NULL);
-
+    }
 
     if (xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
         LIBXL_MAXMEM_CONSTANT) < 0) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a412f9c..0b3496f 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -319,6 +319,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("cpumap",          libxl_bitmap),
     ("nodemap",         libxl_bitmap),
     ("vcpu_hard_affinity", Array(libxl_bitmap, "num_vcpu_hard_affinity")),
+    ("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
     ("numa_placement",  libxl_defbool),
     ("tsc_mode",        libxl_tsc_mode),
     ("max_memkb",       MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 6b91f76..f1c136a 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -691,67 +691,76 @@ static void parse_top_level_sdl_options(XLU_Config *config,
     xlu_cfg_replace_string (config, "xauthority", &sdl->xauthority, 0);
 }
 
-static void parse_vcpu_affinity(XLU_Config *config,
-                                libxl_domain_build_info *b_info)
+static void parse_vcpu_affinity(libxl_domain_build_info *b_info,
+                                XLU_ConfigList *cpus, const char *buf,
+                                int num_cpus, bool is_hard)
 {
-    XLU_ConfigList *cpus;
-    const char *buf;
-    int num_cpus;
+    libxl_bitmap *vcpu_affinity_array;
 
-    if (!xlu_cfg_get_list (config, "cpus", &cpus, &num_cpus, 1)) {
-        int j = 0;
+    /*
+     * If we are here, and buf is !NULL, we're dealing with a string. What
+     * we do in this case is parse it, and copy the result in _all_ (up to
+     * b_info->max_vcpus) the elements of the vcpu affinity array.
+     *
+     * If buf is NULL, we have a list, and what we do is putting in the
+     * i-eth element of the vcpu affinity array the result of the parsing
+     * of the i-eth entry of the list. If there are more vcpus than
+     * entries, it is fine to just not touch the last array elements.
+     */
 
-        /* Silently ignore values corresponding to non existing vcpus */
-        if (num_cpus > b_info->max_vcpus)
-            num_cpus = b_info->max_vcpus;
+    /* Silently ignore values corresponding to non existing vcpus */
+    if (buf || num_cpus > b_info->max_vcpus)
+        num_cpus = b_info->max_vcpus;
 
+    if (is_hard) {
+        b_info->num_vcpu_hard_affinity = num_cpus;
         b_info->vcpu_hard_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_hard_affinity;
+    } else {
+        b_info->num_vcpu_soft_affinity = num_cpus;
+        b_info->vcpu_soft_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_soft_affinity;
+    }
+
+    if (!buf) {
+        int j = 0;
 
         while ((buf = xlu_cfg_get_listitem(cpus, j)) != NULL && j < num_cpus) {
-            libxl_bitmap_init(&b_info->vcpu_hard_affinity[j]);
-            if (libxl_cpu_bitmap_alloc(ctx,
-                                       &b_info->vcpu_hard_affinity[j], 0)) {
+            libxl_bitmap_init(&vcpu_affinity_array[j]);
+            if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[j], 0)) {
                 fprintf(stderr, "Unable to allocate cpumap for vcpu %d\n", j);
                 exit(1);
             }
 
-            if (vcpupin_parse(buf, &b_info->vcpu_hard_affinity[j]))
+            if (vcpupin_parse(buf, &vcpu_affinity_array[j]))
                 exit(1);
 
             j++;
         }
-        b_info->num_vcpu_hard_affinity = num_cpus;
 
         /* We have a list of cpumaps, disable automatic placement */
         libxl_defbool_set(&b_info->numa_placement, false);
-    }
-    else if (!xlu_cfg_get_string (config, "cpus", &buf, 0)) {
+    } else {
         int i;
 
-        b_info->vcpu_hard_affinity =
-            xmalloc(b_info->max_vcpus * sizeof(libxl_bitmap));
-
-        libxl_bitmap_init(&b_info->vcpu_hard_affinity[0]);
-        if (libxl_cpu_bitmap_alloc(ctx,
-                                   &b_info->vcpu_hard_affinity[0], 0)) {
+        libxl_bitmap_init(&vcpu_affinity_array[0]);
+        if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[0], 0)) {
             fprintf(stderr, "Unable to allocate cpumap for vcpu 0\n");
             exit(1);
         }
 
-        if (vcpupin_parse(buf, &b_info->vcpu_hard_affinity[0]))
+        if (vcpupin_parse(buf, &vcpu_affinity_array[0]))
             exit(1);
 
         for (i = 1; i < b_info->max_vcpus; i++) {
-            libxl_bitmap_init(&b_info->vcpu_hard_affinity[i]);
-            if (libxl_cpu_bitmap_alloc(ctx,
-                                       &b_info->vcpu_hard_affinity[i], 0)) {
+            libxl_bitmap_init(&vcpu_affinity_array[i]);
+            if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[i], 0)) {
                 fprintf(stderr, "Unable to allocate cpumap for vcpu %d\n", i);
                 exit(1);
             }
-            libxl_bitmap_copy(ctx, &b_info->vcpu_hard_affinity[i],
-                              &b_info->vcpu_hard_affinity[0]);
+            libxl_bitmap_copy(ctx, &vcpu_affinity_array[i],
+                              &vcpu_affinity_array[0]);
         }
-        b_info->num_vcpu_hard_affinity = b_info->max_vcpus;
 
         libxl_defbool_set(&b_info->numa_placement, false);
     }
@@ -765,9 +774,9 @@ static void parse_config_data(const char *config_source,
     const char *buf;
     long l;
     XLU_Config *config;
-    XLU_ConfigList *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
+    XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
     XLU_ConfigList *ioports, *irqs, *iomem;
-    int num_ioports, num_irqs, num_iomem;
+    int num_ioports, num_irqs, num_iomem, num_cpus;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
     int pci_permissive = 0;
@@ -864,8 +873,15 @@ static void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxvcpus", &l, 0))
         b_info->max_vcpus = l;
 
-    /* Figure out VCPU hard-affinity ("cpus" config option) */
-    parse_vcpu_affinity(config, b_info);
+    buf = NULL;
+    if (!xlu_cfg_get_list (config, "cpus", &cpus, &num_cpus, 1) ||
+        !xlu_cfg_get_string (config, "cpus", &buf, 0))
+        parse_vcpu_affinity(b_info, cpus, buf, num_cpus, /* is_hard */ true);
+
+    buf = NULL;
+    if (!xlu_cfg_get_list (config, "cpus_soft", &cpus, &num_cpus, 1) ||
+        !xlu_cfg_get_string (config, "cpus_soft", &buf, 0))
+        parse_vcpu_affinity(b_info, cpus, buf, num_cpus, false);
 
     if (!xlu_cfg_get_long (config, "memory", &l, 0)) {
         b_info->max_memkb = l * 1024;

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
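
For readers who want to see the per-vcpu pointer selection from the libxl__build_pre() hunk in isolation, here is a minimal standalone sketch. It is not libxl code: demo_bitmap, demo_build_info, demo_n_vcpus() and demo_pick() are hypothetical stand-ins invented for illustration, and the only behaviour modelled is the one visible in the patch, namely iterating up to the longer of the two arrays and passing NULL for whichever mask does not cover vcpu i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl_bitmap: just enough to take addresses. */
typedef struct { unsigned long bits; } demo_bitmap;

/* Hypothetical subset of libxl_domain_build_info used by the loop. */
typedef struct {
    int num_hard;       /* mirrors num_vcpu_hard_affinity */
    demo_bitmap *hard;  /* mirrors vcpu_hard_affinity */
    int num_soft;       /* mirrors num_vcpu_soft_affinity */
    demo_bitmap *soft;  /* mirrors vcpu_soft_affinity */
} demo_build_info;

/* The loop bound is the longer of the two arrays. */
int demo_n_vcpus(const demo_build_info *info)
{
    return info->num_hard > info->num_soft ? info->num_hard : info->num_soft;
}

/*
 * For vcpu i, hand back each mask if its array covers i and NULL otherwise,
 * so that one "set affinity" call can receive both pointers at once.
 */
void demo_pick(const demo_build_info *info, int i,
               const demo_bitmap **hard, const demo_bitmap **soft)
{
    assert(i >= 0 && i < demo_n_vcpus(info));
    *hard = i < info->num_hard ? &info->hard[i] : NULL;
    *soft = i < info->num_soft ? &info->soft[i] : NULL;
}
```

With two hard masks and three soft masks, for instance, the loop would run three times, and on the last iteration only the soft pointer would be non-NULL. This matches the case the patch allows, where one array is shorter than the other.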
