[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v13 4/5] libxl/xl: make it possible to specify soft-affinity in domain config file



To do so, we add the vcpu_soft_affinity array to build_info, and
treat it much like vcpu_hard_affinity. The new config option is
called "cpus_soft".

Note that the vcpu_hard_affinity array, introduced in a previous
patch, and the vcpu_soft_affinity array, introduced here, share
the same LIBXL_HAVE_xxx macro, in libxl.h. That is called
LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS, and was introduced
together with vcpu_hard_affinity, but only inside a comment.
In this change, we uncomment, and hence properly define it.

In order to avoid having to issue separate calls to
libxl_set_vcpuaffinity() (one for hard affinity and one for soft
affinity) in libxl__build_pre(), in case the caller uses
b_info->cpumap (for the former) and b_info->vcpu_soft_affinity (for
the latter), we also set (again!) a new default for b_info->cpumap.
This allows, from this change on, to always and only deal with
b_info->vcpu_hard_affinity all around libxl internals.

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
---
Changes from v10:
 * default for b_info->cpumap changed again. Since we are not
   formally deprecating it, it is not just a corner case that one
   would want to specify hard affinity via cpumap, and soft affinity
   via the vcpu_soft_affinity array. That would require two distinct
   calls to libxl_set_vcpuaffinity(), which is something we rather
   avoid. To do so, if cpumap is used, convert it to vcpu_hard_affinity
   in *_build_info_setdefault() and then forget about it!
   This solution came up already as a possibility during v9's review,
   but at the time I thought it were not worth, as we were deprecating
   cpumap anyway. Since we're not, that is now the best solution IMO.

Changes from v9:
 * patch reworked again, due to changes in the preceding ones
   in the series. The structure is similar, it's still based
   on adding some indirection, so that the same code can be
   used to pars and enact both hard and soft affinity, but
   the code did change, I'm afraid.

Changes from v8:
 * fix a type in the LIBXL_HAVE_xxx macro name.

Changes from v7:
 * WARNING: this patch underwent quite a fundamental rework,
   given it's now building on top of Wei's "push vcpu affinity
   to libxl" patch. That's why I think it should be re-reviewed
   almost from scratch (sorry! :-P), and that's why I did not
   add IanC's ack, although he provided it to the v7 version
   of it.

Changes from v6:
 * update and improve the changelog.

Changes from v4:
 * fix typos and rephrase docs, as suggested during review;
 * more refactoring, i.e., more addressing factor of potential
   common code, as requested during review.

Changes from v3:
 * fix typos and language issues in docs and comments, as
   suggested during review;
 * common code to soft and hard affinity parsing factored
   together, as requested uring review.

Changes from v2:
 * use the new libxl API. Although the implementation changed
   only a little bit, I removed IanJ's Acked-by, although I am
   here saying that he did provided it, as requested.
---
 docs/man/xl.cfg.pod.5       |   23 ++++++++++--
 tools/libxl/libxl.h         |    3 +-
 tools/libxl/libxl_create.c  |   15 ++++++++
 tools/libxl/libxl_dom.c     |   33 ++++++++++++-----
 tools/libxl/libxl_types.idl |    1 +
 tools/libxl/xl_cmdimpl.c    |   84 ++++++++++++++++++++++++++-----------------
 6 files changed, 111 insertions(+), 48 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index ffd94a8..5833054 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -156,19 +156,36 @@ for each element of the list.
 =back
 
 If this option is not specified, no vcpu to cpu pinning is established,
-and the vcpus of the guest can run on all the cpus of the host.
+and the vcpus of the guest can run on all the cpus of the host. If this
+option is specified, the intersection of the vcpu pinning mask, provided
+here, and the soft affinity mask, provided via B<cpus\_soft=> (if any),
+is utilized to compute the domain node-affinity, for driving memory
+allocations.
 
 If we are on a NUMA machine (i.e., if the host has more than one NUMA
 node) and this option is not specified, libxl automatically tries to
 place the guest on the least possible number of nodes. That, however,
 will not affect vcpu pinning, so the guest will still be able to run on
-all the cpus, it will just prefer the ones from the node it has been
-placed on. A heuristic approach is used for choosing the best node (or
+all the cpus. A heuristic approach is used for choosing the best node (or
 set of nodes), with the goals of maximizing performance for the guest
 and, at the same time, achieving efficient utilization of host cpus
 and memory. See F<docs/misc/xl-numa-placement.markdown> for more
 details.
 
+=item B<cpus_soft="CPU-LIST">
+
+Exactly as B<cpus=>, but specifies soft affinity, rather than pinning
+(hard affinity). When using the credit scheduler, this means what cpus
+the vcpus of the domain prefer.
+
+A C<CPU-LIST> is specified exactly as above, for B<cpus=>.
+
+If this option is not specified, the vcpus of the guest will not have
+any preference regarding on what cpu to run. If this option is specified,
+the intersection of the soft affinity mask, provided here, and the vcpu
+pinning, provided via B<cpus=> (if any), is utilized to compute the
+domain node-affinity, for driving memory allocations.
+
 =back
 
 =head3 CPU Scheduling
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5ae6532..bfeb3bc 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -380,8 +380,7 @@ typedef struct libxl__ctx libxl_ctx;
  * Each bitmap should be big enough to accommodate the maximum number of
  * PCPUs of the host.
  */
-/* to be uncommented when soft array added */
-/* #define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1 */
+#define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1
 
 /*
  * LIBXL_HAVE_BUILDINFO_USBDEVICE_LIST
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 0686f96..13992b4 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -187,6 +187,21 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     } else if (b_info->avail_vcpus.size > HVM_MAX_VCPUS)
         return ERROR_FAIL;
 
+    /* In libxl internals, we want to deal with vcpu_hard_affinity only! */
+    if (b_info->cpumap.size && !b_info->num_vcpu_hard_affinity) {
+        int i;
+
+        b_info->vcpu_hard_affinity = libxl__calloc(gc, b_info->max_vcpus,
+                                                   sizeof(libxl_bitmap));
+        for (i = 0; i < b_info->max_vcpus; i++) {
+            if (libxl_cpu_bitmap_alloc(CTX, &b_info->vcpu_hard_affinity[i], 0))
+                return ERROR_FAIL;
+            libxl_bitmap_copy(CTX, &b_info->vcpu_hard_affinity[i],
+                              &b_info->cpumap);
+        }
+        b_info->num_vcpu_hard_affinity = b_info->max_vcpus;
+    }
+
     libxl_defbool_setdefault(&b_info->numa_placement, true);
 
     if (b_info->max_memkb == LIBXL_MEMKB_DEFAULT)
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 83eb29a..cfbd13d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -272,21 +272,36 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     if (info->nodemap.size)
         libxl_domain_set_nodeaffinity(ctx, domid, &info->nodemap);
     /* As mentioned in libxl.h, vcpu_hard_array takes precedence */
-    if (info->num_vcpu_hard_affinity) {
-        int i;
+    if (info->num_vcpu_hard_affinity || info->num_vcpu_soft_affinity) {
+        libxl_bitmap *hard_affinity, *soft_affinity;
+        int i, n_vcpus;
+
+        n_vcpus = info->num_vcpu_hard_affinity > info->num_vcpu_soft_affinity ?
+            info->num_vcpu_hard_affinity : info->num_vcpu_soft_affinity;
+
+        for (i = 0; i < n_vcpus; i++) {
+            /*
+             * Prepare hard and soft affinity pointers in a way that allows
+             * us to issue only one call to libxl_set_vcpuaffinity(), setting,
+             * for each vcpu, both hard and soft affinity "atomically".
+             */
+            hard_affinity = NULL;
+            if (info->num_vcpu_hard_affinity &&
+                i < info->num_vcpu_hard_affinity)
+                hard_affinity = &info->vcpu_hard_affinity[i];
+
+            soft_affinity = NULL;
+            if (info->num_vcpu_soft_affinity &&
+                i < info->num_vcpu_soft_affinity)
+                soft_affinity = &info->vcpu_soft_affinity[i];
 
-        for (i = 0; i < info->num_vcpu_hard_affinity; i++) {
             if (libxl_set_vcpuaffinity(ctx, domid, i,
-                                       &info->vcpu_hard_affinity[i],
-                                       NULL)) {
+                                       hard_affinity, soft_affinity)) {
                 LOG(ERROR, "setting affinity failed on vcpu `%d'", i);
                 return ERROR_FAIL;
             }
         }
-    } else if (info->cpumap.size)
-        libxl_set_vcpuaffinity_all(ctx, domid, info->max_vcpus,
-                                   &info->cpumap, NULL);
-
+    }
 
     if (xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
         LIBXL_MAXMEM_CONSTANT) < 0) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a412f9c..0b3496f 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -319,6 +319,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("cpumap",          libxl_bitmap),
     ("nodemap",         libxl_bitmap),
     ("vcpu_hard_affinity", Array(libxl_bitmap, "num_vcpu_hard_affinity")),
+    ("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
     ("numa_placement",  libxl_defbool),
     ("tsc_mode",        libxl_tsc_mode),
     ("max_memkb",       MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 6b91f76..f1c136a 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -691,67 +691,76 @@ static void parse_top_level_sdl_options(XLU_Config 
*config,
     xlu_cfg_replace_string (config, "xauthority", &sdl->xauthority, 0);
 }
 
-static void parse_vcpu_affinity(XLU_Config *config,
-                                libxl_domain_build_info *b_info)
+static void parse_vcpu_affinity(libxl_domain_build_info *b_info,
+                                XLU_ConfigList *cpus, const char *buf,
+                                int num_cpus, bool is_hard)
 {
-    XLU_ConfigList *cpus;
-    const char *buf;
-    int num_cpus;
+    libxl_bitmap *vcpu_affinity_array;
 
-    if (!xlu_cfg_get_list (config, "cpus", &cpus, &num_cpus, 1)) {
-        int j = 0;
+    /*
+     * If we are here, and buf is !NULL, we're dealing with a string. What
+     * we do in this case is parse it, and copy the result in _all_ (up to
+     * b_info->max_vcpus) the elements of the vcpu affinity array.
+     *
+     * If buf is NULL, we have a list, and what we do is putting in the
+     * i-eth element of the vcpu affinity array the result of the parsing
+     * of the i-eth entry of the list. If there are more vcpus than
+     * entries, it is fine to just not touch the last array elements.
+     */
 
-        /* Silently ignore values corresponding to non existing vcpus */
-        if (num_cpus > b_info->max_vcpus)
-            num_cpus = b_info->max_vcpus;
+    /* Silently ignore values corresponding to non existing vcpus */
+    if (buf || num_cpus > b_info->max_vcpus)
+        num_cpus = b_info->max_vcpus;
 
+    if (is_hard) {
+        b_info->num_vcpu_hard_affinity = num_cpus;
         b_info->vcpu_hard_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_hard_affinity;
+    } else {
+        b_info->num_vcpu_soft_affinity = num_cpus;
+        b_info->vcpu_soft_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_soft_affinity;
+    }
+
+    if (!buf) {
+        int j = 0;
 
         while ((buf = xlu_cfg_get_listitem(cpus, j)) != NULL && j < num_cpus) {
-            libxl_bitmap_init(&b_info->vcpu_hard_affinity[j]);
-            if (libxl_cpu_bitmap_alloc(ctx,
-                                       &b_info->vcpu_hard_affinity[j], 0)) {
+            libxl_bitmap_init(&vcpu_affinity_array[j]);
+            if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[j], 0)) {
                 fprintf(stderr, "Unable to allocate cpumap for vcpu %d\n", j);
                 exit(1);
             }
 
-            if (vcpupin_parse(buf, &b_info->vcpu_hard_affinity[j]))
+            if (vcpupin_parse(buf, &vcpu_affinity_array[j]))
                 exit(1);
 
             j++;
         }
-        b_info->num_vcpu_hard_affinity = num_cpus;
 
         /* We have a list of cpumaps, disable automatic placement */
         libxl_defbool_set(&b_info->numa_placement, false);
-    }
-    else if (!xlu_cfg_get_string (config, "cpus", &buf, 0)) {
+    } else {
         int i;
 
-        b_info->vcpu_hard_affinity =
-            xmalloc(b_info->max_vcpus * sizeof(libxl_bitmap));
-
-        libxl_bitmap_init(&b_info->vcpu_hard_affinity[0]);
-        if (libxl_cpu_bitmap_alloc(ctx,
-                                   &b_info->vcpu_hard_affinity[0], 0)) {
+        libxl_bitmap_init(&vcpu_affinity_array[0]);
+        if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[0], 0)) {
             fprintf(stderr, "Unable to allocate cpumap for vcpu 0\n");
             exit(1);
         }
 
-        if (vcpupin_parse(buf, &b_info->vcpu_hard_affinity[0]))
+        if (vcpupin_parse(buf, &vcpu_affinity_array[0]))
             exit(1);
 
         for (i = 1; i < b_info->max_vcpus; i++) {
-            libxl_bitmap_init(&b_info->vcpu_hard_affinity[i]);
-            if (libxl_cpu_bitmap_alloc(ctx,
-                                       &b_info->vcpu_hard_affinity[i], 0)) {
+            libxl_bitmap_init(&vcpu_affinity_array[i]);
+            if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[i], 0)) {
                 fprintf(stderr, "Unable to allocate cpumap for vcpu %d\n", i);
                 exit(1);
             }
-            libxl_bitmap_copy(ctx, &b_info->vcpu_hard_affinity[i],
-                              &b_info->vcpu_hard_affinity[0]);
+            libxl_bitmap_copy(ctx, &vcpu_affinity_array[i],
+                              &vcpu_affinity_array[0]);
         }
-        b_info->num_vcpu_hard_affinity = b_info->max_vcpus;
 
         libxl_defbool_set(&b_info->numa_placement, false);
     }
@@ -765,9 +774,9 @@ static void parse_config_data(const char *config_source,
     const char *buf;
     long l;
     XLU_Config *config;
-    XLU_ConfigList *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
+    XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
     XLU_ConfigList *ioports, *irqs, *iomem;
-    int num_ioports, num_irqs, num_iomem;
+    int num_ioports, num_irqs, num_iomem, num_cpus;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
     int pci_permissive = 0;
@@ -864,8 +873,15 @@ static void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxvcpus", &l, 0))
         b_info->max_vcpus = l;
 
-    /* Figure out VCPU hard-affinity ("cpus" config option) */
-    parse_vcpu_affinity(config, b_info);
+    buf = NULL;
+    if (!xlu_cfg_get_list (config, "cpus", &cpus, &num_cpus, 1) ||
+        !xlu_cfg_get_string (config, "cpus", &buf, 0))
+        parse_vcpu_affinity(b_info, cpus, buf, num_cpus, /* is_hard */ true);
+
+    buf = NULL;
+    if (!xlu_cfg_get_list (config, "cpus_soft", &cpus, &num_cpus, 1) ||
+        !xlu_cfg_get_string (config, "cpus_soft", &buf, 0))
+        parse_vcpu_affinity(b_info, cpus, buf, num_cpus, false);
 
     if (!xlu_cfg_get_long (config, "memory", &l, 0)) {
         b_info->max_memkb = l * 1024;


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.