[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.5 v3] Xen and tools: Fix listing of vcpus when domains lacking any vcpus exist



On 10/21/14 12:46, Andrew Cooper wrote:
On a system which looks like this:

[root@st04 ~]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   752     4     r-----   46699.3
(null)                                       1     0     0     --p---       0.0
(null)                                       2     0     0     --p---       0.0
(null)                                       3     0     0     --p---       0.0
badger                                      25     0     1     --p---       0.0

`xl vcpu-list` failes as so:

[root@st04 ~]# xl vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                             0     0    0   -b-   12171.0  all
Domain-0                             0     1    1   -b-   11779.6  all
Domain-0                             0     2    2   -b-   11599.0  all
Domain-0                             0     3    3   r--   11007.0  all
libxl: critical: libxl__calloc: libxl: FATAL ERROR: memory allocation failure 
(libxl__calloc, 4294935299 x 40)
: Cannot allocate memory
libxl: FATAL ERROR: memory allocation failure (libxl__calloc, 4294935299 x 40)

The root cause of this is in Xen.  getdomaininfo() has no way of expressing
"this domain has no vcpus".  Previously, info->max_vcpu_id would be returned
uninitialised in such a case.

Unfortunately, setting it to 0 as a default is not appropriate.  A max_vcpu_id
of 0 and nr_online_cpus of 0 is the valid state for a single vcpu domain which
is in the process of being destroyed.

As all components are required to add 1 to max_vcpu_id to get the number of
vcpus, an id of ~0U is not valid to be used.  Explicitly define this as an
invalid max vcpu value, and use it to express "no vcpus" in getdomaininfo()

In libxl, the issue is seen as libxl_list_vcpu() attempts to use the
uninitialised domaininfo.max_vcpu_id for memory allocation.

Check domaininfo.max_vcpu_id against the new sentinel value
XEN_INVALID_MAX_VCPU_ID, and return early.  This means that it is now valid
for libxl_list_vcpu() to return NULL for a domain which lacks any vcpus.

As part of this change, remove the pointless call to libxl_get_max_cpus(),
whose returned value is unconditionally clobbered in the for() loop.

Reported-by: Euan Harris <euan.harris@xxxxxxxxxx>
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
CC: Keir Fraser <keir@xxxxxxx>
CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
CC: Wei Liu <wei.liu2@xxxxxxxxxx>

---
v3:
  * Merge the two patches together, as the first turns a likely toolstack crash
    into an almost-certain one.
  * Don't bother setting nr_cpus_out.
v2:
  * s/nr_online_cpus/max_vcpu_id/g - I accidentally sent an unrefreshed patch

Looks good to me.

Reviewed-by: Don Slutz <dslutz@xxxxxxxxxxx>

   -Don Slutz


---
  tools/libxl/libxl.c         |    7 ++++++-
  tools/libxl/xl_cmdimpl.c    |    4 +---
  xen/common/domctl.c         |    1 +
  xen/include/public/domctl.h |    1 +
  4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f27b581..ee127d8 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5246,7 +5246,12 @@ libxl_vcpuinfo *libxl_list_vcpu(libxl_ctx *ctx, uint32_t 
domid,
          GC_FREE;
          return NULL;
      }
-    *nr_cpus_out = libxl_get_max_cpus(ctx);
+
+    if (domaininfo.max_vcpu_id == XEN_INVALID_MAX_VCPU_ID) {
+        GC_FREE;
+        return NULL;
+    }
+
      ret = ptr = libxl__calloc(NOGC, domaininfo.max_vcpu_id + 1,
                                sizeof(libxl_vcpuinfo));
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index ea43761..0882644 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4813,10 +4813,8 @@ static void print_domain_vcpuinfo(uint32_t domid, 
uint32_t nr_cpus)
vcpuinfo = libxl_list_vcpu(ctx, domid, &nb_vcpu, &nrcpus); - if (!vcpuinfo) {
-        fprintf(stderr, "libxl_list_vcpu failed.\n");
+    if (!vcpuinfo)
          return;
-    }
for (i = 0; i < nb_vcpu; i++) {
          print_vcpuinfo(domid, &vcpuinfo[i], nr_cpus);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index b6f9708..d9c2635 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -156,6 +156,7 @@ void getdomaininfo(struct domain *d, struct 
xen_domctl_getdomaininfo *info)
      struct vcpu_runstate_info runstate;
info->domain = d->domain_id;
+    info->max_vcpu_id = XEN_INVALID_MAX_VCPU_ID;
      info->nr_online_vcpus = 0;
      info->ssidref = 0;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index f519524..58b19e7 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -108,6 +108,7 @@ struct xen_domctl_getdomaininfo {
      uint64_aligned_t shared_info_frame; /* GMFN of shared_info struct */
      uint64_aligned_t cpu_time;
      uint32_t nr_online_vcpus;    /* Number of VCPUs currently online. */
+#define XEN_INVALID_MAX_VCPU_ID (~0U) /* Domain has no vcpus? */
      uint32_t max_vcpu_id;        /* Maximum VCPUID in use by this domain. */
      uint32_t ssidref;
      xen_domain_handle_t handle;


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.