[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [libvirt] [PATCH V2] libxl: fix dom0 balloon logic



On Tue, Mar 24, 2015 at 02:43:42PM -0600, Jim Fehlig wrote:
Recent testing on large memory systems revealed a bug in the Xen xl
tool's freemem() function.  When autoballooning is enabled, freemem()
is used to ensure enough memory is available to start a domain,
ballooning dom0 if necessary.  When ballooning large amounts of memory
from dom0, freemem() would exceed its self-imposed wait time and
return an error.  Meanwhile, dom0 continued to balloon.  Starting the
domain later, after sufficient memory was ballooned from dom0, would
succeed.  The libvirt implementation in libxlDomainFreeMem() suffers
the same bug since it is modeled after freemem().

In the end, the best place to fix the bug on the Xen side was to
slightly change the behavior of libxl_wait_for_memory_target().
Instead of failing after caller-provided wait_sec, the function now
blocks as long as dom0 memory ballooning is progressing.  It will return
failure only when more memory is needed to reach the target and wait_sec
have expired with no progress being made.  See xen.git commit fd3aa246.
There was a dicussion on how this would affect other libxl apps like
libvirt

http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html

If libvirt containing this patch was build against a Xen containing
the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem()
will fail after 30 sec and domain creation will be terminated.
Without this patch and with old libxl_wait_for_memory_target() behavior,
libxlDomainFreeMem() does not succeed after 30 sec, but returns success
anyway.  Domain creation continues resulting in all sorts of fun stuff
like cpu soft lockups in the guest OS.  It was decided to properly fix
libxl_wait_for_memory_target(), and if anything improve the default
behavior of apps using the freemem reference impl in xl.

xl was patched to accommodate the change in libxl_wait_for_memory_target()
with xen.git commit 883b30a0.  This patch does the same in the libxl
driver.  While at it, I changed the logic to essentially match
freemem() in $xensrc/tools/libxl/xl_cmdimpl.c.  It was a bit cleaner
IMO and will make it easier to spot future, potentially interesting
divergences.

Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx>
---

V2: Actually use libxl_wait_for_memory_target(), instead of
   libxl_wait_for_free_memory()

src/libxl/libxl_domain.c | 55 +++++++++++++++++++++++-------------------------
1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index 407a9bd..a1739aa 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -1121,38 +1121,39 @@ libxlDomainFreeMem(libxlDomainObjPrivatePtr priv, 
libxl_domain_config *d_config)
{
    uint32_t needed_mem;
    uint32_t free_mem;
-    size_t i;
-    int ret = -1;
+    int ret;

This variable is unnecessary and confusing since you don't return it.
And without this, you can make ...

    int tries = 3;
    int wait_secs = 10;

-    if ((ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info,
-                                        &needed_mem)) >= 0) {
-        for (i = 0; i < tries; ++i) {
-            if ((ret = libxl_get_free_memory(priv->ctx, &free_mem)) < 0)
-                break;
+    ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info, &needed_mem);
+    if (ret < 0)
+        goto error;


... these conditions smaller, e.g.:

if (libxl_domain_need_memory(priv->ctx, &d_config->b_info, &needed_mem) < 0)
    goto error;

Do you need this for the release? (sorry for noticing that late)

Attachment: pgpcUUHVOHksw.pgp
Description: PGP signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.