Re: [Xen-users] "xl restore" leaks a file descriptor?
On Tue, Aug 11, 2015 at 04:48:13PM +0100, Ian Campbell wrote:
> On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
> > It's the checkpoint file - i.e. the command line argument to xl
> > restore - that is being leaked.
>
> Thanks.
>
> [...]
> > So the checkpoint file is clearly being leaked.
>
> Indeed. I confirmed this even with the current development version using ls
> -l /proc/<pid>/fd which shows an fd open on a deleted file:
>
> # ps aux| grep xl
> root 20465 0.0 0.2 106036 984 ? SLsl 15:42 0:00 xl restore save
> # ls -l /proc/20465/fd
> [...]
> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
> [...]
> # rm /root/save
> # ls -l /proc/20465/fd
> [...]
> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
> [...]
>
> > Its space is not freed
> > until the 'xl restore' process is ended by shutting down the domain:
> [...]
> >
> > It seems like xl restore should close the checkpoint file as soon as
> > it's done restoring the domain, allowing the space to be freed, but
> > that's clearly not happening.
>
> Right. In fact xl sets the file to be close-on-exec right after opening it,
> which is before the daemonisation step, so it ought to be closed
> automatically, but isn't for some reason.
>
> My working theory is that something in the machinery which spawns the save
> helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps by
> unsetting CLOEXEC.
>
> Anyway, thanks for reporting. I've copied the devel list and the 4.6 RM. Wei,
> this probably ought to be a blocker for 4.6 (and the fix ought ultimately
> to be backported to 4.4 onwards at least).
>
> NB: This leak seems to be independent of the switch to migration v2.
>
> Ian.
Maybe this is just because we leak an fd.
I don't see how CLOEXEC would be of any use if xl doesn't actually exec
anything.
Below is a PoC patch which seems to fix the problem for me.
---8<---
commit 7b5f466d5977dc9f41991ca0c2227023ac07709d
Author: Wei Liu <wei.liu2@xxxxxxxxxx>
Date:   Tue Aug 11 18:02:25 2015 +0100

    xl: close restore_fd when we finish with it

    Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 499a05c..525cd24 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2846,6 +2846,10 @@ start:
         ret = libxl_domain_create_new(ctx, &d_config, &domid,
                                       0, autoconnect_console_how);
     }
+
+    if (migrate_fd < 0)
+        close(restore_fd);
+
     if ( ret )
         goto error_out;
>
> > -Andrew
> >
> > On Aug 11, 2015 04:55, "Ian Campbell" <ian.campbell@xxxxxxxxxx> wrote:
> > >
> > > On Fri, 2015-08-07 at 12:50 -0400, Andrew Armenia wrote:
> > > > The issue appears to occur with any state file - not just one in
> > > > particular.
> > >
> > > Please give some specific examples e.g. paths to some of the files to
> > > which a fd has been leaked. I'm trying to determine which state files I
> > > should be investigating, since there are several things which an end user
> > > might consider a "state file".
> > >
> > > Ian.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users