Re: [Xen-users] "xl restore" leaks a file descriptor?
On Tue, Aug 11, 2015 at 04:48:13PM +0100, Ian Campbell wrote:
> On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
> > It's the checkpoint file - i.e. the command line argument to xl
> > restore - that is being leaked.
>
> Thanks.
>
> [...]
> > So the checkpoint file is clearly being leaked.
>
> Indeed. I confirmed this even with the current development version using ls
> -l /proc/<pid>/fd which shows an fd open on a deleted file:
>
> # ps aux| grep xl
> root 20465 0.0 0.2 106036 984 ? SLsl 15:42 0:00 xl restore save
> # ls -l /proc/20465/fd
> [...]
> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
> [...]
> # rm /root/save
> # ls -l /proc/20465/fd
> [...]
> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
> [...]
>
> > Its space is not freed
> > until the 'xl restore' process is ended by shutting down the domain:
> [...]
> >
> > It seems like xl restore should close the checkpoint file as soon as
> > it's done restoring the domain, allowing the space to be freed, but
> > that's clearly not happening.
>
> Right. In fact xl sets the file to be close-on-exec right after opening it,
> which is before the daemonisation step, so it ought to be closed
> automatically, but isn't for some reason.
>
> My working theory is that something in the machinery which spawns the save
> helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps by
> unsetting CLOEXEC.
>
> Any way, thanks for reporting. I've copied the devel list and 4.6 RM. Wei
> this probably ought to be a blocker for 4.6 (and the fix ought ultimately
> to be backported to 4.4 onwards at least).
>
> NB: This leak seems to be independent of the switch to migration v2.
>
> Ian.

Maybe this is just because we leak a fd. I don't see how CLOEXEC would be
of any use if xl doesn't actually exec anything.

Below is a PoC patch which seems to fix the problem for me.
---8<---
commit 7b5f466d5977dc9f41991ca0c2227023ac07709d
Author: Wei Liu <wei.liu2@xxxxxxxxxx>
Date:   Tue Aug 11 18:02:25 2015 +0100

    xl: close restore_fd when we finish with it

    Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 499a05c..525cd24 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2846,6 +2846,10 @@ start:
         ret = libxl_domain_create_new(ctx, &d_config, &domid,
                                       0, autoconnect_console_how);
     }
+
+    if (migrate_fd < 0)
+        close(restore_fd);
+
     if ( ret )
         goto error_out;
 

>
> > -Andrew
> >
> > On Aug 11, 2015 04:55, "Ian Campbell" <ian.campbell@xxxxxxxxxx> wrote:
> > >
> > > On Fri, 2015-08-07 at 12:50 -0400, Andrew Armenia wrote:
> > > > The issue appears to occur with any state file - not just one in
> > > > particular.
> > >
> > > Please give some specific examples e.g. paths to some of the files to
> > > which a fd has been leaked. I'm trying to determine which state files
> > > I should be investigating, since there are several things which an end
> > > user might consider a "state file".
> > >
> > > Ian.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users
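
For anyone who wants to reproduce the behaviour discussed in this thread outside of xl, here is a minimal standalone sketch. It is not xl or libxl code: the helper names (open_unlinked_cloexec, fd_survives_fork, link_count) and the scratch path under /tmp are made up for illustration. It assumes a POSIX system and shows two things from the discussion: a descriptor marked FD_CLOEXEC is still inherited across a plain fork() with no exec, and an unlinked file stays allocated (link count 0, fd still valid) until the descriptor is explicitly closed.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from xl): create a scratch file, reopen it
 * read-only, mark the descriptor close-on-exec, then unlink the file.
 * Returns the still-open descriptor, or -1 on error.
 */
int open_unlinked_cloexec(void)
{
    char path[] = "/tmp/fd-leak-demo-XXXXXX";
    int wfd = mkstemp(path);
    if (wfd < 0)
        return -1;
    if (write(wfd, "checkpoint", 10) != 10) {
        close(wfd);
        unlink(path);
        return -1;
    }
    close(wfd);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        unlink(path);
        return -1;
    }
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    /* Same situation as "rm /root/save" while the fd is open. */
    unlink(path);
    return fd;
}

/*
 * Fork WITHOUT exec and check whether fd is still usable in the child.
 * Returns 1 if the child can fstat() the descriptor, 0 if not, -1 on error.
 */
int fd_survives_fork(int fd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct stat st;
        _exit(fstat(fd, &st) == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}

/* Link count of the file behind fd, or -1 if fd is not open. */
long link_count(int fd)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}
```

Calling open_unlinked_cloexec() and then fd_survives_fork() on the result should report that the descriptor is still open in the forked child even though FD_CLOEXEC is set, since the flag only takes effect on exec. link_count() returning 0 corresponds to the "(deleted)" entry seen under /proc/<pid>/fd; only an explicit close(), as in the PoC patch, releases the file.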