
Re: [Xen-devel] [Xen-users] "xl restore" leaks a file descriptor?



On Wed, Aug 12, 2015 at 09:41:13AM +0100, Ian Campbell wrote:
> On Tue, 2015-08-11 at 18:07 +0100, Wei Liu wrote:
> > On Tue, Aug 11, 2015 at 04:48:13PM +0100, Ian Campbell wrote:
> > > On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
> > > > It's the checkpoint file - i.e. the command line argument to xl
> > > > restore - that is being leaked.
> > > 
> > > Thanks.
> > > 
> > > [...]
> > > > So the checkpoint file is clearly being leaked.
> > > 
> > > Indeed. I confirmed this even with the current development version
> > > using ls -l /proc/<pid>/fd which shows an fd open on a deleted file:
> > > 
> > > # ps aux| grep xl
> > > root     20465  0.0  0.2 106036   984 ?        SLsl 15:42   0:00 xl restore save
> > > # ls -l /proc/20465/fd
> > > [...]
> > > lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
> > > [...]
> > > # rm /root/save
> > > # ls -l /proc/20465/fd
> > > [...]
> > > lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
> > > [...]
> > > 
> > > >  Its space is not freed
> > > > until the 'xl restore' process is ended by shutting down the domain:
> > > [...]
> > > > 
> > > > It seems like xl restore should close the checkpoint file as soon as
> > > > it's done restoring the domain, allowing the space to be freed, but
> > > > that's clearly not happening.
> > > 
> > > Right. In fact xl sets the file to be close-on-exec right after opening
> > > it, which is before the daemonisation step, so it ought to be closed
> > > automatically, but isn't for some reason.
> > > 
> > > My working theory is that something in the machinery which spawns the
> > > save helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps
> > > by unsetting CLOEXEC.
> > > 
> > > Anyway, thanks for reporting. I've copied the devel list and the 4.6 RM.
> > > Wei, this probably ought to be a blocker for 4.6 (and the fix ought
> > > ultimately to be backported to 4.4 onwards at least).
> > > 
> > > NB: This leak seems to be independent of the switch to migration v2.
> > > 
> > > Ian.
> > 
> > Maybe this is just because we leak a fd.
> > 
> > I don't see how CLOEXEC would be of any use if xl doesn't actually exec
> > anything.
> 
> Duh, for some reason I thought daemonize would activate the CLOEXEC, but
> it's just fork without exec. Silly me.
> 
> > 
> > Below is a PoC patch which seems to fix the problem for me.
> > 
> > ---8<---
> > commit 7b5f466d5977dc9f41991ca0c2227023ac07709d
> > Author: Wei Liu <wei.liu2@xxxxxxxxxx>
> > Date:   Tue Aug 11 18:02:25 2015 +0100
> > 
> >     xl: close restore_fd when we finish with it
> >     
> >     Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> > 
> > diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> > index 499a05c..525cd24 100644
> > --- a/tools/libxl/xl_cmdimpl.c
> > +++ b/tools/libxl/xl_cmdimpl.c
> > @@ -2846,6 +2846,10 @@ start:
> >          ret = libxl_domain_create_new(ctx, &d_config, &domid,
> >                                        0, autoconnect_console_how);
> >      }
> > +
> > +    if (migrate_fd < 0)
> > +        close(restore_fd);
> 
> As Andy says I think we want restore_fd in the check, I can't see any
> reason we wouldn't want to close the socket too.
> 

Do you mean migrate_fd when you say "socket"? I tried that, but it led
to a failure because the toolstack still needs to read control information
from it (the "GO" message).

Maybe I closed it too early. I will have a closer look today.

> For reboot handling you would need to reset the fd to < 0, otherwise when we
> come back around on reboot we will close this again.
> 
> Would it be less error prone to put this in the if (restoring) just above,
> i.e. exactly where restore_fd is used and which already has the reboot
> logic in place with restoring = 0?
> 

That depends on whether we can close migrate_fd.

Wei.

> Ian.
