Re: [Xen-devel] [PATCH 1/3] libxl: Fix libxl_postfork_child_noexec deadlock etc.
On Mon, 2014-02-24 at 14:19 +0000, Ian Jackson wrote:
> libxl_postfork_child_noexec would nestedly reaquire the non-recursive
> "no_forking" mutex: atfork_lock uses it, as does sigchld_user_remove.
> The result on Linux is that the process always deadlocks before
> returning from this function.
>
> This is used by xl's console child.  So, the ultimate effect is that
> xl with pygrub does not manage to connect to the pygrub console.
> This beahviour was reported by Michael Young in Xen 4.4.0 RC5.

"behaviour".

Michael reported this earlier on -rc2 as well but it fell through the
cracks because I failed to properly appreciate the severity. Sorry.

> Also, the use of sigchld_user_remove in libxl_postfork_child_noexec is
> not correct with SIGCHLD sharing.  libxl_postfork_child_noexec is
> documented to suffice if called only on one ctx.  So deregistering the
> ctx it's called on is not sufficient.  Instead, we need a new approach
> which discards the whole sigchld_user list and unconditionally removes
> our SIGCHLD handler if we had one.
>
> Prompted by this, clarify the semantics of
> libxl_postfork_child_noexec.  Specifically, expand on the meaning of
> "quickly" by explaining what operations are not permitted; and
> document the fact that the function doesn't reclaim the resources in
> the ctxs.
>
> And add a comment in libxl_postfork_child_noexec explaining the
> internal concurrency situation.
>
> This is an important bugfix.  IMO the bug is a blocker for Xen 4.4.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> Reported-by: M A Young <m.a.young@xxxxxxxxxxxx>

Acked-by: Ian Campbell <Ian.Campbell@xxxxxxxxxx>

> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> ---
>  tools/libxl/libxl_event.h |   16 ++++++++++++++++
>  tools/libxl/libxl_fork.c  |   44 +++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 59 insertions(+), 1 deletion(-)

Impressive considering the real meat is -1/+6 ;-) Not that I'm going to
complain about lots of docs!
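For anyone reading this in the archive, the failure mode described in the
first paragraph of the commit message can be reproduced outside libxl. The
following is only a minimal sketch and is not libxl code: every demo_* name
is made up for illustration. It also deliberately uses an error-checking
mutex (PTHREAD_MUTEX_ERRORCHECK), which is an assumption of the sketch and
not what libxl does, so that the nested lock attempt reports EDEADLK instead
of hanging the way the commit message says it does on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the "no_forking" mutex; not libxl code. */
static pthread_mutex_t demo_no_forking;

/* Set up the demo mutex as an error-checking mutex, so that relocking
 * it from the owning thread returns EDEADLK rather than blocking. */
static int demo_init(void)
{
    pthread_mutexattr_t attr;
    int r;

    r = pthread_mutexattr_init(&attr);
    if (r) return r;
    r = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!r) r = pthread_mutex_init(&demo_no_forking, &attr);
    pthread_mutexattr_destroy(&attr);
    return r;
}

/* Models a helper which takes the lock itself, as the commit message
 * says sigchld_user_remove does.  Returns the lock attempt's result. */
static int demo_inner_helper(void)
{
    int r = pthread_mutex_lock(&demo_no_forking);
    if (r) return r;
    pthread_mutex_unlock(&demo_no_forking);
    return 0;
}

/* Models the outer function: take the lock, then call a helper which
 * tries to take the same lock again while it is still held. */
static int demo_nested_relock(void)
{
    int r = pthread_mutex_lock(&demo_no_forking);
    assert(!r);
    r = demo_inner_helper();   /* the nested acquisition happens here */
    pthread_mutex_unlock(&demo_no_forking);
    return r;
}
```

Calling demo_init() and then demo_nested_relock() from a small main() (built
with -pthread) makes the nested acquisition visible as an error code. With a
plain non-recursive mutex the same call sequence would not return, which
matches the deadlock described above.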
>
> @@ -134,7 +150,33 @@ void libxl_postfork_child_noexec(libxl_ctx *ctx)
>      }
>      LIBXL_LIST_INIT(&carefds);
>
> -    sigchld_user_remove(ctx);
> +    if (sigchld_installed) {
> +        defer_sigchld();
> +
> +        LIBXL_LIST_INIT(&sigchld_users);
> +        /* After this the ->sigchld_user_registered entries in the
> +         * now-obsolete contexts may be lies.  But that's OK because
> +         * no-one will look at them. */
> +
> +        release_sigchld();
> +        sigchld_removehandler_core();
> +    }
>
>      atfork_unlock();
>  }
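The shape of the new code in the hunk above (hold off SIGCHLD, throw away the
whole list of registered users, then unconditionally take the handler out)
can be sketched with plain POSIX calls. This is a rough illustration under
stated assumptions, not the libxl implementation: all demo_* names are
hypothetical, and it assumes that defer_sigchld/release_sigchld bracket a
region in which SIGCHLD is not delivered, which this hunk does not show.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry of contexts sharing one SIGCHLD handler. */
struct demo_user { struct demo_user *next; };
static struct demo_user demo_first_user;
static struct demo_user *demo_users;
static bool demo_installed;
static struct sigaction demo_saved;   /* disposition before we installed */

static void demo_handler(int sig) { (void)sig; }

/* Install the shared handler and register one (static) user. */
static int demo_install(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, &demo_saved)) return -1;
    demo_installed = true;
    demo_first_user.next = demo_users;
    demo_users = &demo_first_user;
    return 0;
}

/* True if our handler is the current SIGCHLD disposition. */
static bool demo_handler_active(void)
{
    struct sigaction cur;
    if (sigaction(SIGCHLD, NULL, &cur)) return false;
    return cur.sa_handler == demo_handler;
}

/* Discard every registered user and remove the handler, in the same
 * order as the hunk: block, reset list, unblock, restore disposition.
 * A threaded program would use pthread_sigmask instead of sigprocmask. */
static int demo_discard_all(void)
{
    sigset_t chld, old;

    if (!demo_installed) return 0;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld, &old)) return -1;
    demo_users = NULL;                 /* forget all users at once */
    if (sigprocmask(SIG_SETMASK, &old, NULL)) return -1;
    if (sigaction(SIGCHLD, &demo_saved, NULL)) return -1;
    demo_installed = false;
    return 0;
}
```

The point of the sketch is the one made in the commit message: the teardown
does not walk the list deregistering a single ctx; it resets the whole list
and restores the previous disposition regardless of how many users there were.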