
Re: [Xen-devel] [PATCH 0 of 2] x86/mm: Unsharing ENOMEM handling

At 07:35 -0700 on 15 Mar (1331796917), Andres Lagar-Cavilla wrote:
> > At 11:29 -0400 on 12 Mar (1331551776), Andres Lagar-Cavilla wrote:
> >> These two patches were originally posted on Feb 15th as part of a larger
> >> series.
> >>
> >> They were left to simmer as a discussion on wait queues took precedence.
> >>
> >> Regardless of the ultimate fate of wait queues, these two patches are
> >> necessary
> >> as they solve some bugs on the memory sharing side. When unsharing
> >> fails,
> >> domains would spin forever, hosts would crash, etc.
> >>
> >> The patches also clarify the semantics of unsharing, and comment how
> >> it's
> >> handled.
> >>
> >> Two comments against the Feb 15th series taken care of here:
> >>  - We assert that the unsharing code can only return success or ENOMEM.
> >>  - Acked-by Tim Deegan added to patch #1
> >
> > Applied, thanks.
> >
> > I'm a bit uneasy about the way this increases the amount of boilerplate
> > and p2m-related knowledge that's needed at call sites, but it fixes real
> > problems and I can't see an easy way to avoid it.
> >
> Agreed, completely. Luckily it's all internal to the hypervisor.
> I'm gonna float an idea right now, risking egg-in-the-face again. Our main
> issue is that going to sleep on a wait queue is disallowed in an atomic
> context. For good reason, the vcpu goes to sleep holding locks. Therefore,
> we can't magically hide all the complexity behind get_gfn, and callers
> need to know things they shouldn't.
> However, sleeping only deadlocks if the "waker upper" would need to grab
> any of those locks.

Tempting.  But I don't think it will fly -- in general dom0 tools should
be able to crash and restart without locking up Xen.   And anything that
causes a VCPU to sleep forever with a lock held is likely to do that. 

Also we have to worry about anything that has to happen before the
waker-upper gets to run -- for example, on a single-CPU Xen, any attempt
by any code to get the lock that's held by the sleeper will hang forever
because the waker-upper can't be scheduled. 

We could have some sort of time-out-and-crash-the-domain safety net, I
guess, but part of the reason for wanting wait queues was avoiding
plumbing all those error paths. 

Maybe we could just extend the idea and have the slow path of the
spinlock code dump the caller on a wait queue in the hope that someone
else will sort it out. :)


