
Re: [Xen-devel] Wait Queues

On Nov 8, 2012, at 2:42 AM, Keir Fraser <keir.xen@xxxxxxxxx> wrote:

> On 08/11/2012 03:22, "Andres Lagar-Cavilla" <andreslc@xxxxxxxxxxxxxx> wrote:
>>> I'd like to propose an approach that ensures that, as long as some
>>> properties are met, arbitrary wait queue sleep is allowed. Here are the
>>> properties:
>>> 1. Third parties servicing a wait queue sleep are indeed third parties. In
>>> other words, dom0 does not do paging.
>>> 2. Vcpus of a wait queue servicing domain may never go to sleep on a wait
>>> queue during a foreign map.
>>> 3. A guest vcpu may go to sleep on a wait queue holding any kind of lock, as
>>> long as it does not hold the p2m lock.
>> N.B: I understand (now) this may cause any other vcpu contending on a lock
>> held by the wait queue sleeper to not yield to the scheduler and pin its
>> physical cpu.
>> What I am struggling with is coming up with a solution that doesn't turn
>> hypervisor mm hacking into a locking minefield.
>> Linux fixes this with many kinds of sleeping synchronization primitives. A
>> task can, for example, hold the mmap semaphore and sleep on a wait queue. Is
>> this the only way out of this mess? Not if wait queues force the vcpu to wake
>> up on the same phys cpu it was using at the time of sleeping….
> Well, the forcing to wake up on same phys cpu it slept on is going to be
> fixed. But it's not clear to me how that current restriction makes the
> problem harder? What if you were running on a single-phys-cpu system?
It's not a hard blocker; we just give up some efficiency otherwise. It's a
"nice to have" precondition.

> As you have realised, the fact that all locks in Xen are spinlocks makes the
> potential for deadlock very obvious. Someone else gets scheduled and takes
> out the phys cpu by spinning on a lock that someone else is holding while
> they are descheduled.
> Linux-style sleeping mutexes might help. We could add those. They don't help
> as readily as in the Linux case however! In some ways they push the deadlock
> up one level of abstraction, to the virt cpu (vcpu). Consider single-vcpu
> dom0 running a pager -- even if you are careful that the pager itself does
> not acquire any locks that one of its clients may hold-while-sleeping, if
> *anything* running in dom0 can acquire such a lock, you have an obvious
> deadlock, as that will take out the dom0 vcpu and leave it blocked forever
> waiting for a lock that is held while its holder waits for service from the
> dom0 vcpu….
Uhmm. But it seems there is _some_ method to the madness. Luckily mm locks are 
all taken after the p2m lock (and enforced that way). dom0 can grab ... the big 
domain lock? the grant table lock?
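The mm-layer ordering enforcement mentioned above can be sketched roughly like
this. This is a minimal stand-in for the kind of checks Xen's mm-locks.h does;
the level constants, the per-vcpu counter, and the helper names here are all
invented for illustration:

```c
#include <assert.h>

/* Hypothetical lock levels: the p2m lock orders before the finer mm locks. */
#define P2M_LOCK_LEVEL     1
#define SHARING_LOCK_LEVEL 2

/* Highest lock level currently held; per-vcpu state in real code. */
static int highest_level_held;

/* A lock may only be acquired if its level is above everything held,
 * i.e. we only ever lock "downwards": p2m first, then finer mm locks. */
static int mm_lock_ok(int level)
{
    return level > highest_level_held;
}

static void mm_lock(int level)
{
    assert(mm_lock_ok(level)); /* stand-in for Xen's BUG_ON */
    highest_level_held = level;
}
```

With this in place, taking the p2m lock and then the sharing lock passes the
check, while attempting the reverse order trips the assertion, so ordering
violations surface immediately during development.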

Perhaps we can categorize locks as reflexive or foreign (not that we have
abundant space in the spin lock struct to stash more flags) and perform some
sort of enforcement like what goes on in the mm layer. Xen insults via BUG_ONs
are a strong conditioning tool for developers. It is certainly simpler to tease
out the locks that might deadlock dom0 than to audit all possible locks,
including RCU.

What I mean:

BUG_ON(current->domain != d && lock_is_reflexive)
An example of a reflexive lock is the per page sharing lock.

BUG_ON(prepare_to_wait && current->domain->holds_foreign_lock)
An example of a foreign lock is the grant table lock.

A third category would entail global locks like the domain list lock, which 
behave identically to foreign locks with respect to this analysis.

Another benefit of this is that only reflexive locks need to be made 
sleep-capable, not everything under the sun. That is, the possibility of 
livelock is confined to vcpus of the same domain, and it is then avoided by 
making those lock holders re-schedulable.


> I don't think there is an easy solution here!
> -- Keir

Xen-devel mailing list