
[Xen-devel] Re: Need help with fixing the Xen waitqueue feature


  • To: "Olaf Hering" <olaf@xxxxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Wed, 9 Nov 2011 13:30:07 -0800
  • Cc: keir.xen@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 09 Nov 2011 13:31:04 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Also,
> On Tue, Nov 08, Andres Lagar-Cavilla wrote:
>
>> Tbh, for paging to be effective, we need to be prepared to yield on
>> every p2m lookup.
>
> Yes, if a gfn is missing the vcpu should go to sleep rather than
> returning -ENOENT to the caller. Only the query part of gfn_to_mfn
> should return the p2m paging types.
>
>> Let's compare paging to PoD. They're essentially the same thing: pages
>> disappear, and get allocated on the fly when you need them. PoD is a
>> highly optimized in-hypervisor mechanism that does not need a
>> user-space helper -- but the pager could do PoD easily and remove all
>> that p2m-pod.c code from the hypervisor.
>
> Perhaps PoD and paging could be merged; I haven't had time to study the
> PoD code.
>
>> PoD only introduces extraneous side-effects when there is a complete
>> absence of memory to allocate pages. The same cannot be said of paging,
>> to put it mildly. It returns EINVAL all over the place. Right now, qemu
>> can be crashed in a blink by paging out the right gfn.
>
> I have seen qemu crashes when using emulated storage, but haven't
> debugged them yet. I suspect they were caused by a race between nominate
> and evict.
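
To make the "go to sleep instead of returning -ENOENT" idea concrete, here
is a very rough sketch of a paging-aware lookup path. gfn_to_mfn(),
p2m_is_paging() and p2m_mem_paging_populate() are the existing names (exact
signatures vary between trees), and wait_event() is the waitqueue primitive
this thread is about; the per-domain waitqueue (d->paging_wq) and the
gfn_is_paged() predicate are invented purely for illustration:

    /* Sketch only, not a patch.  d->paging_wq and gfn_is_paged() do not
     * exist; they stand in for whatever wiring we end up with. */
    p2m_type_t p2mt;
    mfn_t mfn = gfn_to_mfn(p2m_get_hostp2m(d), gfn, &p2mt);

    if ( p2m_is_paging(p2mt) )
    {
        p2m_mem_paging_populate(d, gfn);        /* ask the pager to page in */
        wait_event(&d->paging_wq,               /* sleep instead of failing */
                   !gfn_is_paged(d, gfn));
        mfn = gfn_to_mfn(p2m_get_hostp2m(d), gfn, &p2mt);  /* retry */
    }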

After a bit of thinking, things are far more complicated. I don't think
this is a "race." If the pager has evicted a page that the guest OS later
schedules for I/O, qemu will want to foreign-map it. With the hypervisor
returning ENOENT, the foreign map will fail, and there goes qemu.

The same will happen for pv backends mapping grants, or for the
checkpoint/migrate code.

I guess qemu/migrate/libxc could retry until the pager is done and the
mapping succeeds. It will be delicate. It won't work for pv backends. It
will flood the mem_event ring.
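
For concreteness, the retry on the mapping side would look something like
the sketch below. It assumes the paged-out gfn surfaces to the caller as a
failed mapping with errno == ENOENT, which is the behaviour described
above, not a promise about what privcmd/libxc actually report today:

    /* Sketch of a retry loop in qemu/libxc around a foreign mapping.
     * xch/domid/gfns/nr are the usual libxc handle and target list. */
    void *map;

    for ( ; ; )
    {
        map = xc_map_foreign_pages(xch, domid, PROT_READ | PROT_WRITE,
                                   gfns, nr);
        if ( map != NULL || errno != ENOENT )
            break;
        /* gfn is paged out: give the pager a chance, then try again.
         * Every retry can push another populate request onto the
         * mem_event ring, which is exactly the flooding concern. */
        usleep(1000);
    }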

Wait-queueing the dom0 vcpu is a no-go -- the machine will deadlock quickly.

My thinking is that the best bet is to wait-queue the dom0 process. The
dom0 kernel code handling the foreign map will need to put the mapping
thread on a wait-queue. It can establish a ring-based notification
mechanism with Xen. When Xen completes the paging in, it can add a
notification to the ring. dom0 can then wake the mapping thread and
retry.
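
As a sketch of what the dom0 kernel side could look like (everything here
is invented for illustration -- the notification ring, its event channel
handler, and the paging_done() predicate would all have to be designed
from scratch):

    /* Hypothetical dom0 (privcmd/gntdev) side of the scheme above. */
    static DECLARE_WAIT_QUEUE_HEAD(xen_paging_wq);

    /* Called when the mapping hypercall fails because the gfn is paged
     * out.  The event channel handler for the notification ring would do
     * wake_up(&xen_paging_wq) whenever Xen posts a "paged in" entry. */
    static int wait_for_page_in(domid_t domid, xen_pfn_t gfn)
    {
        return wait_event_interruptible(xen_paging_wq,
                                        paging_done(domid, gfn));
    }

The mapping path would then retry the hypercall once wait_for_page_in()
returns, instead of blocking a dom0 vcpu inside Xen.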

Not simple at all. Ideas out there?

Andres

>
> Olaf
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

