[Xen-devel] Re: Need help with fixing the Xen waitqueue feature
Olaf,

> On Wed, Nov 09, Andres Lagar-Cavilla wrote:
>
>> After a bit of thinking, things are far more complicated. I don't think
>> this is a "race." If the pager removed a page that later gets scheduled
>> by the guest OS for IO, qemu will want to foreign-map that. With the
>> hypervisor returning ENOENT, the foreign map will fail, and there goes
>> qemu.
>
> The tools are supposed to catch ENOENT and try again.
> linux_privcmd_map_foreign_bulk() does that. linux_gnttab_grant_map()
> appears to do that as well. What code path used by qemu leads to a
> crash?

The tools retry as long as IOCTL_PRIVCMD_MMAPBATCH_V2 is supported, which
it isn't on mainline Linux 3.0, 3.1, etc. Which dom0 kernel are you using?
And for backend drivers implemented in the kernel (netback, etc.), there is
no retrying.

All those ram_paging types and their interactions give me a headache, but
I'll trust you that only one event is put in the ring.

I'm using 24066:54a5e994a241. I start Windows 7 and make xenpaging try to
evict 90% of the RAM; qemu lasts for about two seconds. Linux fights
harder, but qemu also dies. No PV drivers. I haven't been able to trace
the qemu crash (a segfault on a NULL ide_if field in a DMA callback) back
to the exact paging action yet, but there are no crashes without paging.

Andres

>
>> I guess qemu/migrate/libxc could retry until the pager is done and the
>> mapping succeeds. It will be delicate. It won't work for pv backends. It
>> will flood the mem_event ring.
>
> There will be no flood; only one request is sent per gfn in
> p2m_mem_paging_populate().
>
> Olaf
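For readers following along, here is a minimal C sketch of the
catch-ENOENT-and-retry pattern that linux_privcmd_map_foreign_bulk() is
said to implement for foreign mappings. It is illustrative only:
xc_map_foreign_bulk() is the real libxc call referred to in the thread,
but the wrapper name, the retry bound, the sleep, and the page-size macro
are invented for this example and are not the libxc code.

    /* Sketch of retrying a foreign mapping of paged-out gfns.
     * try_map_paged_gfns(), MAX_MAP_RETRIES and MAP_PAGE_SIZE are
     * invented names for this example. A gfn that is currently paged
     * out is reported via err[i] == -ENOENT; the faulting access has
     * already asked the pager to bring it back, so the caller drops
     * the partial mapping, waits briefly, and tries again. */
    #include <errno.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    #define MAX_MAP_RETRIES 16          /* arbitrary bound for the sketch */
    #define MAP_PAGE_SIZE   4096UL      /* x86 page size */

    static void *try_map_paged_gfns(xc_interface *xch, uint32_t domid,
                                    const xen_pfn_t *gfns, int *err,
                                    unsigned int num)
    {
        unsigned int attempt, i, pending;
        void *addr;

        for (attempt = 0; attempt < MAX_MAP_RETRIES; attempt++) {
            addr = xc_map_foreign_bulk(xch, domid, PROT_READ | PROT_WRITE,
                                       gfns, err, num);
            if (addr == NULL)
                return NULL;            /* hard failure, not a paging case */

            pending = 0;
            for (i = 0; i < num; i++)
                if (err[i] == -ENOENT)  /* gfn is paged out right now */
                    pending++;

            if (pending == 0)
                return addr;            /* every gfn mapped successfully */

            /* Some gfns are still with the pager: release the partial
             * mapping, give the pager time to page them in, and retry. */
            munmap(addr, (size_t)num * MAP_PAGE_SIZE);
            usleep(1000);
        }
        return NULL;                    /* pager never brought them back */
    }

The thread's point is that this only works when the dom0 kernel exposes
IOCTL_PRIVCMD_MMAPBATCH_V2; without it the per-gfn -ENOENT is not reported
this way, so no retry happens, and kernel backends such as netback never
retry at all.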
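Similarly, a minimal, self-contained sketch of the "only one request per
gfn" behaviour Olaf attributes to p2m_mem_paging_populate(). Every name in
it is invented for illustration; it is not the Xen implementation, only a
compact model of the guard being described.

    /* Sketch of a once-per-gfn populate guard. gfn_state, send_mem_event()
     * and populate_once() are invented names; the real check lives in
     * p2m_mem_paging_populate() and is tied to the p2m ram_paging types
     * mentioned above. */
    #include <stdio.h>

    enum gfn_state {
        GFN_PRESENT,        /* page is in memory, nothing to do          */
        GFN_PAGED_OUT,      /* page evicted by the pager, no request yet */
        GFN_POPULATE_SENT,  /* a populate request is already in the ring */
    };

    /* Stand-in for placing a request on the mem_event ring. */
    static void send_mem_event(unsigned long gfn)
    {
        printf("populate request queued for gfn %lu\n", gfn);
    }

    /* Returns 1 if a new request was queued, 0 if nothing had to be sent. */
    static int populate_once(enum gfn_state *state, unsigned long gfn)
    {
        if (state[gfn] != GFN_PAGED_OUT)
            return 0;                   /* present, or request already sent */
        state[gfn] = GFN_POPULATE_SENT; /* remember that a request went out */
        send_mem_event(gfn);
        return 1;
    }

    int main(void)
    {
        enum gfn_state state[4] = { GFN_PRESENT, GFN_PAGED_OUT,
                                    GFN_PRESENT, GFN_PRESENT };
        populate_once(state, 1);        /* first fault: queues one request  */
        populate_once(state, 1);        /* second fault: no second entry    */
        return 0;
    }

The sketch only shows why a second fault on the same gfn does not add a
second ring entry, which is why retrying the mapping does not flood the
mem_event ring.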