Re: [Xen-devel] dom0 pvops crash
On 01/27/2010 09:26 AM, Ian Campbell wrote:
> On Mon, 2010-01-25 at 20:02 +0000, Jeremy Fitzhardinge wrote:
>> IanC, Pasi, myself and others explored a number of other ways to try
>> and fix it in the Xen pvops code, but they all turned out to be very
>> expensive, just not work (they just pushed the race around), or
>> require new pvops just for this case.
>
> Just to brainstorm a bit more:
>
> There's no way a kunmap_atomic pvop would be acceptable? It would at
> least make the API symmetrical.

We could propose it, but I think we have bigger things to spend our
capital on. And I'm not sure it would help:

In theory xen_kmap_atomic could take the pte lock and unmap_atomic could
release it. But kmap_atomic doesn't have enough info to be able to take
the lock, and unmap wouldn't either unless we passed it some odd
parameters. And even if we did take the lock, the calling kernel code
will also attempt to take the lock if it actually wants to make a pte
change, so we'd have to change the logic there.

> What about a hypercall which would set a PTE with the writable bit set
> atomically depending on the pinned status of the referenced page? (I
> haven't even vaguely thought this idea through).

It doesn't really help because the core issue is the race which changes
the page state halfway through. If we create a writable mapping, a pin
on another CPU is going to fail. We could fix it by locking the pte
while it is mapped, but then we wouldn't need a new hypercall.

> Is there some way we can disable HIGHPTE at runtime even if
> CONFIG_HIGHPTE=y? Looks like that might be relatively self-contained
> in pte_alloc_one(). All the actual uses of high PTEs go through
> kmap_atomic, which explicitly tests for PageHighmem, so by ensuring
> PTEs are never high at allocation time we would skip all those paths.
> Something like the untested patch below, but not so skanky, obviously.

That's a thought. It could be generally useful too; highpte should only
be used in extreme circumstances (to prevent ptes from filling most of
lowmem), not on every system with highmem. IOW use a generic flag rather
than make it explicitly Xen-related, then we can set that flag. Or we
could just put a big fat config dependency in.

> This last would be nice since it also removes the
> crippling-for-virtualisation overhead, so it would potentially benefit
> KVM and VMI as well...

VMI is a non-issue, and I don't think HIGHPTE is extraordinarily
expensive on kvm.

>> Given that HIGHPTE is generally a bad idea and should be deprecated
>> (any machine big enough to need it should definitely be running a
>> 64-bit kernel), I've left it on the backburner hoping for some
>> inspiration to strike. So far it has not.
>
> Unfortunately distros seem to be using it for their native kernels and
> since pvops means they won't have a separate xen kernel I think we
> need to figure something out.

We could lobby for them to turn it off. I wonder if they have a real
user demand for it these days. It could only be important for users with
lots of physical memory and a 32-bit only CPU, which can't be common
now. (There should be no problem with using a 64-bit kernel, even if
userspace is all 32-bit.)

    J

> Ian.
>
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 65215ab..49f8e83 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -28,7 +28,10 @@ pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long address)
>  	struct page *pte;
>  
>  #ifdef CONFIG_HIGHPTE
> -	pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
> +	if (is_xen_domain())
> +		pte = alloc_pages(PGALLOC_GFP, 0);
> +	else
> +		pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
>  #else
>  	pte = alloc_pages(PGALLOC_GFP, 0);
>  #endif

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
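
For illustration, the "generic flag" variant suggested in the reply might
look roughly like the userspace model below. This is only a sketch under
stated assumptions, not a patch from the thread: highpte_allowed,
disable_highpte() and pte_alloc_gfp() are hypothetical names, and the
MODEL_* constants are stand-ins for PGALLOC_GFP and __GFP_HIGHMEM, not
the kernel's real values.

```c
#include <stdbool.h>

/* Stand-in values for illustration only; the real kernel definitions differ. */
typedef unsigned int gfp_t;
#define MODEL_PGALLOC_GFP  0x01u	/* models PGALLOC_GFP */
#define MODEL_GFP_HIGHMEM  0x02u	/* models __GFP_HIGHMEM */

/* Hypothetical generic flag: true means PTE pages may come from highmem. */
static bool highpte_allowed = true;

/*
 * Hypothetical hook that any platform (Xen, or anything else that cannot
 * cope with high PTE pages) would call once during early setup.
 */
static void disable_highpte(void)
{
	highpte_allowed = false;
}

/*
 * Models the allocation-flag choice pte_alloc_one() would make with
 * CONFIG_HIGHPTE=y: the highmem bit is added only while the generic
 * flag still permits it, with no Xen-specific test in the allocator.
 */
static gfp_t pte_alloc_gfp(void)
{
	gfp_t gfp = MODEL_PGALLOC_GFP;

	if (highpte_allowed)
		gfp |= MODEL_GFP_HIGHMEM;
	return gfp;
}
```

Compared with testing is_xen_domain() directly in the allocator, the Xen
setup code would call the hook once, and every later PTE page allocation
would stay in lowmem, so the PageHighmem paths in kmap_atomic are never
taken for PTE pages.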