Re: [Xen-devel] Xen dom0 crash: "d0:v0: unhandled page fault (ec=0000)"
On 10/29/2010 12:15 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Oct 29, 2010 at 04:44:23PM +0100, Gianni Tedesco wrote:
>> On Wed, 2010-10-20 at 09:54 +0100, Gianni Tedesco wrote:
>>> On Wed, 2010-10-20 at 00:31 +0100, Andreas Kinzler wrote:
>>>> On 19.10.2010 17:45, Gianni Tedesco wrote:
>>>>> ditto, I suspected a known bug in my gcc version which broke xchg
>>>>> because when I compiled with -O2 instead of -Os... the problem went away
>>>>> but then something else bad happened later (I forget the details, and it
>>>>> was too many config tweaks ago to get back to last time I had it working
>>>>> that well)
>>>> Jeremy, one user earlier reported that he found out that for him there
>>>> seems to be a relation between kernel size and crash status. He just
>>>> added/removed some options that could never influence the "crash status"
>>>> (like adding/removing netfilter modules). With all the experiences here,
>>>> it may be useful to check for code paths related to kernel size.
>>>>
>>>> Regards Andreas
>> I have dmesg output from 2.6.32.18-ge6b9b2c and the current broken
>> version.
>>
>> http://pastebin.com/3m0DpDdW - 2.6.32.24-gd0054d6-dirty - broken
> Gianni pointed out to me that he spotted this:
>
> [    0.000000] last_pfn = 0x2d0699 max_arch_pfn = 0x400000000
> [    0.000000] x86 PAT enabled: cpu 0, old 0x50100070406, new 0x7010600070106
> [    0.000000] last_pfn = 0x2f000 max_arch_pfn = 0x400000000
>
> I am not sure why "last_pfn" is being printed twice, but it could be
> Gianni's test-patch.
>
> It looks as if the initial E820 is created with a max_pfn of
> 0x2d0699, which roughly translates to 8G of memory instead of
> the 752MB.
>
> There were a bunch of changes in arch/x86/xen/setup.c and mmu.c
> code that figures out the max_pfn. Actually, there is one
> (git commit 6c8e75f5e712e596ab138597e65aac426ff03382):
>
> HYPERVISOR_shared_info->arch.max_pfn = xen_max_p2m_pfn

That sets the extent that the toolstack will look at the P2M for
migration; it has no direct effect on the domain itself.

> Which would set this to the highest PFN. But that number
> should not have been used by the E820 calculation which uses
> nr_pages entry to clamp the E820. Oh wait, it does not - it actually
> still parses the E820, but marks the area above the nr_pages
> as "XEN EXTRA" (git commit 8d0d6d6d275d4514780ba3d350e57d48e3b5b5e1)
> so they should not figure in the last_pfn calculation and instead
> lay unused. But the 'initial memory mapping' ignores that and
> still tries to set up mapping on _all_ E820_RAM regions, even
> if they are reserved by the early memory allocator. This would
> imply that the page table is being actually put right in the
> area that is reserved by the early memory allocator.
>
> Hmm, so Gianni, I think if you shortcircuited the setup.c code
> to not parse the E820_RAM regions above the nr_pages that might
> do it. And also try to figure out who or what resets the last_pfn.
>
> Or in the code that sets the 'XEN EXTRA', make it set that region
> of pages as E820_RESERVED and see what happens then.

The way this is supposed to work is:

   1. Xen gives the domain N pages
   2. There's an E820 which describes M pages (M > N)
   3. The kernel traverses the existing E820 and finds holes and adds
      the memory to a new E820_RAM region beyond M
   4. Set up P2M for pages up to N
   5. When the kernel maps all "RAM", the region from N to M is not
      present, and has no valid P2M mapping; in that case, xen_make_pte
      will return a non-present pte.

The important part of making XEN EXTRA E820_RAM is that the kernel will
allocate page structures for them, even if the pages are absent. Making
it RESERVED will suppress that and make the exercise pointless.
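
To make the intended behaviour concrete, here is a toy model of it in
Python. It is only a sketch, not kernel code: the region list, the dict
standing in for the P2M, the page counts (N = 4, M = 8) and every helper
name are made up for illustration, and the extra region is simply taken
to be the PFNs from N to M rather than a new region beyond M. The only
name borrowed from the kernel is xen_make_pte, which make_pte loosely
imitates.

```python
# Toy model only: all names and numbers are hypothetical, not kernel code.
E820_RAM = "RAM"
E820_RESERVED = "RESERVED"


def build_p2m(nr_pages, first_mfn=0x1000):
    """Populate the P2M only for PFNs below nr_pages (N)."""
    return {pfn: first_mfn + pfn for pfn in range(nr_pages)}


def make_pte(pfn, p2m):
    """Loose stand-in for xen_make_pte: no P2M entry gives a non-present PTE."""
    mfn = p2m.get(pfn)
    return {"present": mfn is not None, "mfn": mfn}


def map_all_ram(e820, p2m):
    """Build a PTE for every PFN that lies inside an E820_RAM region."""
    return {
        pfn: make_pte(pfn, p2m)
        for start, end, kind in e820
        if kind == E820_RAM
        for pfn in range(start, end)
    }


def struct_pages_needed(e820):
    """Count the PFNs for which page structures would be allocated."""
    return sum(end - start for start, end, kind in e820 if kind == E820_RAM)


N, M = 4, 8  # pages given by Xen, pages described by the E820
p2m = build_p2m(N)

# The extra region [N, M) marked as RAM (intended) versus RESERVED.
e820_extra_as_ram = [(0, N, E820_RAM), (N, M, E820_RAM)]
e820_extra_reserved = [(0, N, E820_RAM), (N, M, E820_RESERVED)]

ptes = map_all_ram(e820_extra_as_ram, p2m)
present = [pfn for pfn, pte in ptes.items() if pte["present"]]
absent = [pfn for pfn, pte in ptes.items() if not pte["present"]]

print("present PFNs:", present)
print("non-present PFNs:", absent)
print("page structures, extra as RAM:", struct_pages_needed(e820_extra_as_ram))
print("page structures, extra as RESERVED:", struct_pages_needed(e820_extra_reserved))
```

In this model the PFNs below N get present PTEs and the ones from N to M
get non-present PTEs, while page structures still cover all M pages.
Marking the extra region RESERVED drops the page structures for N to M,
which is the difference described above.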

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel