Re: [Xen-devel] BUG: bad page map under Xen
Konrad Rzeszutek Wilk wrote on 2013-10-25:
> On Fri, Oct 25, 2013 at 12:08:21AM +0100, David Vrabel wrote:
>> On 23/10/13 16:36, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Oct 21, 2013 at 04:12:56PM +0100, Jan Beulich wrote:
>>>>>>> On 21.10.13 at 16:44, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>>>>> On Mon, Oct 21, 2013 at 03:27:50PM +0100, Jan Beulich wrote:
>>>>>>>>> On 21.10.13 at 16:18, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>>>>>>> On Mon, Oct 21, 2013 at 04:06:07PM +0200, Lukas Hejtmanek wrote:
>>>>>>>> Region 2: Memory at 380fff000000 (64-bit, prefetchable) [size=8M]
>>>>>>> ...
>>>>>>> --- a/arch/x86/xen/setup.c
>>>>>>> +++ b/arch/x86/xen/setup.c
>>>>>>> @@ -92,6 +92,9 @@ static void __init xen_add_extra_mem(u64 start, u64 size)
>>>>>>>
>>>>>>>  		__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
>>>>>>>  	}
>>>>>>> +	/* Anything past the balloon area is marked as identity. */
>>>>>>> +	for (pfn = xen_max_p2m_pfn; pfn < MAX_DOMAIN_PAGES; pfn++)
>>>>>>> +		__set_phys_to_machine(pfn, IDENTITY_FRAME(pfn));
>>>>>>
>>>>>> Hardly - MAX_DOMAIN_PAGES derives from
>>>>>> CONFIG_XEN_MAX_DOMAIN_MEMORY, which in turn is unrelated to where
>>>>>> MMIO might be. Should you perhaps simply start from
>>>>>
>>>>> Looks like your mailer ate some words.
>>>>
>>>> I don't think so - they're all there in the text you quoted.
>>>>
>>>>>> an all 1:1 mapping, inserting the RAM translations as you find
>>>>>> them?
>>>>>
>>>>> Yeah, as this code can be called for the regions under 4GB.
>>>>> Definitely needs more analysis.
>>>>>
>>>>> Were you suggesting a lookup when we scan the PCI devices
>>>>> (xen_add_device)?
>>>>
>>>> That was for PVH, and is obviously fragile, as there can be MMIO
>>>> regions not matched by any PCI device's BAR. We could hope for all
>>>> of them to be below 4Gb, but I think (based on logs I got to see
>>>> recently from a certain vendor's upcoming systems) this isn't
>>>> going to work out.
>>>
>>> This is the patch I had in mind that I think will fix these issues.
>>> But I would appreciate testing it, and naturally send me the dmesg
>>> if possible.
>>
>> I think there is a simpler way to handle this.
>>
>> If INVALID_P2M_ENTRY implies 1:1 and we arrange:
>
> I am a bit afraid to make that assumption.
>
>> a) pfn_to_mfn() to return pfn if the mfn is missing in the p2m
>
> The balloon pages are of the missing type (initially). And they should
> return INVALID_P2M_ENTRY at start - later on they will return the
> scratch_page.
>
>> b) mfn_to_pfn() to return mfn if p2m(m2p(mfn)) != mfn and there is
>> no m2p override.
>
> The toolstack can map pages for which p2m(m2p(mfn)) != mfn and which
> have no m2p override.
>
>> Then:
>>
>> a) The identity p2m entries can be removed.
>> b) _PAGE_IOMAP becomes unnecessary.
>
> You still need it for the toolstack to map other guests' pages
> (xen_privcmd_map).
>
> I think for right now, to fix this issue, going ahead and setting
> 1-1 in the P2M for the affected devices (PCI and MCFG) is simpler, b/c:
> - We only do it when said device is in the guest (so if you launch
>   a PCI PV guest you can still migrate it - after unplugging the
>   device). Assuming all regions are 1-1 might not be healthy (I had
>   a heck of a time fixing all of the migration issues when I wrote
>   the 1:1 code).
> - It will make the PVH hypercall to mark I/O regions easier. Instead
>   of assuming that all non-RAM space is an I/O region it will be able
>   to selectively set up the entries for said regions. I think that is
>   what Jan suggested?
> - This is a bug - so let's fix it as a bug first. Redoing the P2M is
>   certainly an option but I am not signing up for that this year.
>
> Let me post my two patches that fix this for PCI devices and MCFG areas.
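[Editorial note, for illustration only: a minimal sketch of the 1-1 P2M
approach Konrad outlines above, assuming it hooks in where a passed-through
PCI device is added (e.g. from xen_add_device). set_phys_range_identity()
is the existing p2m helper (it is __init in kernels of this era, so the
real patches need more than this); the helper name and the BAR walk below
are hypothetical, not Konrad's posted patches.

#include <linux/pci.h>
#include <linux/pfn.h>
#include <asm/xen/page.h>	/* set_phys_range_identity() */

/* Hypothetical helper: mark every memory BAR of a passed-through PCI
 * device as identity (1:1) in the p2m, so PTE construction in the
 * guest treats those pfns as raw machine frames rather than RAM. */
static void xen_mark_bars_identity(struct pci_dev *dev)
{
	int i;

	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
		struct resource *r = &dev->resource[i];

		if (!(r->flags & IORESOURCE_MEM) || r->start == 0)
			continue;

		/* Existing p2m helper: fills [pfn_s, pfn_e) with
		 * IDENTITY_FRAME(pfn) entries; resource ends are
		 * inclusive, hence the +1. */
		set_phys_range_identity(PFN_DOWN(r->start),
					PFN_UP(r->end + 1));
	}
}
]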
Any conclusion for this issue? Our customer also saw the same problem:
they want to map an MMIO region to a userspace address (through the UIO
approach), but the current ->mmap implementation calls remap_pfn_range()
without setting _PAGE_IOMAP, which crashes the host. It seems that any
userspace device driver that tries to map a device's MMIO will crash the
host. They are using the 3.10 kernel.

Best regards,
Yang
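[Editorial note, for illustration only: a sketch of the kind of ->mmap
path Yang describes, as it might look in a UIO driver on a 3.10 kernel.
The demo_ function name and the assumption that mem[0] holds the BAR are
hypothetical; _PAGE_IOMAP exists on x86 in kernels of this era but was
later removed.

#include <linux/mm.h>
#include <linux/uio_driver.h>
#include <asm/pgtable_types.h>	/* _PAGE_IOMAP */

/* Hypothetical UIO-style mmap of an MMIO BAR.  On a 3.10 Xen dom0 the
 * plain remap_pfn_range() below is what triggers the crash: without
 * _PAGE_IOMAP the pfn is translated through the p2m instead of being
 * used as a raw machine frame. */
static int demo_uio_mmap(struct uio_info *info, struct vm_area_struct *vma)
{
	unsigned long pfn = info->mem[0].addr >> PAGE_SHIFT;

	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
#ifdef CONFIG_XEN
	/* Workaround under discussion: flag the PTEs as I/O mappings. */
	pgprot_val(vma->vm_page_prot) |= _PAGE_IOMAP;
#endif
	return remap_pfn_range(vma, vma->vm_start, pfn,
			       vma->vm_end - vma->vm_start,
			       vma->vm_page_prot);
}
]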
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel