Re: [Xen-devel] [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
On Wed, 15 Aug 2012, David Vrabel wrote:
> On 14/08/12 15:12, Attilio Rao wrote:
> > On 14/08/12 14:57, David Vrabel wrote:
> >> On 14/08/12 13:24, Attilio Rao wrote:
> >>
> >>> The informations added on the hook are:
> >>> - Native behaviour
> >>> - Xen specific behaviour
> >>> - Logic behind the Xen specific behaviour
> >>>
> >> These are implementation details and should be documented with the
> >> implementations (if necessary).
> >>
> > In this specific case, implementation details are very valuable to
> > understand the semantic of operations, this is why I added them
> > there. I think, at least for this case, this is the best trade-off.
>
> Documenting the implementation details will be useful for reviewing or
> refactoring the pv-ops but I don't think it useful in the longer term
> for it to be included in the API documentation upstream.
>
> >>> - PVOPS semantic
> >>>
> >> This is the interesting stuff.
> >>
> >> This particular pvop seems a little odd really. It might make more
> >> sense if it took a third parameter for pgt_buf_top.
> >
> > The thing is that this work (documenting PVOPS) should help in
> > understanding the logic behind some PVOPS and possibly
> > improve/rework them. For this stage, I agreed with Konrad to keep the
> > changes as small as possible. Once the documentation about the
> > semantic is in place we can think about ways to improve things more
> > effectively (for example, in some cases we may want to rewrite the
> > PVOP completely).
>
> After looking at it some more, I think this pv-ops is unnecessary. How
> about the following patch to just remove it completely?
>
> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning
> is sound.

Do you have more than 4G assigned to dom0 on those boxes? It certainly
fixed a serious crash at the time it was introduced, see
http://marc.info/?l=linux-kernel&m=129901609503574 and
http://marc.info/?l=linux-kernel&m=130133909408229.
Unless something big changed in kernel_physical_mapping_init, I think we
still need it. Depending on the e820 of your test box, the kernel could
crash (or not), possibly in different places.

> >> "@pagetable_reserve is used to reserve a range of PFNs used for the
> >> kernel direct mapping page tables and cleans-up any PFNs that ended
> >> up not being used for the tables.
> >>
> >> It shall reserve the range (start, end] with memblock_reserve(). It
> >> shall prepare PFNs in the range (end, pgt_buf_top] for general (non
> >> page table) use.
> >>
> >> It shall only be called in init_memory_mapping() after the direct
> >> mapping tables have been constructed."
> >>
> >> Having said that, I couldn't immediately see where pages in (end,
> >> pgt_buf_top] was getting set RO. Can you point me to where it's
> >> done?
> >
> > As mentioned in the comment, please look at xen_set_pte_init().
>
> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it
> is already present and read-only.

Look at mask_rw_pte and read the threads linked above; unfortunately it
is not that simple.

> David
>
> 8<----------------------
> x86: remove x86_init.mapping.pagetable_reserve paravirt op
>
> The x86_init.mapping.pagetable_reserve paravirt op is used for Xen
> guests to set the writable flag for the mapping of (pgt_buf_end,
> pgt_buf_top]. This is not necessary as these pages are never set as
> read-only as they have never contained page tables.

Is this actually true? It is possible when pagetable pages are allocated
by alloc_low_page.

> When running as a Xen guest, the initial page tables are provided by
> Xen (these are reserved with memblock_reserve() in
> xen_setup_kernel_pagetable()) and constructed in brk space (for 32-bit
> guests) or in the kernel's .data section (for 64-bit guests, see
> head_64.S).
>
> Since these are all marked as reserved, (pgt_buf_start, pgt_buf_top]
> does not overlap with them and the mappings for these PFNs will be
> read-write.
We are talking about pagetable pages built by kernel_physical_mapping_init.

> Since Xen doesn't need to change the mapping its implementation
> becomes the same as a native and we can simply remove this pv-op
> completely.

I don't think so.

> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
> ---
>  arch/x86/include/asm/pgtable_types.h |    1 -
>  arch/x86/include/asm/x86_init.h      |   12 ------------
>  arch/x86/kernel/x86_init.c           |    4 ----
>  arch/x86/mm/init.c                   |   22 +++-------------------
>  arch/x86/xen/mmu.c                   |   19 ++-----------------
>  5 files changed, 5 insertions(+), 53 deletions(-)
>
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 013286a..0a11293 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -301,7 +301,6 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
>  /* Install a pte for a particular vaddr in kernel space. */
>  void set_pte_vaddr(unsigned long vaddr, pte_t pte);
>
> -extern void native_pagetable_reserve(u64 start, u64 end);
>  #ifdef CONFIG_X86_32
>  extern void native_pagetable_setup_start(pgd_t *base);
>  extern void native_pagetable_setup_done(pgd_t *base);
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index 38155f6..b527dd4 100644
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -69,17 +69,6 @@ struct x86_init_oem {
>  };
>
>  /**
> - * struct x86_init_mapping - platform specific initial kernel pagetable setup
> - * @pagetable_reserve: reserve a range of addresses for kernel pagetable usage
> - *
> - * For more details on the purpose of this hook, look in
> - * init_memory_mapping and the commit that added it.
> - */
> -struct x86_init_mapping {
> -	void (*pagetable_reserve)(u64 start, u64 end);
> -};
> -
> -/**
>   * struct x86_init_paging - platform specific paging functions
>   * @pagetable_setup_start: platform specific pre paging_init() call
>   * @pagetable_setup_done: platform specific post paging_init() call
> @@ -135,7 +124,6 @@ struct x86_init_ops {
>  	struct x86_init_mpparse mpparse;
>  	struct x86_init_irqs irqs;
>  	struct x86_init_oem oem;
> -	struct x86_init_mapping mapping;
>  	struct x86_init_paging paging;
>  	struct x86_init_timers timers;
>  	struct x86_init_iommu iommu;
> diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
> index 9f3167e..040c05f 100644
> --- a/arch/x86/kernel/x86_init.c
> +++ b/arch/x86/kernel/x86_init.c
> @@ -63,10 +63,6 @@ struct x86_init_ops x86_init __initdata = {
>  		.banner = default_banner,
>  	},
>
> -	.mapping = {
> -		.pagetable_reserve = native_pagetable_reserve,
> -	},
> -
>  	.paging = {
>  		.pagetable_setup_start = native_pagetable_setup_start,
>  		.pagetable_setup_done = native_pagetable_setup_done,
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index e0e6990..c449873 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -90,11 +90,6 @@ static void __init find_early_table_space(struct map_range *mr, unsigned long en
>  		(pgt_buf_top << PAGE_SHIFT) - 1);
>  }
>
> -void __init native_pagetable_reserve(u64 start, u64 end)
> -{
> -	memblock_reserve(start, end - start);
> -}
> -
>  #ifdef CONFIG_X86_32
>  #define NR_RANGE_MR 3
>  #else /* CONFIG_X86_64 */
> @@ -283,22 +278,11 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
>
>  	/*
>  	 * Reserve the kernel pagetable pages we used (pgt_buf_start -
> -	 * pgt_buf_end) and free the other ones (pgt_buf_end - pgt_buf_top)
> -	 * so that they can be reused for other purposes.
> -	 *
> -	 * On native it just means calling memblock_reserve, on Xen it also
> -	 * means marking RW the pagetable pages that we allocated before
> -	 * but that haven't been used.
> -	 *
> -	 * In fact on xen we mark RO the whole range pgt_buf_start -
> -	 * pgt_buf_top, because we have to make sure that when
> -	 * init_memory_mapping reaches the pagetable pages area, it maps
> -	 * RO all the pagetable pages, including the ones that are beyond
> -	 * pgt_buf_end at that time.
> +	 * pgt_buf_end).
>  	 */
>  	if (!after_bootmem && pgt_buf_end > pgt_buf_start)
> -		x86_init.mapping.pagetable_reserve(PFN_PHYS(pgt_buf_start),
> -				PFN_PHYS(pgt_buf_end));
> +		memblock_reserve(PFN_PHYS(pgt_buf_start),
> +				 PFN_PHYS(pgt_buf_end) - PFN_PHYS(pgt_buf_start));
>
>  	if (!after_bootmem)
>  		early_memtest(start, end);
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index b65a761..e55dfc0 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1178,20 +1178,6 @@ static void __init xen_pagetable_setup_start(pgd_t *base)
>  {
>  }
>
> -static __init void xen_mapping_pagetable_reserve(u64 start, u64 end)
> -{
> -	/* reserve the range used */
> -	native_pagetable_reserve(start, end);
> -
> -	/* set as RW the rest */
> -	printk(KERN_DEBUG "xen: setting RW the range %llx - %llx\n", end,
> -			PFN_PHYS(pgt_buf_top));
> -	while (end < PFN_PHYS(pgt_buf_top)) {
> -		make_lowmem_page_readwrite(__va(end));
> -		end += PAGE_SIZE;
> -	}
> -}
> -
>  static void xen_post_allocator_init(void);
>
>  static void __init xen_pagetable_setup_done(pgd_t *base)
> @@ -2067,7 +2053,6 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = {
>
>  void __init xen_init_mmu_ops(void)
>  {
> -	x86_init.mapping.pagetable_reserve = xen_mapping_pagetable_reserve;
>  	x86_init.paging.pagetable_setup_start = xen_pagetable_setup_start;
>  	x86_init.paging.pagetable_setup_done = xen_pagetable_setup_done;
>  	pv_mmu_ops = xen_mmu_ops;
> --
> 1.7.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
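[Editor's note] The contract being debated above can be sketched as a toy userspace model. This is not kernel code: `memblock_reserve()`, `make_lowmem_page_readwrite()` and the reserved/read-only arrays are stand-in re-implementations, and the read-only array stands in for Xen marking the whole pgt_buf candidate range RO before construction. It only illustrates the pv-op's semantics: keep the pages actually used for page tables reserved, and hand the unused tail back as ordinary read-write memory.

```c
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (1ULL << PAGE_SHIFT)
#define PFN_PHYS(x)  ((uint64_t)(x) << PAGE_SHIFT)

#define NPAGES 16
static int reserved[NPAGES]; /* 1 = kept by memblock for page tables */
static int readonly[NPAGES]; /* 1 = mapped read-only                 */

/* Stand-in for the kernel's memblock_reserve(base, size). */
static void memblock_reserve(uint64_t base, uint64_t size)
{
	for (uint64_t pfn = base >> PAGE_SHIFT;
	     pfn < (base + size) >> PAGE_SHIFT; pfn++)
		reserved[pfn] = 1;
}

/* Stand-in for make_lowmem_page_readwrite(__va(pa)). */
static void make_lowmem_page_readwrite(uint64_t pa)
{
	readonly[pa >> PAGE_SHIFT] = 0;
}

/*
 * Models the xen_mapping_pagetable_reserve() logic removed by the
 * patch: reserve [start, end) for the page tables that were actually
 * built, then flip the unused tail [end, top) back to read-write so
 * it can be reused as ordinary memory.
 */
static void pagetable_reserve(uint64_t start, uint64_t end, uint64_t top)
{
	memblock_reserve(start, end - start);
	for (; end < top; end += PAGE_SIZE)
		make_lowmem_page_readwrite(end);
}
```

With pgt_buf_start = 4, pgt_buf_end = 9 and pgt_buf_top = 12 (as PFNs), pages 4-8 stay reserved and read-only while pages 9-11 come back as ordinary read-write memory; the thread's disagreement is over whether that tail can ever really be read-only in the first place.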