[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
On Wed, 15 Aug 2012, David Vrabel wrote: > On 15/08/12 14:55, Stefano Stabellini wrote: > > On Wed, 15 Aug 2012, David Vrabel wrote: > >> On 14/08/12 15:12, Attilio Rao wrote: > >>> On 14/08/12 14:57, David Vrabel wrote: > >>>> On 14/08/12 13:24, Attilio Rao wrote: > >> After looking at it some more, I think this pv-ops is unnecessary. How > >> about the following patch to just remove it completely? > >> > >> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning > >> is sound. > > > > Do you have more then 4G to dom0 on those boxes? > > I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE. > > > It certainly fixed a serious crash at the time it was introduced, see > > http://marc.info/?l=linux-kernel&m=129901609503574 and > > http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big > > changed in kernel_physical_mapping_init, I think we still need it. > > Depending on the e820 of your test box, the kernel could crash (or not), > > possibly in different places. > > > >>>> Having said that, I couldn't immediately see where pages in (end, > >>>> pgt_buf_top] was getting set RO. Can you point me to where it's > >>>> done? > >>>> > >>> > >>> As mentioned in the comment, please look at xen_set_pte_init(). > >> > >> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it > >> is already present and read-only. > > > > look at mask_rw_pte and read the threads linked above, unfortunately it > > is not that simple. > > Yes, I was remembering what 32-bit did here. > > The 64-bit version is a bit confused and it often ends up /not/ clearing > RW for the direct mapping of the pages in the pgt_buf because any > existing RW mappings will be used as-is. See phys_pte_init() which > checks for an existing mapping and only sets the PTE if it is not > already set. not all the pagetable pages might be already mapped, even if they are already hooked into the pagetable > pgd_populate(), pud_populate(), and pmd_populate() take care of clearing > RW for the newly allocated page table pages, so mask_rw_pte() only needs > to consider clearing RW for mappings of the the existing page table PFNs > which all lie outside the range (pt_buf_start, pt_buf_top]. > > This patch makes it more sensible, and makes removal of the pv-op safe > if pgt_buf lies outside the initial mapping. > > index 04c1f61..2fd5e86 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -1400,14 +1400,13 @@ static pte_t __init mask_rw_pte(pte_t *ptep, > pte_t pte) > unsigned long pfn = pte_pfn(pte); > > /* > - * If the new pfn is within the range of the newly allocated > - * kernel pagetable, and it isn't being mapped into an > - * early_ioremap fixmap slot as a freshly allocated page, make sure > - * it is RO. > + * If this is a PTE of an early_ioremap fixmap slot but > + * outside the range (pgt_buf_start, pgt_buf_top], then this > + * PTE is mapping a PFN in the current page table. Make > + * sure it is RO. > */ > - if (((!is_early_ioremap_ptep(ptep) && > - pfn >= pgt_buf_start && pfn < pgt_buf_top)) || > - (is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - > 1))) > + if (is_early_ioremap_ptep(ptep) > + && (pfn < pgt_buf_start || pfn >= pgt_buf_top)) > pte = pte_wrprotect(pte); > > return pte; > That's a mistake, you just inverted the condition! What if map_low_page is used to map a page already hooked into the pagetable? It should be RO while with your change it would be RW. Also you don't handle the case when map_low_page is used to map the very latest page, allocated and mapped by alloc_low_page, but still not hooked into the pagetable: that page needs to be RW. I believe you need more RAM than 6G to see all these issues. > >> 8<---------------------- > >> x86: remove x86_init.mapping.pagetable_reserve paravirt op > >> > >> The x86_init.mapping.pagetable_reserve paravirt op is used for Xen > >> guests to set the writable flag for the mapping of (pgt_buf_end, > >> pgt_buf_top]. This is not necessary as these pages are never set as > >> read-only as they have never contained page tables. > > > > Is this actually true? It is possible when pagetable pages are > > allocated by alloc_low_page. > > These newly allocated page table pages will be set read-only when they > are linked into the page tables with pgd_populate(), pud_populate() and > friends. > > >> When running as a Xen guest, the initial page tables are provided by > >> Xen (these are reserved with memblock_reserve() in > >> xen_setup_kernel_pagetable()) and constructed in brk space (for 32-bit > >> guests) or in the kernel's .data section (for 64-bit guests, see > >> head_64.S). > >> > >> Since these are all marked as reserved, (pgt_buf_start, pgt_buf_top] > >> does not overlap with them and the mappings for these PFNs will be > >> read-write. > > > > We are talking about pagetable pages built by > > kernel_physical_mapping_init. > > > > > >> Since Xen doesn't need to change the mapping its implementation > >> becomes the same as a native and we can simply remove this pv-op > >> completely. > > > > I don't think so. > > David > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |