[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
On 16/08/12 12:07, Stefano Stabellini wrote: > On Wed, 15 Aug 2012, David Vrabel wrote: >> On 15/08/12 14:55, Stefano Stabellini wrote: >>> On Wed, 15 Aug 2012, David Vrabel wrote: >>>> On 14/08/12 15:12, Attilio Rao wrote: >>>>> On 14/08/12 14:57, David Vrabel wrote: >>>>>> On 14/08/12 13:24, Attilio Rao wrote: >>>> After looking at it some more, I think this pv-ops is unnecessary. How >>>> about the following patch to just remove it completely? >>>> >>>> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning >>>> is sound. >>> >>> Do you have more then 4G to dom0 on those boxes? >> >> I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE. >> >>> It certainly fixed a serious crash at the time it was introduced, see >>> http://marc.info/?l=linux-kernel&m=129901609503574 and >>> http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big >>> changed in kernel_physical_mapping_init, I think we still need it. >>> Depending on the e820 of your test box, the kernel could crash (or not), >>> possibly in different places. FYI, it looks like pgt_buf is now located low down, which is why these changes worked for me. Possibly this changed as part of a memblock refactor. >>>>>> Having said that, I couldn't immediately see where pages in (end, >>>>>> pgt_buf_top] was getting set RO. Can you point me to where it's >>>>>> done? >>>>>> >>>>> >>>>> As mentioned in the comment, please look at xen_set_pte_init(). >>>> >>>> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it >>>> is already present and read-only. >>> >>> look at mask_rw_pte and read the threads linked above, unfortunately it >>> is not that simple. >> >> Yes, I was remembering what 32-bit did here. >> >> The 64-bit version is a bit confused and it often ends up /not/ clearing >> RW for the direct mapping of the pages in the pgt_buf because any >> existing RW mappings will be used as-is. See phys_pte_init() which >> checks for an existing mapping and only sets the PTE if it is not >> already set. > > not all the pagetable pages might be already mapped, even if they are > already hooked into the pagetable Yes, I think this is easy to handle though. /* * Make sure that active page table pages are not mapped RW. */ if (is_early_ioremap_ptep(ptep)) { /* * If we're updating an early ioremap PTE, then this PFN may * already be in the linear mapping. If it is, use the * existing RW bit. */ unsigned int level; pte_t *linear_pte; linear_pte = lookup_address(__va(PFN_PHYS(pfn)), &level); if (linear_pte && !pte_write(*linear_pt)) pte = pte_wrprotect(pte); } else if (pfn >= pgt_buf_start && pfn < pgt_buf_end) { /* * The PFN may not be mapped but may be hooked into the page * tables. Make sure this new mapping is read-only. */ pte = pte_wrprotect(pte); } However, the real subtlety are page tables that are mapped as they themselves are hooked in. As as example, let's say pgt_buf_start (s) is on a 1 GiB boundary and pgt_buf_top (t) is below the next 2 MiB boundary. We're mapping memory with 4 KiB pages from s to s + 4 MiB. This requires a new PUD and two new PMDs. To map this region: 1. Allocate a new PUD. (@ e = pgt_buf_end) 2. Allocate a new PMD. (@ e + 1) 3. Fill in this PMD's PTEs. This covers pgt_buf so sets (s, e + 1) as RO. 4. Call pmd_populate() to hook-in this PMD. This does not set e+1 as RO (but this is ok as it already is). 5. Allocate a new PMD (@ e + 2) 6. Fill in this PMD's PTEs. 7. Call pmd_populate() to hook in this PMD. This does not set e+2 as RO as the region isn't mapped yet. 8. Call pud_populate() to hook in this PUD. This sets e as RO but e + 2 is still RW. 9. Boom! It may be possible to fixup the permissions as the pages are hooked in. i.e., if this page's entries cover (pgt_buf_start, pgt_buf_end], walk the entries and any child tables and fixup the permissions of the leaf entries. This would walk the tables a few times unless we were careful to only walk them when hooking a page into an active table. It was fun trying to understand this, but I think I'll give up now... David _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |