[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS



On 15/08/12 14:55, Stefano Stabellini wrote:
> On Wed, 15 Aug 2012, David Vrabel wrote:
>> On 14/08/12 15:12, Attilio Rao wrote:
>>> On 14/08/12 14:57, David Vrabel wrote:
>>>> On 14/08/12 13:24, Attilio Rao wrote:
>> After looking at it some more, I think this pv-ops is unnecessary. How
>> about the following patch to just remove it completely?
>>
>> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning
>> is sound.
> 
> Do you have more then 4G to dom0 on those boxes?

I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE.

> It certainly fixed a serious crash at the time it was introduced, see
> http://marc.info/?l=linux-kernel&m=129901609503574 and
> http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big
> changed in kernel_physical_mapping_init, I think we still need it.
> Depending on the e820 of your test box, the kernel could crash (or not),
> possibly in different places.
>
>>>> Having said that, I couldn't immediately see where pages in (end, 
>>>> pgt_buf_top] was getting set RO.  Can you point me to where it's 
>>>> done?
>>>>
>>>
>>> As mentioned in the comment, please look at xen_set_pte_init().
>>
>> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it
>> is already present and read-only.
> 
> look at mask_rw_pte and read the threads linked above, unfortunately it
> is not that simple.

Yes, I was remembering what 32-bit did here.

The 64-bit version is a bit confused and it often ends up /not/ clearing
RW for the direct mapping of the pages in the pgt_buf because any
existing RW mappings will be used as-is.  See phys_pte_init() which
checks for an existing mapping and only sets the PTE if it is not
already set.

pgd_populate(), pud_populate(), and pmd_populate() take care of clearing
RW for the newly allocated page table pages, so mask_rw_pte() only needs
to consider clearing RW for mappings of the the existing page table PFNs
which all lie outside the range (pt_buf_start, pt_buf_top].

This patch makes it more sensible, and makes removal of the pv-op safe
if pgt_buf lies outside the initial mapping.

index 04c1f61..2fd5e86 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1400,14 +1400,13 @@ static pte_t __init mask_rw_pte(pte_t *ptep,
pte_t pte)
        unsigned long pfn = pte_pfn(pte);

        /*
-        * If the new pfn is within the range of the newly allocated
-        * kernel pagetable, and it isn't being mapped into an
-        * early_ioremap fixmap slot as a freshly allocated page, make sure
-        * it is RO.
+        * If this is a PTE of an early_ioremap fixmap slot but
+        * outside the range (pgt_buf_start, pgt_buf_top], then this
+        * PTE is mapping a PFN in the current page table.  Make
+        * sure it is RO.
         */
-       if (((!is_early_ioremap_ptep(ptep) &&
-                       pfn >= pgt_buf_start && pfn < pgt_buf_top)) ||
-                       (is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - 
1)))
+       if (is_early_ioremap_ptep(ptep)
+           && (pfn < pgt_buf_start || pfn >= pgt_buf_top))
                pte = pte_wrprotect(pte);

        return pte;


>> 8<----------------------
>> x86: remove x86_init.mapping.pagetable_reserve paravirt op
>>
>> The x86_init.mapping.pagetable_reserve paravirt op is used for Xen
>> guests to set the writable flag for the mapping of (pgt_buf_end,
>> pgt_buf_top].  This is not necessary as these pages are never set as
>> read-only as they have never contained page tables.
> 
> Is this actually true? It is possible when pagetable pages are
> allocated by alloc_low_page.

These newly allocated page table pages will be set read-only when they
are linked into the page tables with pgd_populate(), pud_populate() and
friends.

>> When running as a Xen guest, the initial page tables are provided by
>> Xen (these are reserved with memblock_reserve() in
>> xen_setup_kernel_pagetable()) and constructed in brk space (for 32-bit
>> guests) or in the kernel's .data section (for 64-bit guests, see
>> head_64.S).
>>
>> Since these are all marked as reserved, (pgt_buf_start, pgt_buf_top]
>> does not overlap with them and the mappings for these PFNs will be
>> read-write.
> 
> We are talking about pagetable pages built by
> kernel_physical_mapping_init.
> 
> 
>> Since Xen doesn't need to change the mapping its implementation
>> becomes the same as a native and we can simply remove this pv-op
>> completely.
> 
> I don't think so.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.