
[Xen-devel] NUMA_BALANCING and Xen PV guest regression in 3.20-rc0



Mel,

The NUMA_BALANCING series beginning with 5d833062139d (mm: numa: do not
dereference pmd outside of the lock during NUMA hinting fault) and
specifically 8a0516ed8b90 (mm: convert p[te|md]_numa users to
p[te|md]_protnone_numa) breaks Xen 64-bit PV guests.

Any fault on a present userspace mapping (e.g., a write to a read-only
mapping) is being misinterpreted as a NUMA hinting fault and not handled
correctly.  All userspace programs end up continuously faulting.

This is because the hypervisor sets _PAGE_GLOBAL (== _PAGE_PROTNONE) on
all present userspace page table entries.

Note that the comment in asm/pgtable_types.h says that
_PAGE_BIT_PROTNONE is only valid on non-present entries:

  /* If _PAGE_BIT_PRESENT is clear, we use these: */
  /* - if the user mapped it with PROT_NONE; pte_present gives true */
  #define _PAGE_BIT_PROTNONE    _PAGE_BIT_GLOBAL

Adjusting pte_protnone() and pmd_protnone() to check for the absence of
_PAGE_PRESENT allows 64-bit Xen PV guests to work correctly again (see
following patch), but I'm not sure whether NUMA_BALANCING would still
work correctly with this change.

David

8<---------------------------
x86: pte_protnone() and pmd_protnone() must check entry is
 not present

Since _PAGE_PROTNONE aliases _PAGE_GLOBAL it is only valid if
_PAGE_PRESENT is clear.  Make pte_protnone() and pmd_protnone() check
for this.

This fixes a 64-bit Xen PV guest regression introduced by
8a0516ed8b90c95ffa1363b420caa37418149f21 (mm: convert p[te|md]_numa
users to p[te|md]_protnone_numa).  Any userspace process would
endlessly fault.

In a 64-bit PV guest, userspace page table entries have _PAGE_GLOBAL
set by the hypervisor.  This meant that any fault on a present
userspace entry (e.g., a write to a read-only mapping) would be
misinterpreted as a NUMA hinting fault and the fault would not be
correctly handled, resulting in the access endlessly faulting.

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
---
 arch/x86/include/asm/pgtable.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 67fc3d2..a0c35bf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -476,12 +476,14 @@ static inline int pmd_present(pmd_t pmd)
  */
 static inline int pte_protnone(pte_t pte)
 {
-       return pte_flags(pte) & _PAGE_PROTNONE;
+       return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+               == _PAGE_PROTNONE;
 }

 static inline int pmd_protnone(pmd_t pmd)
 {
-       return pmd_flags(pmd) & _PAGE_PROTNONE;
+       return (pmd_flags(pmd) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+               == _PAGE_PROTNONE;
 }
 #endif /* CONFIG_NUMA_BALANCING */

-- 
1.7.10.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

