Xen project Mailing List

Re: [Xen-devel] [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access

To: "Aravindh Puthiyaparambil" <aravindp@xxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Thu, 24 Jul 2014 15:29:11 +0100

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Keir Fraser <keir@xxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>

Delivery-date: Thu, 24 Jul 2014 14:29:33 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 08.07.14 at 04:50, <aravindp@xxxxxxxxx> wrote: > +/* Get the page permission of the mfn from page_info->shadow_flags */ > +static mfn_t > +p2m_mem_access_get_entry(struct p2m_domain *p2m, unsigned long gfn, > + p2m_type_t *t, p2m_access_t *a, p2m_query_t q, > + unsigned int *page_order) > +{ > + struct domain *d = p2m->domain; > + /* For PV guests mfn == gfn */ Valid, but why is this not being checked in any way in the earlier "set" counterpart? > + mfn_t mfn = _mfn(gfn); > + struct page_info *page = mfn_to_page(mfn); > + > + ASSERT(shadow_mode_enabled(d)); > + > + *t = p2m_ram_rw; > + > + if ( page_get_owner(page) != d ) I think there's a mfn_valid() check missing prior to this re-ref of page. > @@ -1356,7 +1357,7 @@ void shadow_prealloc(struct domain *d, u32 type, > unsigned int count) > > /* Deliberately free all the memory we can: this will tear down all of > * this domain's shadows */ > -static void shadow_blow_tables(struct domain *d) > +void shadow_blow_tables(struct domain *d) This doesn't belong here - the patch doesn't add any new/external caller of this function. > @@ -2435,15 +2436,20 @@ int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn) > /* If that didn't catch the mapping, something is very wrong */ > if ( !sh_check_page_has_no_refs(page) ) > { > - /* Don't complain if we're in HVM and there are some extra mappings: > + /* > + * Don't complain if we're in HVM and there are some extra mappings: > * The qemu helper process has an untyped mapping of this dom's RAM > > * and the HVM restore program takes another. > * Also allow one typed refcount for xenheap pages, to match > - * share_xen_page_with_guest(). */ > + * share_xen_page_with_guest(). > + * PV domains that have a mem_access listener, runs in shadow mode > + * without refcounts. > + */ > if ( !(shadow_mode_external(v->domain) > && (page->count_info & PGC_count_mask) <= 3 > && ((page->u.inuse.type_info & PGT_count_mask) > - == !!is_xen_heap_page(page))) ) > + == !!is_xen_heap_page(page))) && > + !mem_event_check_ring(&v->domain->mem_event->access) ) To me this doesn't look to be in sync with the comment, as the new check is being carried out regardless of domain type. Furthermore this continues to have the problem of also hiding issues unrelated to mem-access handling. > @@ -3009,7 +3020,84 @@ static int sh_page_fault(struct vcpu *v, > > /* What mfn is the guest trying to access? */ > gfn = guest_l1e_get_gfn(gw.l1e); > - gmfn = get_gfn(d, gfn, &p2mt); > + if ( likely(!mem_event_check_ring(&d->mem_event->access)) ) > + gmfn = get_gfn(d, gfn, &p2mt); > + /* > + * A mem_access listener is present, so we will first check if a > violation > + * has occurred. > + */ > + else > + { > + struct p2m_domain *p2m = p2m_get_hostp2m(v->domain); > + p2m_access_t p2ma; > + > + gmfn = get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, 0, NULL); > + if ( mfn_valid(gmfn) && !sh_mfn_is_a_page_table(gmfn) > + && regs->error_code & PFEC_page_present > + && !(regs->error_code & PFEC_reserved_bit) ) > + { > + int violation = 0; > + bool_t access_w = !!(regs->error_code & PFEC_write_access); > + bool_t access_x = !!(regs->error_code & PFEC_insn_fetch); > + bool_t access_r = access_x ? 0 : !access_w; "violation" looks to be a boolean just like the other three, so wants to also be bool_t. > + > + /* If the access is against the permissions, then send to > mem_event */ > + switch ( p2ma ) > + { > + case p2m_access_r: > + violation = access_w || access_x; > + break; > + case p2m_access_rx: > + case p2m_access_rx2rw: > + violation = access_w; > + break; > + case p2m_access_rw: > + violation = access_x; > + break; > + case p2m_access_rwx: > + default: > + break; > + } > + > + /* > + * Do not police writes to guest memory from the Xen hypervisor. > + * This keeps PV mem_access on par with HVM. Turn off CR0.WP > here to > + * allow the write to go through if the guest has marked the > page as > + * writable. Turn it back on in the guest access functions > + * __copy_to_user / __put_user_size() after the write is > completed. > + */ > + if ( violation && access_w && > + regs->eip >= XEN_VIRT_START && regs->eip <= XEN_VIRT_END ) Definitely < instead of <= on the right side. But - is this safe, the more that this again doesn't appear to be sitting in a guest kind specific block? I'd at least expect this to be qualified by a regs->cs and/or guest_mode() check. > + { > + unsigned long cr0 = read_cr0(); > + > + violation = 0; > + if ( cr0 & X86_CR0_WP && > + guest_l1e_get_flags(gw.l1e) & _PAGE_RW ) > + { > + cr0 &= ~X86_CR0_WP; > + write_cr0(cr0); > + v->arch.pv_vcpu.need_cr0_wp_set = 1; PV field access within a non-PV-only code block? > + } I wonder how well the window where you're running with CR0.WP clear is bounded: The flag serves as a kind of security measure, and hence shouldn't be left off for extended periods of time. > + } > + > + if ( violation ) > + { > + paddr_t gpa = (mfn_x(gmfn) << PAGE_SHIFT) + > + (va & ((1 << PAGE_SHIFT) - 1)); > + if ( !p2m_mem_access_check(gpa, 1, va, access_r, access_w, > + access_x, &req_ptr) ) > + { > + SHADOW_PRINTK("Page access %c%c%c for gmfn=%"PRI_mfn" > p2ma: %d\n", > + (access_r ? 'r' : '-'), > + (access_w ? 'w' : '-'), > + (access_x ? 'x' : '-'), mfn_x(gmfn), p2ma); > + /* Rights not promoted, vcpu paused, work here is done */ > + goto out_put_gfn; Rather than re-using just two lines from the normal exit path and introducing to it a mem-access specific code block, put the exit processing here, at once allowing req_ptr to be limited in scope? > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -43,6 +43,7 @@ > #include <asm/page.h> > #include <asm/numa.h> > #include <asm/flushtlb.h> > +#include <asm/shadow.h> Breaking ARM? > #ifdef CONFIG_X86 > #include <asm/p2m.h> > #include <asm/setup.h> /* for highmem_start only */ > @@ -1660,6 +1661,8 @@ int assign_pages( > page_set_owner(&pg[i], d); > smp_wmb(); /* Domain pointer must be visible before updating refcnt. > */ > pg[i].count_info = PGC_allocated | 1; > + if ( is_pv_domain(d) ) > + shadow_set_access(&pg[i], p2m_get_hostp2m(d)->default_access); I don't think you should call shadow code from here. > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -380,6 +380,12 @@ struct pv_vcpu > /* Deferred VA-based update state. */ > bool_t need_update_runstate_area; > struct vcpu_time_info pending_system_time; > + > + /* > + * Flag that tracks if CR0.WP needs to be set after a Xen write to guest > + * memory when a PV domain has a mem_access listener attached to it. > + */ > + bool_t need_cr0_wp_set; > }; The new field would better go above need_update_runstate_area for better space usage. > --- a/xen/include/asm-x86/x86_64/uaccess.h > +++ b/xen/include/asm-x86/x86_64/uaccess.h > @@ -1,6 +1,8 @@ > #ifndef __X86_64_UACCESS_H > #define __X86_64_UACCESS_H > > +#include <xen/sched.h> This is pretty ugly. You only reference the needed fields in a macro, so you don't strictly need to include this here as long as all (or at least most - the rest could be patched up) use sites include it. > @@ -65,6 +67,11 @@ do { > \ > case 8: __put_user_asm(x,ptr,retval,"q","","ir",errret);break; \ > default: __put_user_bad(); \ > } \ > + if ( unlikely(current->arch.pv_vcpu.need_cr0_wp_set) ) \ > + { \ > + write_cr0(read_cr0() | X86_CR0_WP); \ > + current->arch.pv_vcpu.need_cr0_wp_set = 0; \ > + } \ > } while (0) Please obey to the original indentation method. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.