[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access



>>> On 08.07.14 at 04:50, <aravindp@xxxxxxxxx> wrote:
> +/* Get the page permission of the mfn from page_info->shadow_flags */
> +static mfn_t
> +p2m_mem_access_get_entry(struct p2m_domain *p2m, unsigned long gfn,
> +                         p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
> +                         unsigned int *page_order)
> +{
> +    struct domain *d = p2m->domain;
> +    /* For PV guests mfn == gfn */

Valid, but why is this not being checked in any way in the earlier
"set" counterpart?

> +    mfn_t mfn = _mfn(gfn);
> +    struct page_info *page = mfn_to_page(mfn);
> +
> +    ASSERT(shadow_mode_enabled(d));
> +
> +    *t = p2m_ram_rw;
> +
> +    if ( page_get_owner(page) != d )

I think there's a mfn_valid() check missing prior to this re-ref of page.

> @@ -1356,7 +1357,7 @@ void shadow_prealloc(struct domain *d, u32 type, 
> unsigned int count)
>  
>  /* Deliberately free all the memory we can: this will tear down all of
>   * this domain's shadows */
> -static void shadow_blow_tables(struct domain *d) 
> +void shadow_blow_tables(struct domain *d)

This doesn't belong here - the patch doesn't add any new/external
caller of this function.

> @@ -2435,15 +2436,20 @@ int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
>      /* If that didn't catch the mapping, something is very wrong */
>      if ( !sh_check_page_has_no_refs(page) )
>      {
> -        /* Don't complain if we're in HVM and there are some extra mappings: 
> +        /*
> +         * Don't complain if we're in HVM and there are some extra mappings:
>           * The qemu helper process has an untyped mapping of this dom's RAM 
> 
>           * and the HVM restore program takes another.
>           * Also allow one typed refcount for xenheap pages, to match
> -         * share_xen_page_with_guest(). */
> +         * share_xen_page_with_guest().
> +         * PV domains that have a mem_access listener, runs in shadow mode
> +         * without refcounts.
> +         */
>          if ( !(shadow_mode_external(v->domain)
>                 && (page->count_info & PGC_count_mask) <= 3
>                 && ((page->u.inuse.type_info & PGT_count_mask)
> -                   == !!is_xen_heap_page(page))) )
> +                   == !!is_xen_heap_page(page))) &&
> +             !mem_event_check_ring(&v->domain->mem_event->access) )

To me this doesn't look to be in sync with the comment, as the new
check is being carried out regardless of domain type. Furthermore
this continues to have the problem of also hiding issues unrelated
to mem-access handling.

> @@ -3009,7 +3020,84 @@ static int sh_page_fault(struct vcpu *v,
>  
>      /* What mfn is the guest trying to access? */
>      gfn = guest_l1e_get_gfn(gw.l1e);
> -    gmfn = get_gfn(d, gfn, &p2mt);
> +    if ( likely(!mem_event_check_ring(&d->mem_event->access)) )
> +        gmfn = get_gfn(d, gfn, &p2mt);
> +    /*
> +     * A mem_access listener is present, so we will first check if a 
> violation
> +     * has occurred.
> +     */
> +    else
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
> +        p2m_access_t p2ma;
> +
> +        gmfn = get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, 0, NULL);
> +        if ( mfn_valid(gmfn) && !sh_mfn_is_a_page_table(gmfn)
> +             && regs->error_code & PFEC_page_present
> +             && !(regs->error_code & PFEC_reserved_bit) )
> +        {
> +            int violation = 0;
> +            bool_t access_w = !!(regs->error_code & PFEC_write_access);
> +            bool_t access_x = !!(regs->error_code & PFEC_insn_fetch);
> +            bool_t access_r = access_x ? 0 : !access_w;

"violation" looks to be a boolean just like the other three, so wants
to also be bool_t.

> +
> +            /* If the access is against the permissions, then send to 
> mem_event */
> +            switch ( p2ma )
> +            {
> +            case p2m_access_r:
> +                violation = access_w || access_x;
> +                break;
> +            case p2m_access_rx:
> +            case p2m_access_rx2rw:
> +                violation = access_w;
> +                break;
> +            case p2m_access_rw:
> +                violation = access_x;
> +                break;
> +            case p2m_access_rwx:
> +            default:
> +                break;
> +            }
> +
> +            /*
> +             * Do not police writes to guest memory from the Xen hypervisor.
> +             * This keeps PV mem_access on par with HVM. Turn off CR0.WP 
> here to
> +             * allow the write to go through if the guest has marked the 
> page as
> +             * writable. Turn it back on in the guest access functions
> +             * __copy_to_user / __put_user_size() after the write is 
> completed.
> +             */
> +            if ( violation && access_w &&
> +                 regs->eip >= XEN_VIRT_START && regs->eip <= XEN_VIRT_END )

Definitely < instead of <= on the right side. But - is this safe, the more
that this again doesn't appear to be sitting in a guest kind specific block?
I'd at least expect this to be qualified by a regs->cs and/or
guest_mode() check.

> +            {
> +                unsigned long cr0 = read_cr0();
> +
> +                violation = 0;
> +                if ( cr0 & X86_CR0_WP &&
> +                     guest_l1e_get_flags(gw.l1e) & _PAGE_RW )
> +                {
> +                    cr0 &= ~X86_CR0_WP;
> +                    write_cr0(cr0);
> +                    v->arch.pv_vcpu.need_cr0_wp_set = 1;

PV field access within a non-PV-only code block?

> +                }

I wonder how well the window where you're running with CR0.WP clear
is bounded: The flag serves as a kind of security measure, and hence
shouldn't be left off for extended periods of time.

> +            }
> +
> +            if ( violation )
> +            {
> +                paddr_t gpa = (mfn_x(gmfn) << PAGE_SHIFT) +
> +                              (va & ((1 << PAGE_SHIFT) - 1));
> +                if ( !p2m_mem_access_check(gpa, 1, va, access_r, access_w,
> +                                           access_x, &req_ptr) )
> +                {
> +                    SHADOW_PRINTK("Page access %c%c%c for gmfn=%"PRI_mfn" 
> p2ma: %d\n",
> +                                  (access_r ? 'r' : '-'),
> +                                  (access_w ? 'w' : '-'),
> +                                  (access_x ? 'x' : '-'), mfn_x(gmfn), p2ma);
> +                    /* Rights not promoted, vcpu paused, work here is done */
> +                    goto out_put_gfn;

Rather than re-using just two lines from the normal exit path and
introducing to it a mem-access specific code block, put the exit
processing here, at once allowing req_ptr to be limited in scope?

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -43,6 +43,7 @@
>  #include <asm/page.h>
>  #include <asm/numa.h>
>  #include <asm/flushtlb.h>
> +#include <asm/shadow.h>

Breaking ARM?

>  #ifdef CONFIG_X86
>  #include <asm/p2m.h>
>  #include <asm/setup.h> /* for highmem_start only */
> @@ -1660,6 +1661,8 @@ int assign_pages(
>          page_set_owner(&pg[i], d);
>          smp_wmb(); /* Domain pointer must be visible before updating refcnt. 
> */
>          pg[i].count_info = PGC_allocated | 1;
> +        if ( is_pv_domain(d) )
> +            shadow_set_access(&pg[i], p2m_get_hostp2m(d)->default_access);

I don't think you should call shadow code from here.

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -380,6 +380,12 @@ struct pv_vcpu
>      /* Deferred VA-based update state. */
>      bool_t need_update_runstate_area;
>      struct vcpu_time_info pending_system_time;
> +
> +    /*
> +     * Flag that tracks if CR0.WP needs to be set after a Xen write to guest
> +     * memory when a PV domain has a mem_access listener attached to it.
> +     */
> +    bool_t need_cr0_wp_set;
>  };

The new field would better go above need_update_runstate_area
for better space usage.

> --- a/xen/include/asm-x86/x86_64/uaccess.h
> +++ b/xen/include/asm-x86/x86_64/uaccess.h
> @@ -1,6 +1,8 @@
>  #ifndef __X86_64_UACCESS_H
>  #define __X86_64_UACCESS_H
>  
> +#include <xen/sched.h>

This is pretty ugly. You only reference the needed fields in a macro,
so you don't strictly need to include this here as long as all (or at least
most - the rest could be patched up) use sites include it.

> @@ -65,6 +67,11 @@ do {                                                       
>                 \
>       case 8: __put_user_asm(x,ptr,retval,"q","","ir",errret);break;  \
>       default: __put_user_bad();                                      \
>       }                                                               \
> +    if ( unlikely(current->arch.pv_vcpu.need_cr0_wp_set) ) \
> +    { \
> +        write_cr0(read_cr0() | X86_CR0_WP); \
> +        current->arch.pv_vcpu.need_cr0_wp_set = 0; \
> +    } \
>  } while (0)

Please obey to the original indentation method.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.