
Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m



Hi, 

This mostly looks good; I have a few comments on the detail below. 
If those are addressed I hope I can apply the next version.

Any other maintainers want to object to this renaming?  

At 22:28 -0500 on 07 Nov (1320704913), Andres Lagar-Cavilla wrote:
> @@ -1182,6 +1195,7 @@ int hvm_hap_nested_page_fault(unsigned l
>      mfn_t mfn;
>      struct vcpu *v = current;
>      struct p2m_domain *p2m;
> +    int rc;
>  
>      /* On Nested Virtualization, walk the guest page table.
>       * If this succeeds, all is fine.
> @@ -1255,8 +1269,8 @@ int hvm_hap_nested_page_fault(unsigned l
>          if ( violation )
>          {
>              p2m_mem_access_check(gpa, gla_valid, gla, access_r, access_w, 
> access_x);
> -
> -            return 1;
> +            rc = 1;
> +            goto out_put_gfn;

Shouldn't this patch be changing the call to gfn_to_mfn_type_p2m() just
above here to a get_gfn_*() call too?
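
Sketch of what I mean -- "get_gfn_type_access" below is just my
placeholder for whatever the renamed p2m-pointer-taking lookup ends
up being called:

    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
    ...
 out_put_gfn:
    put_gfn(p2m->domain, gfn);
    return rc;

That way the ref taken by the lookup is the one the new out_put_gfn
path releases.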

> -void hvm_unmap_guest_frame(void *p)
> +void hvm_unmap_guest_frame(void *p, unsigned long addr, int is_va)
>  {
> +    /* We enter this function with a map obtained in __hvm_map_guest_frame.
> +     * This map performed a p2m query that locked the gfn entry and got
> +     * a ref on the mfn. Must undo */
>      if ( p )
> +    {
> +        unsigned long gfn = INVALID_GFN;
> +
> +        if ( is_va )
> +        {
> +            if ( addr )
> +            {
> +                uint32_t pfec = PFEC_page_present;
> +                gfn = paging_gva_to_gfn(current, addr, &pfec);
> +            } else {
> +                gfn = INVALID_GFN;
> +            }
> +        } else {
> +            gfn = addr;
> +        }
> +
> +        if ( gfn != INVALID_GFN )
> +            put_gfn(current->domain, gfn);
> +
>          unmap_domain_page(p);
> +    }
>  }

This is a pretty weird-looking function now - I think it would be better
just to make all callers have to remember the GFN.  In fact, since a
guest VCPU can change its pagetables at any time, it's not safe to use 
paging_gva_to_gfn() to regenerate the GFN from a VA. 

Also, IIRC the nested-SVM code keeps the hvm_map mapping around for
quite a long time so in fact this unmap-and-put-gfn may not be the right
interface.  Maybe the caller should put_gfn() explicitly.
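
E.g. (a sketch only; the map side stays as it is apart from losing
the extra arguments):

    void hvm_unmap_guest_frame(void *p)
    {
        if ( p )
            unmap_domain_page(p);
    }

    /* The caller keeps hold of the gfn it mapped and drops the ref
     * itself, once it's actually finished with the frame: */
    hvm_unmap_guest_frame(p);
    put_gfn(current->domain, gfn);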

> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/hvm/svm/svm.c
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -244,9 +244,10 @@ static int svm_vmcb_restore(struct vcpu 
>      {
>          if ( c->cr0 & X86_CR0_PG )
>          {
> -            mfn = mfn_x(gfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
> +            mfn = mfn_x(get_gfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
>              if ( !p2m_is_ram(p2mt) || !get_page(mfn_to_page(mfn), v->domain) 
> )
>              {
> +                put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>                  gdprintk(XENLOG_ERR, "Invalid CR3 value=0x%"PRIx64"\n",
>                           c->cr3);
>                  return -EINVAL;
> @@ -257,6 +258,8 @@ static int svm_vmcb_restore(struct vcpu 
>              put_page(pagetable_get_page(v->arch.guest_table));
>  
>          v->arch.guest_table = pagetable_from_pfn(mfn);
> +        if ( c->cr0 & X86_CR0_PG )
> +            put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>      }
>  
>      v->arch.hvm_vcpu.guest_cr[0] = c->cr0 | X86_CR0_ET;
> @@ -1144,7 +1147,6 @@ static void svm_do_nested_pgfault(struct
>      unsigned long gfn = gpa >> PAGE_SHIFT;
>      mfn_t mfn;
>      p2m_type_t p2mt;
> -    p2m_access_t p2ma;
>      struct p2m_domain *p2m = NULL;
>  
>      ret = hvm_hap_nested_page_fault(gpa, 0, ~0ul, 0, 0, 0, 0);
> @@ -1161,7 +1163,8 @@ static void svm_do_nested_pgfault(struct
>          p2m = p2m_get_p2m(v);
>          _d.gpa = gpa;
>          _d.qualification = 0;
> -        _d.mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma, 
> p2m_query, NULL));
> +        mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);
> +        _d.mfn = mfn_x(mfn);

Ah - this is not quite the same thing.  This lookup uses a specific p2m
table (possibly a nested-p2m table reflecting the fact the guest is
running in nested SVM mode) so it can't be replaced by a call that just
takes the domain pointer (and will internally use the domain's master
p2m table). 

The naming of 'p2m_get_p2m()' is particularly unhelpful here, I realise.
It fetches the running p2m, i.e. the one that hardware sees as an NPT
table; almost everywhere else uses p2m_get_hostp2m(), which
fetches the master p2m.  But in general when you see the
gfn_to_mfn_*_p2m() functions, that take an explicit p2m pointer, you
need to be careful.
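
To make the distinction concrete, using only names already in this
patch:

    struct p2m_domain *p2m = p2m_get_p2m(v);  /* the p2m hardware sees:
                                               * a nested p2m when the
                                               * guest runs nested SVM */

    /* This walks that specific table... */
    mfn = gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL);

    /* ...whereas this takes a domain pointer and internally uses the
     * domain's master p2m (p2m_get_hostp2m(p2m->domain)) -- a
     * different table when the guest is running nested: */
    mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);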

>          
>          __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>      }
> @@ -1181,7 +1184,7 @@ static void svm_do_nested_pgfault(struct
>      if ( p2m == NULL )
>          p2m = p2m_get_p2m(v);
>      /* Everything else is an error. */
> -    mfn = gfn_to_mfn_type_p2m(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
> +    mfn = get_gfn_guest_unlocked(p2m->domain, gfn, &p2mt);

Likewise here. 

> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/guest_walk.c
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -86,30 +86,33 @@ static uint32_t set_ad_bits(void *guest_
>      return 0;
>  }
>  
> +/* If the map is non-NULL, we leave this function having 
> + * called get_gfn, you need to call put_gfn. */
>  static inline void *map_domain_gfn(struct p2m_domain *p2m,
>                                     gfn_t gfn, 
>                                     mfn_t *mfn,
>                                     p2m_type_t *p2mt,
>                                     uint32_t *rc) 
>  {
> -    p2m_access_t a;
> -
>      /* Translate the gfn, unsharing if shared */
> -    *mfn = gfn_to_mfn_type_p2m(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare, NULL);
> +    *mfn = get_gfn_unshare(p2m->domain, gfn_x(gfn), p2mt);

Here's another case where we need to handle the nested-hap path; the p2m
pointer here must be propagated into the lookup function. 
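
I.e. keep the explicit p2m pointer, along the lines of (the _p2m
suffix is just my placeholder for the new spelling):

    *mfn = get_gfn_unshare_p2m(p2m, gfn_x(gfn), p2mt);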

>      if ( p2m_is_paging(*p2mt) )
>      {
>          ASSERT(!p2m_is_nestedp2m(p2m));
>          p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
> +        put_gfn(p2m->domain, gfn_x(gfn));
>          *rc = _PAGE_PAGED;
>          return NULL;
>      }
>      if ( p2m_is_shared(*p2mt) )
>      {
> +        put_gfn(p2m->domain, gfn_x(gfn));
>          *rc = _PAGE_SHARED;
>          return NULL;
>      }
>      if ( !p2m_is_ram(*p2mt) ) 
>      {
> +        put_gfn(p2m->domain, gfn_x(gfn));
>          *rc |= _PAGE_PRESENT;
>          return NULL;
>      }
> @@ -120,6 +123,9 @@ static inline void *map_domain_gfn(struc
>  
>  
>  /* Walk the guest pagetables, after the manner of a hardware walker. */
> +/* Because the walk is essentially random, it can cause a deadlock 
> + * warning in the p2m locking code. Highly unlikely this is an actual
> + * deadlock, because who would walk page table in the opposite order? */

Linear pagetables make this sort of thing more likely, especially if
they're used by one process to update another process's mappings.
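
Schematically (pseudocode), the pattern the deadlock checker worries
about is two walkers taking per-gfn locks in opposite orders:

    /* vcpu A: pagetable at gfn X, with a PTE pointing at gfn Y */
    get_gfn(d, X, &t);  get_gfn(d, Y, &t);

    /* vcpu B: visits the same two frames the other way round, e.g.
     * via a linear mapping; each then blocks on the other's second
     * lock */
    get_gfn(d, Y, &t);  get_gfn(d, X, &t);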

If this is a problem, maybe we'll need to avoid the deadlock either by
allowing multiple lock-holders in the p2m or by rearranging the a/d
writeback so it does another p2m lookup -- I'd rather not do that,
though, since even on 64-bit it will add a lot of memory accesses.

I'm happy to take a version of this big switchover patch that doesn't
solve this problem, though.  It can be sorted out in a separate patch.

> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/hap/nested_hap.c
> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -146,22 +146,29 @@ nestedhap_walk_L0_p2m(struct p2m_domain 
>      mfn_t mfn;
>      p2m_type_t p2mt;
>      p2m_access_t p2ma;
> +    int rc;
>  
>      /* walk L0 P2M table */
>      mfn = gfn_to_mfn_type_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma, 
>                                p2m_query, page_order);

Again, are we not changing this function's name?
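
(I'd expect it to become something like the below, where
get_gfn_type_access_p2m is a placeholder for whatever the
p2m-pointer-taking spelling ends up being:)

    mfn = get_gfn_type_access_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma,
                                  p2m_query, page_order);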


Cheers,

Tim.
