
Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m



Tim, see below re nested-SVM code:
> Hi,
>
> This mostly looks good; I have a few comments on the detail below.
> If those are addressed I hope I can apply the next version.
>
> Any other maintainers want to object to this renaming?
>
> At 22:28 -0500 on 07 Nov (1320704913), Andres Lagar-Cavilla wrote:
>> @@ -1182,6 +1195,7 @@ int hvm_hap_nested_page_fault(unsigned l
>>      mfn_t mfn;
>>      struct vcpu *v = current;
>>      struct p2m_domain *p2m;
>> +    int rc;
>>
>>      /* On Nested Virtualization, walk the guest page table.
>>       * If this succeeds, all is fine.
>> @@ -1255,8 +1269,8 @@ int hvm_hap_nested_page_fault(unsigned l
>>          if ( violation )
>>          {
>>              p2m_mem_access_check(gpa, gla_valid, gla, access_r,
>> access_w, access_x);
>> -
>> -            return 1;
>> +            rc = 1;
>> +            goto out_put_gfn;
>
> Shouldn't this patch be changing the call to gfn_to_mfn_type_p2m() just
> above here to a get_gfn_*() call too?
>
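Agreed -- that lookup should move to the same locked-query pattern, with
the put on the common exit path. A rough sketch of what I mean (the names
follow the new convention but are illustrative, not the literal hunk):

    /* Locked query: locks the gfn entry and takes a ref on the mfn */
    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);

    /* ... fault handling ... */

    rc = 1;
out_put_gfn:
    /* Every successful get_gfn is matched by a put_gfn on every exit */
    put_gfn(p2m->domain, gfn);
    return rc;
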
>> -void hvm_unmap_guest_frame(void *p)
>> +void hvm_unmap_guest_frame(void *p, unsigned long addr, int is_va)
>>  {
>> +    /* We enter this function with a map obtained in
>> __hvm_map_guest_frame.
>> +     * This map performed a p2m query that locked the gfn entry and got
>> +     * a ref on the mfn. Must undo */
>>      if ( p )
>> +    {
>> +        unsigned long gfn = INVALID_GFN;
>> +
>> +        if ( is_va )
>> +        {
>> +            if ( addr )
>> +            {
>> +                uint32_t pfec = PFEC_page_present;
>> +                gfn = paging_gva_to_gfn(current, addr, &pfec);
>> +            } else {
>> +                gfn = INVALID_GFN;
>> +            }
>> +        } else {
>> +            gfn = addr;
>> +        }
>> +
>> +        if ( gfn != INVALID_GFN )
>> +            put_gfn(current->domain, gfn);
>> +
>>          unmap_domain_page(p);
>> +    }
>>  }
>
> This is a pretty weird-looking function now - I think it would be better
> just to make all callers have to remember the GFN.  In fact, since a
> guest VCPU can change its pagetables at any time, it's not safe to use
> paging_gva_to_gfn() to regenerate the GFN from a VA.
>
> Also, IIRC the nested-SVM code keeps the hvm_map mapping around for
> quite a long time so in fact this unmap-and-put-gfn may not be the right
> interface.  Maybe the caller should put_gfn() explicitly.

I just looked into nestedsvm_vmcb_map. It would seem that the map it
stores in nv_vvmcx survives beyond the vmexit, so if we don't put_gfn
we would enter the scheduler with the gfn entry still locked, i.e. in
an atomic context.

The shorthand solution is to put_gfn at unmap time, even though that
could in principle leave the map unsafe on re-entry. I don't believe
that's a problem in practice, so I'll push forward with that.
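
Concretely, what I have in mind is to make unmap purely an unmap, and
have callers that obtained the gfn put it themselves. A sketch (exact
signatures may differ in the next version):

    /* Unmap no longer tries to reconstruct the gfn; it only drops
     * the mapping */
    void hvm_unmap_guest_frame(void *p)
    {
        if ( p )
            unmap_domain_page(p);
    }

    /* Callers remember the gfn they mapped and put it explicitly */
    vvmcx = hvm_map_guest_frame_rw(gfn);
    /* ... use the map ... */
    hvm_unmap_guest_frame(vvmcx);
    put_gfn(current->domain, gfn);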

Is it possible to tear down and re-establish the maps on each hypervisor
entry/exit? That would seem like the ultimate solution.

Andres
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/hvm/svm/svm.c
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -244,9 +244,10 @@ static int svm_vmcb_restore(struct vcpu
>>      {
>>          if ( c->cr0 & X86_CR0_PG )
>>          {
>> -            mfn = mfn_x(gfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT,
>> &p2mt));
>> +            mfn = mfn_x(get_gfn(v->domain, c->cr3 >> PAGE_SHIFT,
>> &p2mt));
>>              if ( !p2m_is_ram(p2mt) || !get_page(mfn_to_page(mfn),
>> v->domain) )
>>              {
>> +                put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>                  gdprintk(XENLOG_ERR, "Invalid CR3 value=0x%"PRIx64"\n",
>>                           c->cr3);
>>                  return -EINVAL;
>> @@ -257,6 +258,8 @@ static int svm_vmcb_restore(struct vcpu
>>              put_page(pagetable_get_page(v->arch.guest_table));
>>
>>          v->arch.guest_table = pagetable_from_pfn(mfn);
>> +        if ( c->cr0 & X86_CR0_PG )
>> +            put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>      }
>>
>>      v->arch.hvm_vcpu.guest_cr[0] = c->cr0 | X86_CR0_ET;
>> @@ -1144,7 +1147,6 @@ static void svm_do_nested_pgfault(struct
>>      unsigned long gfn = gpa >> PAGE_SHIFT;
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>> -    p2m_access_t p2ma;
>>      struct p2m_domain *p2m = NULL;
>>
>>      ret = hvm_hap_nested_page_fault(gpa, 0, ~0ul, 0, 0, 0, 0);
>> @@ -1161,7 +1163,8 @@ static void svm_do_nested_pgfault(struct
>>          p2m = p2m_get_p2m(v);
>>          _d.gpa = gpa;
>>          _d.qualification = 0;
>> -        _d.mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma,
>> p2m_query, NULL));
>> +        mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);
>> +        _d.mfn = mfn_x(mfn);
>
> Ah - this is not quite the same thing.  This lookup uses a specific p2m
> table (possibly a nested-p2m table reflecting the fact the guest is
> running in nested SVM mode) so it can't be replaced by a call that just
> takes the domain pointer (and will internally use the domain's master
> p2m table).
>
> The naming of 'p2m_get_p2m()' is particularly unhelpful here, I realise.
> It fetches the running p2m, i.e. the one that hardware sees as an NPT
> table; almost everywhere else uses p2m_get_hostp2m(), which
> fetches the master p2m.  But in general when you see the
> gfn_to_mfn_*_p2m() functions, that take an explicit p2m pointer, you
> need to be careful.
>
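Good point. For the nested paths I'll keep a lookup that takes the
explicit p2m pointer rather than the domain. Something along these
lines (illustrative naming, following the get_gfn/put_gfn convention):

    /* Query against a specific p2m (possibly a nested one); must not
     * be short-circuited to the host p2m */
    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, p2m_query, NULL);
    /* ... */
    __put_gfn(p2m, gfn);

The get_gfn_query_unlocked(d, ...) shorthand would then stay reserved
for lookups that really do want the domain's host p2m.
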
>>
>>          __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>>      }
>> @@ -1181,7 +1184,7 @@ static void svm_do_nested_pgfault(struct
>>      if ( p2m == NULL )
>>          p2m = p2m_get_p2m(v);
>>      /* Everything else is an error. */
>> -    mfn = gfn_to_mfn_type_p2m(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
>> +    mfn = get_gfn_guest_unlocked(p2m->domain, gfn, &p2mt);
>
> Likewise here.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/guest_walk.c
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -86,30 +86,33 @@ static uint32_t set_ad_bits(void *guest_
>>      return 0;
>>  }
>>
>> +/* If the map is non-NULL, we leave this function having
>> + * called get_gfn, you need to call put_gfn. */
>>  static inline void *map_domain_gfn(struct p2m_domain *p2m,
>>                                     gfn_t gfn,
>>                                     mfn_t *mfn,
>>                                     p2m_type_t *p2mt,
>>                                     uint32_t *rc)
>>  {
>> -    p2m_access_t a;
>> -
>>      /* Translate the gfn, unsharing if shared */
>> -    *mfn = gfn_to_mfn_type_p2m(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare,
>> NULL);
>> +    *mfn = get_gfn_unshare(p2m->domain, gfn_x(gfn), p2mt);
>
> Here's another case where we need to handle the nested-hap path; the p2m
> pointer here must be propagated into the lookup function.
>
>>      if ( p2m_is_paging(*p2mt) )
>>      {
>>          ASSERT(!p2m_is_nestedp2m(p2m));
>>          p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_PAGED;
>>          return NULL;
>>      }
>>      if ( p2m_is_shared(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_SHARED;
>>          return NULL;
>>      }
>>      if ( !p2m_is_ram(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc |= _PAGE_PRESENT;
>>          return NULL;
>>      }
>> @@ -120,6 +123,9 @@ static inline void *map_domain_gfn(struc
>>
>>
>>  /* Walk the guest pagetables, after the manner of a hardware walker. */
>> +/* Because the walk is essentially random, it can cause a deadlock
>> + * warning in the p2m locking code. Highly unlikely this is an actual
>> + * deadlock, because who would walk page tables in the opposite order?
>> */
>
> Linear pagetables make this sort of thing more likely, especially if
> they're used by one process to update another process's mappings.
>
> If this is a problem, maybe we'll need to avoid the deadlock either by
> allowing multiple lock-holders in the p2m or by rearranging the a/d
> writeback so it does another p2m lookup -- I'd rather not do that,
> though, since even on 64-bit it will add a lot of memory accesses.
>
> I'm happy to take a version of this big switchover patch that doesn't
> solve this problem, though.  It can be sorted out in a separate patch.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/hap/nested_hap.c
>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>> @@ -146,22 +146,29 @@ nestedhap_walk_L0_p2m(struct p2m_domain
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>>      p2m_access_t p2ma;
>> +    int rc;
>>
>>      /* walk L0 P2M table */
>>      mfn = gfn_to_mfn_type_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma,
>>                                p2m_query, page_order);
>
> Again, are we not changing this function's name?
>
>
> Cheers,
>
> Tim.
>


