
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Introduce relinquish_p2m_mapping to remove refcount every mapped page



On Tue, 2013-12-10 at 01:31 +0000, Julien Grall wrote:
> 
> On 12/09/2013 04:28 PM, Ian Campbell wrote:
> > On Mon, 2013-12-09 at 03:34 +0000, Julien Grall wrote:
> >> This function will be called when the domain relinquishes its memory.
> >> It removes the refcount on every page mapped to a valid MFN.
> >>
> >> Currently, Xen doesn't take a refcount on every new mapping, only on
> >> foreign mappings. Restrict the function to foreign mappings only.
> >
> > Skimming the titles of the remaining patches and recalling a previous
> > conversation, the intention is not to extend this for 4.4, correct?
> 
> Right, it's too big for Xen 4.4.
> 
> >>
> >> Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
> >>
> >> ---
> >>      Changes in v2:
> >>          - Introduce the patch
> >> ---
> >>   xen/arch/arm/domain.c        |    5 +++++
> >>   xen/arch/arm/p2m.c           |   47 ++++++++++++++++++++++++++++++++++++++++++
> >>   xen/include/asm-arm/domain.h |    1 +
> >>   xen/include/asm-arm/p2m.h    |   15 ++++++++++++++
> >>   4 files changed, 68 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> >> index 1590708..e7c2f67 100644
> >> --- a/xen/arch/arm/domain.c
> >> +++ b/xen/arch/arm/domain.c
> >> @@ -717,6 +717,11 @@ int domain_relinquish_resources(struct domain *d)
> >>           if ( ret )
> >>               return ret;
> >>
> >> +    case RELMEM_mapping:
> >
> > Something somewhere should be setting d->arch.relmem = RELMEM_mapping at
> > the appropriate time. (immediately above I think?)
> >
> > You also want a "Fallthrough" comment just above.
> 
> Oops, I will update the patch for the next version.
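
For the record, I'd expect the end state to look roughly like this (an
untested sketch; the preceding case label and the list passed to
relinquish_memory are placeholders, I don't have the file in front of
me):

    case RELMEM_xen:                     /* whichever case precedes this */
        ret = relinquish_memory(d, &d->page_list);    /* placeholder */
        if ( ret )
            return ret;

        d->arch.relmem = RELMEM_mapping; /* <-- the missing assignment */
        /* Fallthrough */

    case RELMEM_mapping:
        ret = relinquish_p2m_mapping(d);
        if ( ret )
            return ret;

        d->arch.relmem = RELMEM_done;
        /* Fallthrough */
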
> 
> >> +        ret = relinquish_p2m_mapping(d);
> >> +        if ( ret )
> >> +            return ret;
> >> +
> >>           d->arch.relmem = RELMEM_done;
> >>           /* Fallthrough */
> >>
> >> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> >> index f0bbaca..dbd6a06 100644
> >> --- a/xen/arch/arm/p2m.c
> >> +++ b/xen/arch/arm/p2m.c
> >> @@ -6,6 +6,7 @@
> >>   #include <xen/bitops.h>
> >>   #include <asm/flushtlb.h>
> >>   #include <asm/gic.h>
> >> +#include <asm/event.h>
> >>
> >>   /* First level P2M is 2 consecutive pages */
> >>   #define P2M_FIRST_ORDER 1
> >> @@ -320,6 +321,16 @@ static int create_p2m_entries(struct domain *d,
> >>               flush_tlb_all_local();
> >>       }
> >>
> >> +    if ( (t == p2m_ram_rw) || (t == p2m_ram_ro) || (t == p2m_map_foreign))
> >> +    {
> >> +        unsigned long sgfn = paddr_to_pfn(start_gpaddr);
> >> +        unsigned long egfn = paddr_to_pfn(end_gpaddr);
> >> +
> >> +        p2m->max_mapped_gfn = MAX(p2m->max_mapped_gfn, egfn);
> >> +        /* Use next_gfn_to_relinquish to store the lowest gfn mapped */
> >> +        p2m->next_gfn_to_relinquish = MIN(p2m->next_gfn_to_relinquish, sgfn);
> >> +    }
> >> +
> >>       rc = 0;
> >>
> >>   out:
> >> @@ -503,12 +514,48 @@ int p2m_init(struct domain *d)
> >>
> >>       p2m->first_level = NULL;
> >>
> >> +    p2m->max_mapped_gfn = 0;
> >> +    p2m->next_gfn_to_relinquish = ULONG_MAX;
> >> +
> >>   err:
> >>       spin_unlock(&p2m->lock);
> >>
> >>       return rc;
> >>   }
> >>
> >> +int relinquish_p2m_mapping(struct domain *d)
> >> +{
> >> +    struct p2m_domain *p2m = &d->arch.p2m;
> >> +    unsigned long gfn, count = 0;
> >> +    int rc = 0;
> >> +
> >> +    for ( gfn = p2m->next_gfn_to_relinquish;
> >> +          gfn < p2m->max_mapped_gfn; gfn++ )
> >
> > I know that Tim has been keen to get rid of these sorts of loops on x86,
> > and with good reason I think.
> >
> >> +    {
> >> +        p2m_type_t t;
> >> +        paddr_t p = p2m_lookup(d, gfn, &t);
> >
> > This does the full walk for each address, even though 2/3 of the levels
> > are more than likely identical to the previous gfn.
> >
> > It would be better to do a full walk, which sadly will look a lot like
> > p2m_lookup, no avoiding that I think.
> >
> > You can still resume the walk based on next_gfn_to_relinquish and bound
> > it on max_mapped_gfn, although I don't think it is strictly necessary.
> >> +        unsigned long mfn = p >> PAGE_SHIFT;
> >> +
> >> +        if ( mfn_valid(mfn) && p2m_is_foreign(t) )
> >
> > I think it would be worth reiterating in a comment that we only take a
> > ref for foreign mappings right now.
> 
> Will do.
> 
> >> +        {
> >> +            put_page(mfn_to_page(mfn));
> >> +            guest_physmap_remove_page(d, gfn, mfn, 0);
> >
> > You should unmap it and then put it I think.
> >
> > Is this going to do yet another lookup/walk?
> >
> > The REMOVE case of create_p2m_entries is so trivial you could open code
> > it here, or if you wanted to you could refactor it into a helper.
> >
> > I am wondering if the conditional put page ought to be at the point of
> > removal (i.e. in the helper) rather than here. (I think Tim made a
> > similar comment on the x86 version of the remove_from_physmap pvh
> > patches, you probably need to match the generic change which that
> > implies)
> >
> > BTW, if you do the clear (but not the put_page) for every entry then the
> > present bit (or pte.bits == 0) might be a useful proxy for
> > next_lN_entry? I suppose even walking, say, 4GB of mostly empty P2M is
> > not going to be super cheap.
> 
> Following the discussion we had IRL, I will try to extend
> create_p2m_entries by adding a RELINQUISH option.

Looking at it again this morning, the main change in behaviour will be to
continue on non-present entries instead of either filling them in or
aborting. It'll be interesting to see how this new mode of operation
fits into the function...
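
To make that concrete, the per-entry handling I have in mind is
something like the following (a very rough sketch, untested; I'm
guessing at the pte field and helper names from memory, so treat them
as illustrative):

    case RELINQUISH:
    {
        unsigned long mfn = pte.p2m.base;
        bool_t is_foreign = p2m_is_foreign(pte.p2m.type);

        /* Unlike INSERT/REMOVE, skip holes rather than aborting. */
        if ( !pte.p2m.valid )
            break;

        /* Unmap first (same mechanics as the REMOVE case)... */
        memset(&pte, 0x00, sizeof(pte));
        write_pte(&third[third_table_offset(addr)], pte);

        /*
         * ...then drop the reference. We currently only take one
         * for foreign mappings.
         */
        if ( is_foreign )
            put_page(mfn_to_page(mfn));
        break;
    }

This can still resume from next_gfn_to_relinquish and stop at
max_mapped_gfn, so the walk remains bounded.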

> >> +        }
> >> +
> >> +        count++;
> >> +
> >> +        /* Preempt every 2MiB. Arbitrary */
> >> +        if ( (count == 512) && hypercall_preempt_check() )
> >> +        {
> >> +            p2m->next_gfn_to_relinquish = gfn + 1;
> >> +            rc = -EAGAIN;
> >
> > I think I'm just failing to find it, but where is the call to
> > hypercall_create_continuation? I suppose it is somewhere way up the
> > stack?
> 
> Actually, rc = -EAGAIN will be caught by xc_domain_destroy in libxc.
> The function will call the hypercall again if needed.

Ah, I only followed the call chain on the hypervisor side.

> I'm not sure why... do you have any idea why x86 also uses this trick?

I suspect that this operation is so long-running that it is beneficial
to return all the way to userspace context in order to allow other
processes to be scheduled, avoiding the kernel's softlockup timer etc.

A hypercall continuation only gives the guest kernel the opportunity
to process interrupts, IIRC.
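
So the toolstack side ends up with a retry loop along these lines
(paraphrasing the usual libxc pattern from memory, not quoting the
actual xc_domain_destroy):

    int ret;
    DECLARE_DOMCTL;

    domctl.cmd = XEN_DOMCTL_destroydomain;
    domctl.domain = (domid_t)domid;

    do {
        /* -EAGAIN comes back as errno == EAGAIN; retrying from
         * userspace gives the dom0 kernel a full scheduling
         * opportunity in between attempts. */
        ret = do_domctl(xch, &domctl);
    } while ( ret < 0 && errno == EAGAIN );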

> > I'm not sure the count == 512 is needed -- hypercall_preempt_check
> > should be sufficient?
> 
> On ARM, hypercall_preempt_check is a bit complex. Checking on every
> iteration seems a bit overkill.

OK. I see now that the existing code has a mixture, so checking every
512 iterations is fine. Perhaps using LPAE_ENTRIES would be a little
less arbitrary (although it remains almost completely so...)
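
i.e. something like (illustrative only):

    count++;
    /* Check for preemption once every LPAE_ENTRIES iterations (one
     * page table's worth). Note the % rather than ==, so the check
     * fires periodically rather than only once. */
    if ( !(count % LPAE_ENTRIES) && hypercall_preempt_check() )
    {
        p2m->next_gfn_to_relinquish = gfn + 1;
        rc = -EAGAIN;
        break;
    }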

Ian.

