
Re: [Xen-devel] [RFC PATCH] Start PV guest faster



On 20/05/14 08:26, Frediano Ziglio wrote:
> Experimental patch that tries to allocate large chunks in order to start
> PV guests more quickly.
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@xxxxxxxxxx>
> ---
>  tools/libxc/xc_dom_x86.c | 51 ++++++++++++++++++++++++++++++------------------
>  1 file changed, 32 insertions(+), 19 deletions(-)
>
>
> A while ago I noticed that the time to start a large PV guest depends
> on the amount of memory. For VMs with 64 or more GB of RAM the time can
> become quite significant (around 20 seconds). Digging around I found
> that a lot of the time is spent populating RAM (from a single hypercall
> made by xenguest).
>
> xenguest allocates the memory asking for single pages in a single
> hypercall. This patch tries to use larger chunks instead. Note that the
> order parameter used when populating pages has nothing to do in this
> case with superpages; it only controls how the allocation is batched.

Here, you probably mean 'domain builder', which is a component of libxc.
`xenguest` is a XenServer-specific thing which invokes the domain
builder on behalf of Xapi.
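
For readers coming to this in the archive: the batching happens entirely
on the toolstack side of the populate-physmap interface. A minimal sketch
of the two call shapes, using the same xc_domain_populate_physmap_exact()
call the patch touches (nr_pages and extents are placeholder names, not
code from the patch):

    /* Order 0: one extent per page; the array carries one pfn per page. */
    rc = xc_domain_populate_physmap_exact(xch, domid, nr_pages,
                                          0 /* extent_order */,
                                          0 /* mem_flags */,
                                          &dom->p2m_host[0]);

    /* Order 10: one extent per 1024-page chunk; the array only needs the
     * first pfn of each chunk, i.e. nr_pages >> 10 entries. */
    rc = xc_domain_populate_physmap_exact(xch, domid, nr_pages >> 10,
                                          10 /* extent_order */,
                                          0 /* mem_flags */,
                                          extents);

Either way it is one populate call from the builder's point of view; only
the number and order of the extents handed to Xen change.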

>
> The improvement is quite significant (the hypercall is more than 20
> times faster for a machine with 3GB); however, there are a few things
> to consider:
> - should this optimization be done inside Xen? Keeping the change in
> userspace surely makes Xen simpler and safer, but on the other hand Xen
> is better placed to know whether allocating big chunks is a good idea

No - the whole reason for having the order field in the first place is
to allow userspace to batch like this.  Xen cannot guess at what
userspace is likely to ask for in the future.
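
For reference, the order Andrew mentions is the extent_order field of the
reservation passed to XENMEM_populate_physmap; paraphrasing from
xen/include/public/memory.h (see the header for the authoritative
definition):

    struct xen_memory_reservation {
        XEN_GUEST_HANDLE(xen_pfn_t) extent_start; /* array of starting pfns */
        xen_ulong_t  nr_extents;   /* number of extents in the array */
        unsigned int extent_order; /* each extent is 2^extent_order pages */
        unsigned int mem_flags;    /* XENMEMF_* flags */
        domid_t      domid;
    };

Only the caller knows how much memory it still intends to populate, so
userspace is the right place to choose extent_order.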

> - can userspace request some memory statistics from Xen in order to make
> better use of chunks?
> - how is memory fragmentation affected? The original code requests single
> pages, so Xen can decide to fill the gaps and keep large orders of pages
> available for HVM guests (which can use superpages). On the other hand,
> if I want to run 2 PV guests with 60GB each on a 128GB host I don't see
> the point of not requesting large chunks.

Inside xen, the order is broken down into individual pages inside
guest_physmap_add_entry().

The net improvement you are seeing is probably from not taking and
releasing the p2m lock for every single page.
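
To put rough numbers on that (my arithmetic, using only the figures
already in the thread and the order of 10 picked by the patch): 3GB is
786,432 4KiB pages, so 786,432 order-0 extents versus 768 order-10
extents; 64GB is 16,777,216 pages versus 16,384 extents. If the
per-extent cost (p2m lock acquire/release plus per-extent bookkeeping)
dominates, cutting the extent count by a factor of 1024 is consistent
with the >20x speedup reported for the hypercall.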

~Andrew

> - a debug Xen returns pages in reverse order, while the chunks have to
> be allocated sequentially. Is this a problem?
>
> I didn't find any piece of code where superpages is turned on in
> xc_dom_image, but I think that if the number of pages is not a multiple
> of the superpage size the code allocates a bit less memory for the guest.
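
(For concreteness on that last point: in xc_dom_x86.c SUPERPAGE_PFN_SHIFT
is 9, i.e. 2MB superpages of 512 4k pages, and the existing superpage
path computes count = total_pages >> SUPERPAGE_PFN_SHIFT, which
truncates. A guest with, say, total_pages = 1000 would get count = 1
superpage, so only 512 of its 1000 pages populated, matching the concern
above.)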
>
>
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index e034d62..d09269a 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -756,10 +756,34 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
>      return rc;
>  }
>  
> +static int populate_range(struct xc_dom_image *dom, xen_pfn_t extents[],
> +                          xen_pfn_t pfn_start, unsigned page_order,
> +                          xen_pfn_t num_pages)
> +{
> +    int rc;
> +    xen_pfn_t pfn, mask;
> +
> +    for ( pfn = 0; pfn < num_pages; pfn += 1 << page_order )
> +        extents[pfn >> page_order] = pfn + pfn_start;
> +
> +    rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
> +                                           pfn >> page_order, page_order, 0,
> +                                           extents);
> +    if ( rc || page_order == 0 )
> +        return rc;
> +
> +    /* convert to "normal" pages */
> +    mask = (1ULL << page_order) - 1;
> +    for ( pfn = num_pages; pfn-- > 0; )
> +        extents[pfn] = extents[pfn >> page_order] + (pfn & mask);
> +
> +    return rc;
> +}
> +
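
(A note on the final loop in populate_range() above, since it is easy to
misread: it expands the packed array of chunk-start frames into a full
per-page array in place, and walking pfn downwards guarantees that each
packed entry extents[pfn >> page_order] is still intact when it is read.
A small trace with page_order = 2 and num_pages = 8, so mask = 3 and the
packed input is {M0, M1, ...}: the loop writes indices 7..0 as M1+3,
M1+2, M1+1, M1, M0+3, M0+2, M0+1, M0; entry 1, holding M1, is only
overwritten (with M0+1) after all four of its expanded values have been
produced.)
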
>  int arch_setup_meminit(struct xc_dom_image *dom)
>  {
>      int rc;
> -    xen_pfn_t pfn, allocsz, i, j, mfn;
> +    xen_pfn_t pfn, allocsz, i;
>  
>      rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
>      if ( rc )
> @@ -779,25 +803,12 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>      if ( dom->superpages )
>      {
>          int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
> -        xen_pfn_t extents[count];
>  
>          DOMPRINTF("Populating memory with %d superpages", count);
> -        for ( pfn = 0; pfn < count; pfn++ )
> -            extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
> -        rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
> -                                               count, SUPERPAGE_PFN_SHIFT, 0,
> -                                               extents);
> +        rc = populate_range(dom, dom->p2m_host, 0, SUPERPAGE_PFN_SHIFT,
> +                            dom->total_pages);
>          if ( rc )
>              return rc;
> -
> -        /* Expand the returned mfn into the p2m array */
> -        pfn = 0;
> -        for ( i = 0; i < count; i++ )
> -        {
> -            mfn = extents[i];
> -            for ( j = 0; j < SUPERPAGE_NR_PFNS; j++, pfn++ )
> -                dom->p2m_host[pfn] = mfn + j;
> -        }
>      }
>      else
>      {
> @@ -820,9 +831,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              allocsz = dom->total_pages - i;
>              if ( allocsz > 1024*1024 )
>                  allocsz = 1024*1024;
> -            rc = xc_domain_populate_physmap_exact(
> -                dom->xch, dom->guest_domid, allocsz,
> -                0, 0, &dom->p2m_host[i]);
> +            /* try big chunk of memory first */
> +            if ( (allocsz & ((1<<10)-1)) == 0 )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 10, allocsz);
> +            if ( rc )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 0, allocsz);
>          }
>  
>          /* Ensure no unclaimed pages are left unused.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

