[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Claim mode and HVM PoD interact badly



On Mon, Jan 27, 2014 at 4:14 PM, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:
> On Mon, Jan 27, 2014 at 02:54:40PM +0000, George Dunlap wrote:
> [...]
>> > ---8<---
>> > diff --git a/tools/libxc/xc_hvm_build_x86.c 
>> > b/tools/libxc/xc_hvm_build_x86.c
>> > index 77bd365..472f1df 100644
>> > --- a/tools/libxc/xc_hvm_build_x86.c
>> > +++ b/tools/libxc/xc_hvm_build_x86.c
>> > @@ -49,6 +49,8 @@
>> >  #define NR_SPECIAL_PAGES     8
>> >  #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
>> >
>> > +#define POD_VGA_HOLE_SIZE (0x20)
>> > +
>> >  static int modules_init(struct xc_hvm_build_args *args,
>> >                          uint64_t vend, struct elf_binary *elf,
>> >                          uint64_t *mstart_out, uint64_t *mend_out)
>> > @@ -305,11 +307,13 @@ static int setup_guest(xc_interface *xch,
>> >      if ( pod_mode )
>> >      {
>> >          /*
>> > -         * Subtract 0x20 from target_pages for the VGA "hole".  Xen will
>> > -         * adjust the PoD cache size so that domain tot_pages will be
>> > -         * target_pages - 0x20 after this call.
>> > +         * Subtract POD_VGA_HOLE_SIZE from target_pages for the VGA
>> > +         * "hole".  Xen will adjust the PoD cache size so that domain
>> > +         * tot_pages will be target_pages - POD_VGA_HOLE_SIZE after
>> > +         * this call.
>> >           */
>> > -        rc = xc_domain_set_pod_target(xch, dom, target_pages - 0x20,
>> > +        rc = xc_domain_set_pod_target(xch, dom,
>> > +                                      target_pages - POD_VGA_HOLE_SIZE,
>> >                                        NULL, NULL, NULL);
>> >          if ( rc != 0 )
>> >          {
>> > @@ -335,7 +339,12 @@ static int setup_guest(xc_interface *xch,
>> >
>> >      /* try to claim pages for early warning of insufficient memory 
>> > available */
>> >      if ( claim_enabled ) {
>> > -        rc = xc_domain_claim_pages(xch, dom, nr_pages - cur_pages);
>> > +        unsigned long nr = target_pages;
>> > +
>> > +        if ( pod_mode )
>> > +            nr -= POD_VGA_HOLE_SIZE;
>> > +
>> > +        rc = xc_domain_claim_pages(xch, dom, nr);
>>
>> Two things:
>>
>> 1. This is broken because it doesn't claim pages for the PoD "cache".
>> The PoD "cache" amounts to *all the pages that the domain will have
>> allocated* -- there will be basically no pages allocated after this
>> point.
>>
>> Claim mode is trying to make creation of large guests fail early or be
>> guaranteed to succeed.  For large guests, it's set_pod_target() that
>> may take the long time, and it's there that things will fail if
>> there's not enough memory.  By the time you get to actually setting up
>> the p2m, you've already made it.
>>
>> 2. I think the VGA_HOLE doesn't have anything to do with PoD.
>>
>> Actually, it looks like the original code was wrong: correct me if I'm
>> wrong, but xc_domain_claim_pages() wants the *total number of pages*,
>> whereas nr_pages-cur_pages would give you the *number of additional
>> pages*.
>>
>> I think you need to claim nr_pages-VGA_HOLE_SIZE regardless of whether
>> you're in PoD mode or not.  The initial code would claim 0xa0 pages
>> too few; the new code will claim 0x20 pages too many in the non-PoD
>> case.
>>
>> > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>> > index 5f484a2..1e44ba3 100644
>> > --- a/xen/common/page_alloc.c
>> > +++ b/xen/common/page_alloc.c
>> > @@ -339,8 +339,8 @@ int domain_set_outstanding_pages(struct domain *d, 
>> > unsigned long pages)
>> >          goto out;
>> >      }
>> >
>> > -    /* disallow a claim not exceeding current tot_pages or above 
>> > max_pages */
>> > -    if ( (pages <= d->tot_pages) || (pages > d->max_pages) )
>> > +    /* disallow a claim below current tot_pages or above max_pages */
>> > +    if ( (pages < d->tot_pages) || (pages > d->max_pages) )
>> >      {
>>
>> This seems like a good interface change in any case -- there's no
>> particular reason not to allow multiple calls with the same claim
>> amount.  (Interface-wise, I don't see a good reason we couldn't allow
>> the claim to be reduced as well; but that's probably more than we want
>> to do at this point.)
>>
>> So it seems like we should at least make the two fixes above:
>> * Use nr_pages-VGA_HOLE_SIZE for the claim, regardless of whether PoD is 
>> enabled
>
> You mean target_pages here, right? nr_pages is maxmem= while target_pages
> is memory=. And from the face-to-face discussion we had this morning I
> had the impression you meant target_pages.
>
> In fact using nr_pages won't work because at that point d->max_pages is
> capped to target_memory in the hypervisor.

Dur, yes of course.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.