Xen project Mailing List

Re: [Xen-devel] [RFC PATCH] Start PV guest faster

To: "Frediano Ziglio" <frediano.ziglio@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Tue, 20 May 2014 10:30:44 +0100

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>

Delivery-date: Tue, 20 May 2014 09:31:00 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 20.05.14 at 09:26, <frediano.ziglio@xxxxxxxxxx> wrote:
> Experimental patch that try to allocate large chunks in order to start
> PV guest quickly.

The fundamental idea is certainly welcome.

> It's a while I noticed that the time to start a large PV guest depends
> on the amount of memory. For VMs with 64 or more GB of ram the time can
> become quite significant (like 20 seconds). Digging around I found that
> a lot of time is spend populating RAM (from a single hypercall made by
> xenguest).

Did you check whether - like noticed elsewhere - this is due to
excessive hypercall preemption/restart? I.e. whether making the
preemption checks less fine grained helps?

> The improvement is quite significant (the hypercall is more than 20
> times faster for a machine with 3GB) however there are different things
> to consider:
> - should this optimization be done inside Xen? If the change is just
> userspace surely this make Xen simpler and safer but on the other way
> Xen is more aware if is better to allocate big chunks or not

Except that Xen has no way to tell what "better" here would be.

> - debug Xen return pages in reverse order while the chunks have to be
> allocated sequentially. Is this a problem?

I think the ability to populate guest memory with (largely, but not
necessarily entirely) discontiguous memory should be retained for
debugging purposes (see also below).

> I didn't find any piece of code where superpages is turned on in
> xc_dom_image but I think that if the number of pages is not multiple of
> superpages the code allocate a bit less memory for the guest.

I think that's expected - I wonder whether that code is really in use
by anyone...

> @@ -820,9 +831,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              allocsz = dom->total_pages - i;
>              if ( allocsz > 1024*1024 )
>                  allocsz = 1024*1024;
> -            rc = xc_domain_populate_physmap_exact(
> -                dom->xch, dom->guest_domid, allocsz,
> -                0, 0, &dom->p2m_host[i]);
> +            /* try bit chunk of memory first */
> +            if ( (allocsz & ((1<<10)-1)) == 0 )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 10, allocsz);
> +            if ( rc )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 0, allocsz);

So on what basis was 10 chosen here? I wonder whether this shouldn't be
(a) smaller by default,
(b) configurable (globally or even per guest),
(c) dependent on the total memory getting assigned to the guest,
(d) tried with sequentially decreasing order after failure.

Additionally you're certainly aware that allocation failures lead to
hypervisor log messages (as today already seen when HVM guests can't
have their order-18 or order-9 allocations fulfilled). We may need to
think about ways to suppress these messages for such allocations where
the caller intends to retry with a smaller order.

Jan
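
For illustration, option (d) above could look roughly like the sketch below. It is not code from the patch or from libxc: `populate_fn`, `populate_with_fallback`, `stub_state` and `stub_populate` are all hypothetical names, and the callback merely stands in for whatever helper actually issues the populate hypercall. The sketch assumes `first_pfn` is suitably aligned for `max_order`, and it never retries an order that has already failed, so the number of failed attempts (each of which may produce a hypervisor log message, as noted above) is bounded by `max_order`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: populate `count` extents of 2^order pages each,
 * starting at guest frame `first_pfn`. Returns 0 on success, nonzero on
 * failure. It stands in for the real populate helper.
 */
typedef int (*populate_fn)(void *ctx, unsigned long first_pfn,
                           unsigned long count, unsigned int order);

/*
 * Populate nr_pages pages starting at first_pfn, trying max_order first
 * and stepping down one order at a time after a failure. An order that
 * has failed once is not tried again. Assumes first_pfn is aligned to
 * 2^max_order pages.
 */
static int populate_with_fallback(populate_fn populate, void *ctx,
                                  unsigned long first_pfn,
                                  unsigned long nr_pages,
                                  unsigned int max_order)
{
    unsigned long done = 0;
    unsigned int cap = max_order;   /* largest order still worth trying */

    assert(populate != NULL);
    assert(max_order < sizeof(unsigned long) * 8);

    while (done < nr_pages) {
        unsigned long left = nr_pages - done;
        unsigned int order = cap;

        /* Shrink the order until one extent fits in what is left. */
        while (order > 0 && (1UL << order) > left)
            order--;

        unsigned long count = left >> order;  /* whole extents that fit */

        if (populate(ctx, first_pfn + done, count, order) == 0) {
            done += count << order;
            continue;
        }
        if (order == 0)
            return -1;              /* even single pages failed */
        cap = order - 1;            /* retry with a smaller order */
    }
    return 0;
}

/* Demonstration stub only: refuses every order above max_ok_order. */
struct stub_state {
    unsigned int max_ok_order;  /* largest order the stub will grant */
    unsigned long pages;        /* pages granted so far */
    unsigned int failures;      /* number of refused requests */
};

static int stub_populate(void *ctx, unsigned long first_pfn,
                         unsigned long count, unsigned int order)
{
    struct stub_state *s = ctx;

    (void)first_pfn;            /* the stub does not track placement */
    if (order > s->max_ok_order) {
        s->failures++;
        return -1;
    }
    s->pages += count << order;
    return 0;
}
```

With the stub granting at most order 9, a 3000-page request starting from order 10 fails once and is then satisfied with order-9 extents plus a tail of smaller ones. Keeping the cap lowered after a failure is a deliberate simplification; a real implementation might instead raise it again later, or derive the starting order from configuration or guest size as in options (a) to (c).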
