Xen project Mailing List

Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

On Mon, 1 Dec 2014, Don Slutz wrote: > On 12/01/14 10:37, Stefano Stabellini wrote: > > On Mon, 1 Dec 2014, Don Slutz wrote: > > > On 11/27/14 05:48, Stefano Stabellini wrote: > > [...] > > > > > > > Works fine in both claim modes and with PoD used (maxmem > memory). Do > > > not know how to test with tmem. I do not see how it would be worse then > > > current > > > code that does not auto increase. I.E. even without a xen change, I think > > > something > > > like this could be done. > > OK, good to know. I am OK with increasing maxmem only if it is strictly > > necessary. > > > > > > > > > My testing shows a free 32 pages that I am not sure where they come > > > > > from. > > > > > But > > > > > the code about is passing my 8 nics of e1000. > > > > I think that raising maxmem a bit higher than necessary is not too bad. > > > > If we really care about it, we could lower the limit after QEMU's > > > > initialization is completed. > > > Ok. I did find the 32 it is VGA_HOLE_SIZE. So here is what I have which > > > includes > > > a lot of extra printf. > > In QEMU I would prefer not to assume that libxl increased maxmem for the > > vga hole. I would rather call xc_domain_setmaxmem twice for the vga hole > > than tie QEMU to a particular maxmem allocation scheme in libxl. > > Ok. The area we are talking about is 0x000a0000 to 0x000c0000. > It is in libxc (xc_hvm_build_x86), not libxl. I have no issue with a name > change to > some thing like QEMU_EXTRA_FREE_PAGES. Sorry, I was thinking about the relocated videoram region, the one allocated by xen_add_to_physmap in QEMU. Regarding 0x000a0000-0x000c0000, according to this comment: /* * Allocate memory for HVM guest, skipping VGA hole 0xA0000-0xC0000. * xc_hvm_build_x86 shouldn't be allocating memory for it. In fact if I remember correctly 0xa0000-0xc0000 is trapped and emulated. > My testing has shows that some of these 32 pages are used outside of QEMU. > I am seeing just 23 free pages using a standalone program to display > the same info after a CentOS 6.3 guest is done booting. > > > In libxl I would like to avoid increasing mamxem for anything QEMU will > > allocate later, that includes rom and vga vram. I am not sure how to > > make that work with older QEMU versions that don't call > > xc_domain_setmaxmem by themselves yet though. Maybe we could check the > > specific QEMU version in libxl and decide based on that. Or we could > > export a feature flag in QEMU. > > Yes, it would be nice to adjust libxl to not increase maxmem. However since > videoram is included in memory (and maxmem), making the change related > to vram is a bigger issue. > > the rom change is much simpler. > > Currently I do not know of a way to do different things based on the QEMU > version > and/or features (this includes getting the QEMU version in libxl). > > I have been going with: > 1) change QEMU 1st. > 2) Wait for an upstream version of QEMU with this. > 3) change xen to optionally use a feature in the latest QEMU. Let's try to take this approach for the videoram too > > > > > --- a/xen-hvm.c > > > +++ b/xen-hvm.c > > > @@ -67,6 +67,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t > > > *shared_page, int vcpu) > > > #endif > > > > > > #define BUFFER_IO_MAX_DELAY 100 > > > +#define VGA_HOLE_SIZE (0x20) > > > > > > typedef struct XenPhysmap { > > > hwaddr start_addr; > > > @@ -219,6 +220,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t > > > size, > > > MemoryRegion *mr) > > > xen_pfn_t *pfn_list; > > > int i; > > > xc_dominfo_t info; > > > + unsigned long max_pages, free_pages, real_free; > > > + long need_pages; > > > + uint64_t tot_pages, pod_cache_pages, pod_entries; > > > + > > > + trace_xen_ram_alloc(ram_addr, size, mr->name); > > > > > > if (runstate_check(RUN_STATE_INMIGRATE)) { > > > /* RAM already populated in Xen */ > > > @@ -232,13 +238,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t > > > size, > > > MemoryRegion *mr) > > > return; > > > } > > > > > > - fprintf(stderr, "%s: alloc "RAM_ADDR_FMT > > > - " bytes (%ld Kib) of ram at "RAM_ADDR_FMT > > > - " mr.name=%s\n", > > > - __func__, size, (long)(size>>10), ram_addr, mr->name); > > > - > > > - trace_xen_ram_alloc(ram_addr, size); > > > - > > > nr_pfn = size >> TARGET_PAGE_BITS; > > > pfn_list = g_malloc(sizeof (*pfn_list) * nr_pfn); > > > > > > @@ -246,11 +245,38 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t > > > size, > > > MemoryRegion *mr) > > > pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i; > > > } > > > > > > - if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) { > > > + if ((xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) || I think it's best to call xc_domain_getinfolist. > > > + (info.domid != xen_domid)) { > > > hw_error("xc_domain_getinfo failed"); > > > } > > > - if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb + > > > - (nr_pfn * XC_PAGE_SIZE / 1024)) < 0) { > > > + max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE; > > > + free_pages = max_pages - info.nr_pages; > > > + real_free = free_pages; > > > + if (free_pages > VGA_HOLE_SIZE) { > > > + free_pages -= VGA_HOLE_SIZE; > > > + } else { > > > + free_pages = 0; > > > + } I don't think we need to subtract VGA_HOLE_SIZE. > > > + need_pages = nr_pfn - free_pages; > > > + fprintf(stderr, "%s: alloc "RAM_ADDR_FMT > > > + " bytes (%ld KiB) of ram at "RAM_ADDR_FMT > > > + " mp=%ld fp=%ld nr=%ld need=%ld mr.name=%s\n", > > > + __func__, size, (long)(size>>10), ram_addr, > > > + max_pages, free_pages, nr_pfn, need_pages, > > > + mr->name); > > > + if (xc_domain_get_pod_target(xen_xc, xen_domid, &tot_pages, > > > + &pod_cache_pages, &pod_entries) >= 0) { > > > + unsigned long populated = tot_pages - pod_cache_pages; > > > + long delta_tot = tot_pages - info.nr_pages; > > > + > > > + fprintf(stderr, "%s: PoD pop=%ld tot=%ld(%ld) cnt=%ld ent=%ld > > > nop=%ld > > > free=%ld\n", > > > + __func__, populated, (long)tot_pages, delta_tot, > > > + (long)pod_cache_pages, (long)pod_entries, > > > + (long)info.nr_outstanding_pages, real_free); > > > + } > > What is the purpose of calling xc_domain_get_pod_target here? It doesn't > > look like is affecting the parameters passed to xc_domain_setmaxmem. > > This was just to test and see if anything was needed for PoD mode. > I did not see anything. > > Did you want me to make a patch to send to the list that has my proposed > changes? Yep, that's fine. > I do think that what I have would be fine to do since most of the time the > new code does nothing with the current xen (and older versions), until > more then 4 nics are configured for xen. > > It would be good to have the change: > > [PATCH] libxl_set_memory_target: retain the same maxmem offset on top of the > current target > > (When working) in xen before using a QEMU with this change. > > Not sure I would push for 2.2 however. I wouldn't. > -Don Slutz > > > > > > + if ((free_pages < nr_pfn) && > > > + (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb + > > > + (need_pages * XC_PAGE_SIZE / 1024)) < 0)) { > > > hw_error("xc_domain_setmaxmem failed"); > > > } > > > if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, > > > 0, > > > pfn_list)) { > > > > > > > > > Note: I already had a trace_xen_ram_alloc, here is the delta in > > > trace-events > > > (which > > > has others not yet sent): > > > > > > > > > > > > diff --git a/trace-events b/trace-events > > > index eaeecd5..a58fc11 100644 > > > --- a/trace-events > > > +++ b/trace-events > > > @@ -908,7 +908,7 @@ pvscsi_tx_rings_ppn(const char* label, uint64_t ppn) > > > "%s > > > page: %"PRIx64"" > > > pvscsi_tx_rings_num_pages(const char* label, uint32_t num) "Number of %s > > > pages: %u" > > > > > > # xen-all.c > > > -xen_ram_alloc(unsigned long ram_addr, unsigned long size) "requested: > > > %#lx, > > > size %#lx" > > > +xen_ram_alloc(unsigned long ram_addr, unsigned long size, const char* > > > mr_name) "requested: %#lx size %#lx mr->name=%s" > > > xen_client_set_memory(uint64_t start_addr, unsigned long size, bool > > > log_dirty) "%#"PRIx64" size %#lx, log_dirty %i" > > > handle_ioreq(void *req, uint32_t type, uint32_t dir, uint32_t df, > > > uint32_t > > > data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) > > > "I/O=%p type=%d dir=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" > > > count=%d > > > size=%d" > > > handle_ioreq_read(void *req, uint32_t type, uint32_t df, uint32_t > > > data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) > > > "I/O=%p read type=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" count=%d > > > size=%d" > > > > > > > > > -Don Slutz > > > > > > > > -Don Slutz > > > > > > > > > > > > > > > > > > + hw_error("xc_domain_setmaxmem failed"); > > > > > > > > + } > > > > > > > > if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, > > > > > > > > nr_pfn, 0, > > > > > > > > 0, pfn_list)) { > > > > > > > > hw_error("xen: failed to populate ram at " > > > > > > > > RAM_ADDR_FMT, > > > > > > > > ram_addr); > > > > > > > > } > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > Xen-devel mailing list > > > > > > > > Xen-devel@xxxxxxxxxxxxx > > > > > > > > http://lists.xen.org/xen-devel > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.