[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

On 12/01/14 10:37, Stefano Stabellini wrote:
On Mon, 1 Dec 2014, Don Slutz wrote:
On 11/27/14 05:48, Stefano Stabellini wrote:


Works fine in both claim modes and with PoD used (maxmem > memory).  Do
not know how to test with tmem.  I do not see how it would be worse then
code that does not auto increase.  I.E. even without a xen change, I think
like this could be done.
OK, good to know. I am OK with increasing maxmem only if it is strictly

My testing shows a free 32 pages that I am not sure where they come from.
the code about is passing my 8 nics of e1000.
I think that raising maxmem a bit higher than necessary is not too bad.
If we really care about it, we could lower the limit after QEMU's
initialization is completed.
Ok.  I did find the 32 it is VGA_HOLE_SIZE.  So here is what I have which
a lot of extra printf.
In QEMU I would prefer not to assume that libxl increased maxmem for the
vga hole. I would rather call xc_domain_setmaxmem twice for the vga hole
than tie QEMU to a particular maxmem allocation scheme in libxl.

Ok.  The area we are talking about is 0x000a0000 to 0x000c0000.
It is in libxc (xc_hvm_build_x86), not libxl. I have no issue with a name change to
some thing like QEMU_EXTRA_FREE_PAGES.

My testing has shows that some of these 32 pages are used outside of QEMU.
I am seeing just 23 free pages using a standalone program to display
the same info after a CentOS 6.3 guest is done booting.

In libxl I would like to avoid increasing mamxem for anything QEMU will
allocate later, that includes rom and vga vram. I am not sure how to
make that work with older QEMU versions that don't call
xc_domain_setmaxmem by themselves yet though. Maybe we could check the
specific QEMU version in libxl and decide based on that. Or we could
export a feature flag in QEMU.

Yes, it would be nice to adjust libxl to not increase maxmem. However since
videoram is included in memory (and maxmem), making the change related
to vram is a bigger issue.

the rom change is much simpler.

Currently I do not know of a way to do different things based on the QEMU version
and/or features (this includes getting the QEMU version in libxl).

I have been going with:
1) change QEMU 1st.
2) Wait for an upstream version of QEMU with this.
3) change xen to optionally use a feature in the latest QEMU.

--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -67,6 +67,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t
*shared_page, int vcpu)

  #define BUFFER_IO_MAX_DELAY  100
+#define VGA_HOLE_SIZE (0x20)

  typedef struct XenPhysmap {
      hwaddr start_addr;
@@ -219,6 +220,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)
      xen_pfn_t *pfn_list;
      int i;
      xc_dominfo_t info;
+    unsigned long max_pages, free_pages, real_free;
+    long need_pages;
+    uint64_t tot_pages, pod_cache_pages, pod_entries;
+    trace_xen_ram_alloc(ram_addr, size, mr->name);

      if (runstate_check(RUN_STATE_INMIGRATE)) {
          /* RAM already populated in Xen */
@@ -232,13 +238,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)

-    fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
-            " bytes (%ld Kib) of ram at "RAM_ADDR_FMT
-            " mr.name=%s\n",
-            __func__, size, (long)(size>>10), ram_addr, mr->name);
-    trace_xen_ram_alloc(ram_addr, size);
      nr_pfn = size >> TARGET_PAGE_BITS;
      pfn_list = g_malloc(sizeof (*pfn_list) * nr_pfn);

@@ -246,11 +245,38 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)
          pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;

-    if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {
+    if ((xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) ||
+       (info.domid != xen_domid)) {
          hw_error("xc_domain_getinfo failed");
-    if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
-                            (nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
+    max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
+    free_pages = max_pages - info.nr_pages;
+    real_free = free_pages;
+    if (free_pages > VGA_HOLE_SIZE) {
+        free_pages -= VGA_HOLE_SIZE;
+    } else {
+        free_pages = 0;
+    }
+    need_pages = nr_pfn - free_pages;
+    fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
+            " bytes (%ld KiB) of ram at "RAM_ADDR_FMT
+            " mp=%ld fp=%ld nr=%ld need=%ld mr.name=%s\n",
+            __func__, size, (long)(size>>10), ram_addr,
+           max_pages, free_pages, nr_pfn, need_pages,
+            mr->name);
+    if (xc_domain_get_pod_target(xen_xc, xen_domid, &tot_pages,
+                                 &pod_cache_pages, &pod_entries) >= 0) {
+        unsigned long populated = tot_pages - pod_cache_pages;
+        long delta_tot = tot_pages - info.nr_pages;
+        fprintf(stderr, "%s: PoD pop=%ld tot=%ld(%ld) cnt=%ld ent=%ld nop=%ld
+                __func__, populated, (long)tot_pages, delta_tot,
+                (long)pod_cache_pages, (long)pod_entries,
+               (long)info.nr_outstanding_pages, real_free);
+    }
What is the purpose of calling xc_domain_get_pod_target here? It doesn't
look like is affecting the parameters passed to xc_domain_setmaxmem.

This was just to test and see if anything was needed for PoD mode.
I did not see anything.

Did you want me to make a patch to send to the list that has my proposed

I do think that what I have would be fine to do since most of the time the
new code does nothing with the current xen (and older versions), until
more then 4 nics are configured for xen.

It would be good to have the change:

[PATCH] libxl_set_memory_target: retain the same maxmem offset on top of the current target

(When working) in xen before using a QEMU with this change.

Not sure I would push for 2.2 however.

   -Don Slutz

+    if ((free_pages < nr_pfn) &&
+       (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
+                            (need_pages * XC_PAGE_SIZE / 1024)) < 0)) {
          hw_error("xc_domain_setmaxmem failed");
      if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0,
pfn_list)) {

Note: I already had a trace_xen_ram_alloc, here is the delta in trace-events
has others not yet sent):

diff --git a/trace-events b/trace-events
index eaeecd5..a58fc11 100644
--- a/trace-events
+++ b/trace-events
@@ -908,7 +908,7 @@ pvscsi_tx_rings_ppn(const char* label, uint64_t ppn) "%s
page: %"PRIx64""
  pvscsi_tx_rings_num_pages(const char* label, uint32_t num) "Number of %s
pages: %u"

  # xen-all.c
-xen_ram_alloc(unsigned long ram_addr, unsigned long size) "requested: %#lx,
size %#lx"
+xen_ram_alloc(unsigned long ram_addr, unsigned long size, const char*
mr_name) "requested: %#lx size %#lx mr->name=%s"
  xen_client_set_memory(uint64_t start_addr, unsigned long size, bool
log_dirty) "%#"PRIx64" size %#lx, log_dirty %i"
  handle_ioreq(void *req, uint32_t type, uint32_t dir, uint32_t df, uint32_t
data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size)
"I/O=%p type=%d dir=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" count=%d
  handle_ioreq_read(void *req, uint32_t type, uint32_t df, uint32_t
data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size)
"I/O=%p read type=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" count=%d

    -Don Slutz

     -Don Slutz

+        hw_error("xc_domain_setmaxmem failed");
+    }
        if (xc_domain_populate_physmap_exact(xen_xc, xen_domid,
nr_pfn, 0,
0, pfn_list)) {
            hw_error("xen: failed to populate ram at " RAM_ADDR_FMT,

Xen-devel mailing list

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.