
Re: [Xen-devel] [PATCH 1/1] xc_domain_restore: Allow QEMU to increase memory





On 04/14/2015 04:53 PM, Andrew Cooper wrote:
On 14/04/15 00:51, Don Slutz wrote:
On 04/13/15 12:25, Andrew Cooper wrote:
On 13/04/15 17:09, Don Slutz wrote:
If QEMU has called on xc_domain_setmaxmem to add more memory for
option ROMs, domain restore needs to also increase the memory.

Signed-off-by: Don Slutz <dslutz@xxxxxxxxxxx>
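
[For context, a minimal sketch of the kind of call involved; the helper
name and call site are illustrative, not QEMU's actual code.  The device
model raises the hypervisor's max_pages limit before populating the
extra pages for option ROMs:]

#include <xenctrl.h>

/* Illustrative only: grow a domain's memory limit so that option ROM
 * pages can be populated.  The populate call would fail if it were
 * attempted against the old limit. */
static int grow_for_option_roms(xc_interface *xch, uint32_t domid,
                                uint64_t cur_max_kb, uint64_t rom_kb)
{
    return xc_domain_setmaxmem(xch, domid, cur_max_kb + rom_kb);
}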
hvmloader has no interaction with xc_domain_restore().
I did not mean to imply that it did.  Somehow I lost track of the fact
that this is a bug in master:

[root@dcs-xen-52 don]# xl cre -p /home/don/e1000x8.xfg
Parsing config from /home/don/e1000x8.xfg
got a tsc mode string: "native_paravirt"
[root@dcs-xen-52 don]# xl save e1000x8 e1000x8.save
Saving to e1000x8.save new xl format (info 0x1/0x0/3506)
xc: Saving memory: iter 0 (last sent 0 skipped 0): 1044481/1044481  100%
[root@dcs-xen-52 don]# xl restore e1000x8.save
Loading new save file e1000x8.save (new xl fmt info 0x1/0x0/3506)
  Savefile contains xl domain config in JSON format
Parsing config from <saved>
xc: error: Failed to allocate memory for batch.!: Internal error
libxl: error: libxl_create.c:1057:libxl__xc_domain_restore_done:
restoring domain: Cannot allocate memory
libxl: error: libxl_create.c:1129:domcreate_rebuild_done: cannot
(re-)build domain: -3
libxl: error: libxl.c:1576:libxl__destroy_domid: non-existant domain 2
libxl: error: libxl.c:1540:domain_destroy_callback: unable to destroy
guest with domid 2
libxl: error: libxl_create.c:1490:domcreate_destruction_cb: unable to
destroy domain 2 following failed creation
[root@dcs-xen-52 don]#

The hvmloader part turns out to be a "warning message" that is OK to
ignore.  However, "xl save" followed by "xl restore" is currently broken
without some fix.

It is xl's job to propagate the current memory as part of migration.  If
qemu has had to bump it up, this should be reflected in the domain
config file.
Not at all sure how QEMU (either in dom0 or a driver domain) is expected
to change a file (the domain config file).  My guess is that you are
saying that "xl save" (before xc_domain_save) is the correct place to
record (or add) the extra info.  However, it looks to me like all the
data needed is already in the save file, but I could be wrong.

The page data is correctly saved into the save file (migration stream).
However, when the new domain is created, its size is "wrong".  You
cannot adjust any of the config info to fix the size, because the option
ROM space to reserve is not an existing config option.  So if I am
following you correctly, libxl needs to add and process more info to
handle this case.
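
[Roughly, the restore-side approach amounts to something like the sketch
below.  This is an illustration of the idea, not the patch itself; the
helper name is made up:]

#include <xenctrl.h>

/* Illustration of the restore-side idea: if the stream carries more
 * pages than the target domain's current limit allows, raise the
 * limit before populating the batch.  Not the actual patch. */
static int ensure_room_for_stream(xc_interface *xch, uint32_t domid,
                                  unsigned long stream_pages)
{
    xc_dominfo_t info;

    if (xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
        info.domid != domid)
        return -1;

    /* max_memkb is in KiB; convert to 4K pages for the comparison. */
    if (stream_pages > (info.max_memkb >> (XC_PAGE_SHIFT - 10)))
        return xc_domain_setmaxmem(xch, domid,
                                   stream_pages << (XC_PAGE_SHIFT - 10));
    return 0;
}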

The migration path should not be hacked up to fix a higher level
toolstack bug.
I do not see how QEMU is part of the "higher level toolstack".  If you
mean xl vs xc, then I can see "xl save" somehow doing the work.  This
patch works for me, but I am happy to explore other ways to fix this
bug.

If I understand correctly, the steps are this:

* 'xl create' makes a VM of size $FOO
* qemu bumps the size to $FOO+$N
* 'xl save' writes $FOO+$N of page data, but the xl config file at the
start of the image still says $FOO
* 'xl restore' creates a VM of size $FOO, then instructs
xc_domain_restore() to put $FOO+$N pages into it.

I would argue, first, that qemu should not play in this area to start with.

However, the real bug here is that the domain configuration written by
xl save is inaccurate.
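
[In other words, something along these lines at save time would keep the
recorded config honest.  This is a sketch against the libxl API;
refresh_saved_maxmem() is a made-up stand-in for the relevant spot in
xl's save path:]

#include <libxl.h>

/* Sketch: before the domain config is serialised into the save image,
 * refresh max_memkb from the hypervisor so that $FOO+$N is recorded
 * rather than the stale $FOO from the original config file. */
static void refresh_saved_maxmem(libxl_ctx *ctx, uint32_t domid,
                                 libxl_domain_config *d_config)
{
    libxl_dominfo info;

    libxl_dominfo_init(&info);
    if (!libxl_domain_info(ctx, &info, domid))
        d_config->b_info.max_memkb = info.max_memkb;
    libxl_dominfo_dispose(&info);
}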

There's a case like COLO:
1. Both Primary/Secondary VM are created first with the same config file
   which makes a VM of size $FOO
2. qemu bumps the size to $FOO+$N
3. 'save' writes $FOO+$N of page data
4. 'restore' puts $FOO+$N pages into a $FOO-sized domain, which causes an error

Even if you fix the configuration, the bug still exists, because the
config is only transferred from Primary to Secondary once, at the very
beginning, when you execute the 'xl remus' command.  The problem is how
to let the secondary (restore) side know about the size change: through
a migration command (which is easier in v2 migration) or some other
solution?  From this point of view, I think Don's solution is one way to
solve the problem.
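
[One conceivable shape for that, purely as a sketch: a small v2
migration-stream record emitted whenever qemu grows the domain, which
the restore side applies before the next page batch.  The record type
and layout below are invented; nothing like this exists in the stream
format today.]

#include <stdint.h>

/* Hypothetical v2 migration-stream record for propagating a size
 * change to the restore side.  Type id and layout are invented. */
#define REC_TYPE_HYPOTHETICAL_SET_MAX_MEM 0xf0000001u

struct rec_set_max_mem {
    uint64_t max_memkb;   /* new limit for the receiving domain */
};

/* On receipt, the restore side would call
 *     xc_domain_setmaxmem(xch, domid, rec->max_memkb);
 * before processing any further PAGE_DATA records. */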


~Andrew


--
Thanks,
Yang.
