Re: [Xen-devel] [PATCH v3 00/18] libxl: domain save/restore: run in a separate process
On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> This is v3 of my series to asyncify save/restore, rebased to current
> tip, retested, and with all comments addressed.
There are quite a lot of combinations that need testing here (PV, HVM,
HVM w/ stub dm, old vs new qemu, etc.). Which of those have you tried?
I tried a simple localhost migrate of a PV guest and:
# xl -vvv migrate d32-1 localhost
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/3541)
libxl: debug: libxl.c:722:libxl_domain_suspend: ao 0x8069720: create: how=(nil) callback=(nil) poller=0x80696c8
Loading new save file <incoming migration stream> (new xl fmt info 0x0/0x0/3541)
Savefile contains xl domain config
libxl: debug: libxl_dom.c:969:libxl__toolstack_save: domain=2 toolstack data size=8
libxl: debug: libxl.c:745:libxl_domain_suspend: ao 0x8069720: inprogress: poller=0x80696c8, flags=i
libxl-save-helper: debug: starting save: Success
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 0/131072 0%
at which point it appears to just stop.
# strace -p 2872 # /usr/lib/xen/bin/libxl-save-helper --save-domain 8 2 0 0 1 0 0 12 8 72
Process 2872 attached - interrupt to quit
write(8, 0xb5d31000, 1974272^C <unfinished ...>
Process 2872 detached
# strace -p 2866 # /usr/lib/xen/bin/libxl-save-helper --restore-domain 0 3 1 0 2 0 0 1 0 0 0
Process 2866 attached - interrupt to quit
read(0, ^C <unfinished ...>
# strace -p 4070 # xl -vvv migrate d32-1 localhost
Process 4070 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>
# strace -p 4074 # xl migrate-receive
Process 4074 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>
So the saver seems to be blocked writing to fd 8, which is argv[1] == io_fd.
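One way to follow this up would be to check whether the saver's fd 8 and the restorer's fd 0 are actually two ends of the same pipe, or whether something else sits in between. A minimal sketch, assuming Linux /proc and sufficient privileges to read another process's fd directory; the helper names fd_target and same_pipe are made up for illustration and are not part of libxl or xl:

```python
import os


def fd_target(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points at, e.g. 'pipe:[12345]'.

    Linux-specific; reading another user's process typically needs root.
    """
    return os.readlink(f"/proc/{pid}/fd/{fd}")


def same_pipe(pid_a, fd_a, pid_b, fd_b):
    """True if both descriptors refer to the same pipe inode.

    Both ends of one pipe show the same 'pipe:[inode]' link target,
    so comparing the strings is enough for this check.
    """
    a = fd_target(pid_a, fd_a)
    b = fd_target(pid_b, fd_b)
    return a.startswith("pipe:[") and a == b
```

With the pids from the strace runs above, same_pipe(2872, 8, 2866, 0) would say whether the two helpers are talking to each other directly. If it returns False, printing fd_target() for each side shows what each descriptor is really connected to.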
Also FWIW:
# xl list
Name                 ID   Mem VCPUs      State   Time(s)
Domain-0              0   511     4     r-----      24.5
d32-1                 2   128     4     -b----       0.4
d32-1--incoming       3     0     0     --p---       0.0
/var/log/xen/xl-d32-1.log contains just "Waiting for domain d32-1 (domid 9) to
die [pid 4045]" (NB: this was a newer attempt than the ones above, to be
sure I was looking at the right log, so the domids don't match; 9 ==
d32-1, not the incoming one). There is no xl log for the incoming domain.
Also it'd be worth pinging/CCing Shriram next time to get him to sanity
test the Remus cases too.
I'm in the middle of reviewing #5/19 (the meat), I'll keep going
although I doubt I'll spot the cause of this...
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel