Re: [Xen-devel] [BUG] repeated live migration for VM failed
George,

The live migration passed 500+ times with this patch; I think it's fine to merge it into Xen 4.9.

Tested-by: Xudong Hao <xudong.hao@xxxxxxxxx>

Thanks,
-Xudong

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Hao, Xudong
> Sent: Tuesday, May 23, 2017 5:23 PM
> To: George Dunlap <george.dunlap@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>; Paul Durrant <paul.durrant@xxxxxxxxxx>; Jan Beulich <JBeulich@xxxxxxxx>; Gao, Chao <chao.gao@xxxxxxxxx>
> Subject: Re: [Xen-devel] [BUG] repeated live migration for VM failed
>
> George, thanks the fixing.
> With the patch, the testing is running on 90+ time LM without any error till now,
> let's wait for the final result.
>
> Thanks,
> -Xudong
>
> > -----Original Message-----
> > From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> > Sent: Monday, May 22, 2017 7:03 PM
> > To: Hao, Xudong <xudong.hao@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> > Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>; Gao, Chao <chao.gao@xxxxxxxxx>; Paul Durrant <paul.durrant@xxxxxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Jan Beulich <JBeulich@xxxxxxxx>
> > Subject: Re: [Xen-devel] [BUG] repeated live migration for VM failed
> >
> > On Mon, May 22, 2017 at 11:18 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> > > On 22/05/17 07:35, Hao, Xudong wrote:
> > >> Bug detailed description:
> > >> ----------------
> > >> Create one RHEL7.3 HVM and do live migration continuously, while doing the 200+ or 300+ times live-migration, tool stack report error and migration failed.
> > >>
> > >> Environment :
> > >> ----------------
> > >> HW: Skylake server
> > >> Xen: Xen 4.9.0 RC4
> > >> Dom0: Linux 4.11.0
> > >>
> > >> Reproduce steps:
> > >> ----------------
> > >> 1. Compile Xen 4.9 Rc4 and dom0 kernel 4.11.0, boot to dom0
> > >> 2. Boot RHEL7.3 HVM guest
> > >> 3. Migrate guest to localhost, sleep 10 seconds
> > >> 4. Repeat doing the step3.
> > >>
> > >> Current result:
> > >> ----------------
> > >> VM Migration fail.
> > >>
> > >> Base error log:
> > >> ----------------
> > >> xl migrate 24hrs_lm_guest_2 localhost
> > >> root@localhost's password:
> > >> migration target: Ready to receive domain.
> > >> Saving to migration stream new xl format (info 0x3/0x0/1761)
> > >> Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1761)
> > >> Savefile contains xl domain config in JSON format
> > >> Parsing config from <saved>
> > >> xc: info: Saving domain 273, type x86 HVM
> > >> xc: info: Found x86 HVM domain from Xen 4.9
> > >> xc: info: Restoring domain
> > >> xc: error: set HVM param 12 = 0x00000000feffe000 (85 = Interrupted system call should ): Internal error
> > >> xc: error: Restore failed (85 = Interrupted system call should ): Internal error
> > >
> > > Interesting -- it appears that setting HVM_PARAM_IDENT_PT (#12) can
> > > fail with -ERESTART. But the comment for ERESTART makes it explicit
> > > that it should be internal only -- it should cause a hypercall
> > > continuation (so that the hypercall restarts automatically), rather
> > > than returning to the guest.
> > >
> > > But the hypercall continuation code seems to have disappeared from
> > > do_hvm_op() at some point?
> > >
> > > /me digs a bit more...
> >
> > The problem turns out to be commit ae20ccf ("dm_op: convert
> > HVMOP_set_mem_type"), which says:
> >
> >     This patch removes the need for handling HVMOP restarts, so that
> >     infrastructure is removed.
> >
> > While it's true that there are no more operations which need iteration
> > information restored, but there are two operations which may still
> > need to be restarted to avoid deadlocks with other operations.
> >
> > Attached is a patch which restores hypercall continuation checking.
> > Xudong, can you give it a test?
> >
> > Thanks,
> > -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
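
For anyone who wants to repeat the test, the reproduce steps quoted above (migrate the guest to localhost, sleep 10 seconds, repeat) could be scripted roughly as follows. This is only a sketch, not the script used for the results in this thread: the function name `repeat_migrate` and the `MIGRATE_CMD` override are hypothetical names introduced here, and it assumes passwordless SSH to localhost so that `xl migrate` does not stop at the password prompt shown in the log.

```shell
#!/bin/sh
# Sketch of the repeated live-migration test from the bug report.
# repeat_migrate and MIGRATE_CMD are hypothetical names for this example;
# the only real command used is "xl migrate <domain> <host>".

repeat_migrate() {
    domain=$1          # guest name, e.g. 24hrs_lm_guest_2
    iterations=$2      # how many migrations to attempt
    pause=${3:-10}     # seconds to sleep between migrations (report used 10)
    i=1
    while [ "$i" -le "$iterations" ]; do
        # MIGRATE_CMD can be overridden for a dry run; defaults to xl migrate.
        if ! ${MIGRATE_CMD:-xl migrate} "$domain" localhost; then
            echo "migration $i failed" >&2
            return 1
        fi
        sleep "$pause"
        i=$((i + 1))
    done
    echo "completed $iterations migrations"
}
```

Calling `repeat_migrate 24hrs_lm_guest_2 500` after sourcing the file would attempt 500 migrations and stop at the first failure, which is where the tool stack error from the report would show up.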