Re: [Xen-devel] [BUG] repeated live migration for VM failed
George, thanks for the fix. With the patch, the test has run 90+ live migrations without any error so far; let's wait for the final result.

Thanks,
-Xudong

> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> Sent: Monday, May 22, 2017 7:03 PM
> To: Hao, Xudong <xudong.hao@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>;
> Gao, Chao <chao.gao@xxxxxxxxx>; Paul Durrant <paul.durrant@xxxxxxxxxx>;
> Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Jan Beulich <JBeulich@xxxxxxxx>
> Subject: Re: [Xen-devel] [BUG] repeated live migration for VM failed
>
> On Mon, May 22, 2017 at 11:18 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> > On 22/05/17 07:35, Hao, Xudong wrote:
> >> Bug detailed description:
> >> ----------------
> >> Create one RHEL7.3 HVM and do live migration continuously, while doing the
> >> 200+ or 300+ times live-migration, tool stack report error and migration
> >> failed.
> >>
> >> Environment :
> >> ----------------
> >> HW: Skylake server
> >> Xen: Xen 4.9.0 RC4
> >> Dom0: Linux 4.11.0
> >>
> >> Reproduce steps:
> >> ----------------
> >> 1. Compile Xen 4.9 Rc4 and dom0 kernel 4.11.0, boot to dom0
> >> 2. Boot RHEL7.3 HVM guest
> >> 3. Migrate guest to localhost, sleep 10 seconds
> >> 4. Repeat doing the step3.
> >>
> >> Current result:
> >> ----------------
> >> VM Migration fail.
> >>
> >> Base error log:
> >> ----------------
> >> xl migrate 24hrs_lm_guest_2 localhost
> >> root@localhost's password:
> >> migration target: Ready to receive domain.
> >> Saving to migration stream new xl format (info 0x3/0x0/1761)
> >> Loading new save file <incoming migration stream> (new xl fmt info
> >> 0x3/0x0/1761)
> >> Savefile contains xl domain config in JSON format
> >> Parsing config from <saved>
> >> xc: info: Saving domain 273, type x86 HVM
> >> xc: info: Found x86 HVM domain from Xen 4.9
> >> xc: info: Restoring domain
> >> xc: error: set HVM param 12 = 0x00000000feffe000 (85 = Interrupted
> >> system call should ): Internal error
> >> xc: error: Restore failed (85 = Interrupted system call should ):
> >> Internal error
> >
> > Interesting -- it appears that setting HVM_PARAM_IDENT_PT (#12) can
> > fail with -ERESTART. But the comment for ERESTART makes it explicit
> > that it should be internal only -- it should cause a hypercall
> > continuation (so that the hypercall restarts automatically), rather
> > than returning to the guest.
> >
> > But the hypercall continuation code seems to have disappeared from
> > do_hvm_op() at some point?
> >
> > /me digs a bit more...
>
> The problem turns out to be commit ae20ccf ("dm_op: convert
> HVMOP_set_mem_type"), which says:
>
>   This patch removes the need for handling HVMOP restarts, so that
>   infrastructure is removed.
>
> While it's true that there are no more operations which need iteration
> information restored, but there are two operations which may still need to be
> restarted to avoid deadlocks with other operations.
>
> Attached is a patch which restores hypercall continuation checking.
> Xudong, can you give it a test?
>
> Thanks,
> -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
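
For readers following the thread, the failure mode George describes can be illustrated with a toy model in C. This is only a sketch and not Xen code: `toy_set_param`, the two dispatcher functions, `struct toy_op` and `SKETCH_ERESTART` are hypothetical names made up for this example, and the value 85 is simply taken from the error log quoted above. The handler backs off with an internal "restart me" code instead of blocking on a contended lock. A dispatcher that returns that code directly leaks it to the caller (the symptom in the log), while one that checks for it re-runs the operation until it completes.

```c
#include <assert.h>

/* Hypothetical stand-in for the internal-only restart code (85 in the log). */
#define SKETCH_ERESTART 85

/* Hypothetical state for one operation in this toy model. */
struct toy_op {
    int lock_busy_rounds; /* number of attempts that will find the lock contended */
    int attempts;         /* number of times the handler has been invoked */
};

/*
 * Toy handler: rather than blocking on a contended lock (and risking a
 * deadlock with another operation), it asks to be restarted.
 */
int toy_set_param(struct toy_op *op)
{
    assert(op);
    op->attempts++;
    if (op->lock_busy_rounds > 0) {
        op->lock_busy_rounds--;
        return -SKETCH_ERESTART;
    }
    return 0; /* the parameter was set */
}

/* Dispatcher without a restart check: the internal code reaches the caller. */
int dispatch_without_continuation(struct toy_op *op)
{
    return toy_set_param(op);
}

/*
 * Dispatcher with a restart check: the operation is re-run until it stops
 * asking for a restart, so the caller never sees the internal code.
 * (A real hypervisor would arrange for the hypercall to be re-entered
 * rather than spin in a loop; the loop is a simplification.)
 */
int dispatch_with_continuation(struct toy_op *op)
{
    int rc;

    do {
        rc = toy_set_param(op);
    } while (rc == -SKETCH_ERESTART);

    return rc;
}
```

With `lock_busy_rounds` set to 2, `dispatch_without_continuation` returns `-SKETCH_ERESTART` after one attempt, whereas `dispatch_with_continuation` returns 0 after three attempts.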