Re: [Xen-devel] pv guests die after failed migration
On Fri, 2011-09-23 at 08:39 +0100, Andreas Olsowski wrote:
> Here is the full procedure:
[...]

Thanks, I should be able to figure out a repro with this, although I may
not get to it straight away.

> root@xenturio1:/usr/src/linux-2.6-xen# xl console thiswillfail
> PM: freeze of devices complete after 0.207 msecs
> PM: late freeze of devices complete after 0.058 msecs
> ------------[ cut here ]------------
> kernel BUG at drivers/xen/events.c:1466!
> invalid opcode: 0000 [#1] SMP
> CPU 0
> Modules linked in:
>
> Pid: 6, comm: migration/0 Not tainted 3.0.4-xenU #6
> RIP: e030:[<ffffffff8140d574>] [<ffffffff8140d574>]
> xen_irq_resume+0x224/0x370
> RSP: e02b:ffff88001f9fbce0 EFLAGS: 00010082
> RAX: ffffffffffffffef RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff88001f809ea8 RSI: ffff88001f9fbd00 RDI: 0000000000000001
> RBP: 0000000000000010 R08: ffffffff81859a00 R09: 0000000000000000
> R10: 0000000000000000 R11: 09f911029d74e35b R12: 0000000000000000
> R13: 000000000000f0a0 R14: 0000000000000000 R15: ffff88001f9fbd00
> FS: 00007ff28f8c8700(0000) GS:ffff88001fec6000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fff02056048 CR3: 000000001e4d8000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process migration/0 (pid: 6, threadinfo ffff88001f9fa000, task
> ffff88001f9f7170)
> Stack:
> ffff88001f9fbd34 ffff88001f9fbd54 0000000000000003 000000000000f100
> 0000000000000000 0000000000000003 0000000000000000 0000000000000003
> ffff88001fa6ddb0 ffffffff8140aa20 ffffffff81859a08 0000000000000000
> Call Trace:
> [<ffffffff8140aa20>] ? gnttab_map+0x100/0x130
> [<ffffffff815c2765>] ? _raw_spin_lock+0x5/0x10
> [<ffffffff81083e01>] ? cpu_stopper_thread+0x101/0x190
> [<ffffffff8140e1f5>] ? xen_suspend+0x75/0xa0
> [<ffffffff81083f1b>] ? stop_machine_cpu_stop+0x8b/0xd0
> [<ffffffff81083e90>] ? cpu_stopper_thread+0x190/0x190
> [<ffffffff81083dd0>] ? cpu_stopper_thread+0xd0/0x190
> [<ffffffff815c0870>] ? schedule+0x270/0x6c0
> [<ffffffff81083d00>] ? copy_pid_ns+0x2a0/0x2a0
> [<ffffffff81065846>] ? kthread+0x96/0xa0
> [<ffffffff815c4024>] ? kernel_thread_helper+0x4/0x10
> [<ffffffff815c3436>] ? int_ret_from_sys_call+0x7/0x1b
> [<ffffffff815c2be1>] ? retint_restore_args+0x5/0x6
> [<ffffffff815c4020>] ? gs_change+0x13/0x13
> Code: e8 f2 e9 ff ff 8b 44 24 10 44 89 e6 89 c7 e8 64 e8 ff ff ff c3 83
> fb 04 0f 84 95 fe ff ff 4a 8b 14 f5 20 95 85 81 e9 68 ff ff ff <0f> 0b
> eb fe 0f 0b eb fe 48 8b 1d fd 00 42 00 4c 8d 6c 24 20 eb
> RIP [<ffffffff8140d574>] xen_irq_resume+0x224/0x370
> RSP <ffff88001f9fbce0>
> ---[ end trace 82e2e97d58b5f835 ]---

This seems to be taking the non-cancelled resume path, does this patch
help at all:

diff -r d7b14b76f1eb tools/libxl/libxl.c
--- a/tools/libxl/libxl.c	Thu Sep 22 14:26:08 2011 +0100
+++ b/tools/libxl/libxl.c	Fri Sep 23 08:45:28 2011 +0100
@@ -246,7 +246,7 @@ int libxl_domain_resume(libxl_ctx *ctx, 
         rc = ERROR_NI;
         goto out;
     }
-    if (xc_domain_resume(ctx->xch, domid, 0)) {
+    if (xc_domain_resume(ctx->xch, domid, 1)) {
         LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
                         "xc_domain_resume failed for domain %u",
                         domid);

I don't think that's a solution but if this patch works then it may
indicate a problem with xc_domain_resume_any.

[...]
> I am not quite sure what you mean by "guest log".

The guest console log, which you provided above, thanks.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel