
Re: [Xen-devel] Live vm migration broken in latest xen-unstable


  • To: "Tim Deegan" <Tim.Deegan@xxxxxxxxxxxxx>
  • From: "sanjay kushwaha" <sanjay.kushwaha@xxxxxxxxx>
  • Date: Fri, 8 Sep 2006 09:48:11 -0400
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 08 Sep 2006 06:48:34 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi Tim,
Yes, this patch solved my problem. Live VM migration now works, but I
am experiencing another problem: when I reattach to the migrated VM's
console on the destination machine, I see the following traceback in
the guest VM's dmesg.

------------[ cut here ]------------
kernel BUG at drivers/xen/netfront/netfront.c:717!
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU:    0
EIP:    0061:[<c02598e0>]    Not tainted VLI
EFLAGS: 00010082   (2.6.16.13-xenU #15)
EIP is at network_alloc_rx_buffers+0x470/0x4b0
eax: c056fc80   ebx: ccbc0d80   ecx: d1001040   edx: c05d0000
esi: c05d02a0   edi: c05d034c   ebp: c0551f14   esp: c0551eac
ds: 007b   es: 007b   ss: 0069
Process xenwatch (pid: 8, threadinfo=c0550000 task=c057a540)
Stack: <0>00000208 00000000 0000cbc1 00000000 c05d034c c05d31b8 c05d0000 00000000
      00000337 00000337 0000002f 00000208 ccbc1000 000000d1 cdd25838 00000100
      000000d1 00000011 c05d0000 00000001 c02f8013 c05d0000 00000001 c05d02a0
Call Trace:
[<c01058ed>] show_stack_log_lvl+0xcd/0x120
[<c0105aeb>] show_registers+0x1ab/0x240
[<c0105e11>] die+0x111/0x240
[<c0106178>] do_trap+0x98/0xe0
[<c0106491>] do_invalid_op+0xa1/0xb0
[<c01052c7>] error_code+0x2b/0x30
[<c025a554>] backend_changed+0x1a4/0x250
[<c025517e>] otherend_changed+0x7e/0x90
[<c0253361>] xenwatch_handle_callback+0x21/0x60
[<c025345d>] xenwatch_thread+0xbd/0x160
[<c0132b0c>] kthread+0xec/0xf0
[<c0102c45>] kernel_thread_helper+0x5/0x10
Code: 82 04 15 00 00 8d 9e f8 14 00 00 e8 db 78 ea ff 8b 5d d8 39 9a
fc 14 00 00 0f 84 16 fe ff ff e9 51 ff ff ff 8d b4 26 00 00 00 00 <0f>
0b cd 02 2c a0 2f c0 e9 cd fc ff ff 0f 0b d1 02 2c a0 2f c0

[root@localhost ~]#


Does anyone know if this is a known problem?

Thanks for your help.
Sanjay

On 9/7/06, Tim Deegan <Tim.Deegan@xxxxxxxxxxxxx> wrote:
Hi Sanjay,

Does the attached patch fix this problem for you?  It tries to make sure
there is enough spare memory to enable shadow pagetables on the domain
before starting the migration.
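
The patch itself is not reproduced in this message. As a rough illustration of the idea described above (reserve the shadow memory first, and only then enable dirty logging), here is a minimal toy model in C. Every name in it (host_model, domain_model, reserve_shadow_pages, start_live_save) is hypothetical and is not taken from Xen or from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model: names and numbers are illustrative, not Xen's. */
struct host_model {
    unsigned long free_pages;    /* pages the host can still hand out */
};

struct domain_model {
    unsigned long shadow_pages;  /* pages currently in the shadow pool */
};

/* Grow the shadow pool to `want` pages, or fail and change nothing. */
static int reserve_shadow_pages(struct host_model *h, struct domain_model *d,
                                unsigned long want)
{
    unsigned long need;

    assert(h != 0 && d != 0);

    if (d->shadow_pages >= want)
        return 0;                /* pool is already large enough */

    need = want - d->shadow_pages;
    if (h->free_pages < need)
        return -ENOMEM;          /* not enough spare memory on the host */

    h->free_pages -= need;
    d->shadow_pages = want;
    return 0;
}

/* Check the reservation up front; only then would logging be enabled. */
static int start_live_save(struct host_model *h, struct domain_model *d,
                           unsigned long want)
{
    int rc = reserve_shadow_pages(h, d, want);

    if (rc < 0)
        return rc;               /* fail early, before migration starts */

    /* A real tool would enable log-dirty mode at this point. */
    return 0;
}
```

In this model a failed reservation leaves both the host and the domain untouched, so the caller can report the shortage before any migration state has been set up.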

Cheers,

Tim

At 15:09 -0400 on 05 Sep (1157468999), sanjay kushwaha wrote:
> Hi Ewan,
> I did an "hg pull -u" on my tree, which also pulled in changeset 11422, but I am
> still facing the same problem. By the way, this changeset seems to be specific to
> HVM domains, while I am seeing this problem with a paravirtualized domain.
>
> Thanks,
> Sanjay
>
> On 9/5/06, Ewan Mellor <ewan@xxxxxxxxxxxxx> wrote:
> >
> >On Fri, Sep 01, 2006 at 05:36:00PM -0400, sanjay kushwaha wrote:
> >
> >> Folks,
> >> Live migration is not working for me in the latest xen-unstable. I get the
> >> following error during migration:
> >>
> >> [root@pc5 ksanjay]# xm migrate --live 1 199.77.138.23
> >> Error: /usr/lib/xen/bin/xc_save 18 1 0 0 1 failed
> >> [root@pc5 ksanjay]#
> >>
> >> I traced the problem to a function in xen named set_sh_allocation() in
> >> file xen/arch/x86/mm/shadow/common.c
> >>
> >> tools/libxc/xc_linux_save.c:xc_linux_save() is called from the Python
> >> script and makes the following hypercall:
> >>
> >>     if (live) {
> >>         if (xc_shadow_control(xc_handle, dom,
> >>                               XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
> >>                               NULL, 0, NULL, 0, NULL) < 0) {
> >>             ERR("Couldn't enable shadow mode");
> >>             goto out;
> >>         }
> >>         last_iter = 0;
> >>     } else {
> >> -----------
> >>
> >> This hypercall leads to a call to set_sh_allocation(), which fails in the
> >> following code:
> >>
> >>         if ( d->arch.shadow.total_pages < pages )
> >>         {
> >>             /* Need to allocate more memory from domheap */
> >>             pg = alloc_domheap_pages(NULL, SHADOW_MAX_ORDER, 0);
> >>             if ( pg == NULL )
> >>             {
> >>                 SHADOW_PRINTK("failed to allocate shadow pages.\n");
> >>                 return -ENOMEM;
> >>             }
> >>
> >> alloc_domheap_pages() fails and returns NULL. However, I think I have enough
> >> memory available, so this function should not fail.
> >>
> >> Is anybody else experiencing the same problem? Could someone please tell me
> >> how to fix it?
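
The quoted fragment grows the shadow pool one block of order SHADOW_MAX_ORDER at a time and gives up as soon as the allocator returns NULL. A minimal toy model of that growth branch only, assuming an allocator that hands out whole blocks of 2^order pages, might look like the sketch below. MODEL_ORDER, heap_model, pool_model, alloc_chunk and grow_pool are hypothetical names, and the order value is a placeholder, not Xen's.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_ORDER 2  /* hypothetical stand-in for SHADOW_MAX_ORDER */

struct heap_model {
    unsigned int free_chunks;   /* free blocks of 2^MODEL_ORDER pages */
};

struct pool_model {
    unsigned long total_pages;  /* pages currently in the shadow pool */
};

/* Take one whole block from the heap; returns 1 on success, 0 if none left. */
static int alloc_chunk(struct heap_model *h)
{
    assert(h != 0);
    if (h->free_chunks == 0)
        return 0;
    h->free_chunks--;
    return 1;
}

/* Grow the pool block by block until it holds at least `pages` pages. */
static int grow_pool(struct heap_model *h, struct pool_model *p,
                     unsigned long pages)
{
    while (p->total_pages < pages) {
        if (!alloc_chunk(h))
            return -ENOMEM;     /* mirrors the NULL check in the fragment */
        p->total_pages += 1UL << MODEL_ORDER;
    }
    return 0;
}
```

Because each request asks for a whole block of 2^order pages, success depends on a block of that size being available, not only on the total number of free pages. In this model, pages added before a failure stay in the pool.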
> >
> >I've put some changes into xen-unstable today which might help.  The last fix
> >is on its way through testing now.  Look out for xen-unstable changeset
> >11422, and try that, see how you get on.
> >
> >Cheers,
> >
> >Ewan.
> >
>
>
>
> --
> ----------------------
> PhD Student, Georgia Tech
> http://www.cc.gatech.edu/~ksanjay/

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel






--
----------------------
PhD Student, Georgia Tech
http://www.cc.gatech.edu/~ksanjay/



 

