Xen project Mailing List

Re: [Xen-users] block-attach crashes domU

From: Pasi Kärkkäinen <pasik@xxxxxx>

Date: Fri, 22 Jan 2010 10:00:18 +0200

Delivery-date: Fri, 22 Jan 2010 00:01:59 -0800

List-id: Xen user discussion <xen-users.lists.xensource.com>

On Thu, Jan 21, 2010 at 10:21:26PM -0800, Tracy Reed wrote:
> I have run into a strange situation where a domain will not boot with
> a certain disk specified in the config file and trying to block-attach
> it after it starts results in the domain disappearing from the list
> and presumably simply crashing.
>
> I am running CentOS 5.4 with kernel 2.6.18-160.el5xen x86_64
>
> For months everything worked perfectly with these domains using
> an AoE SAN for the back-end. I have used this sort of setup for
> several years and it is great. But these domains in particular have
> been running for several months. Then 3 of the 4 domU's I run were
> really heavily slammed and became unresponsive and I ended up having
> to do an xm destroy on them. After that they refuse to come back
> up. One of my domU's has not been rebooted and it continues to work
> great with all 4 disk devices attached.
>
> Here is my domU config file:
>
> name = "db2"
> uuid = "f253cab5-c3de-c1f7-e735-5d4f0bfcd3ff"
> maxmem = 16384
> memory = 2048
> vcpus = 4
> bootloader = "/usr/bin/pygrub"
> on_poweroff = "destroy"
> on_reboot = "restart"
> on_crash = "restart"
> vfb = [ ]
> disk = [ "phy:/dev/etherd/e1.12,xvda,w", "phy:/dev/etherd/e2.12,xvdb,w",
> "phy:/dev/etherd/e3.1,xvdc,w", "phy:/dev/etherd/e4.1,xvdd,w" ]
> vif = [ "mac=00:16:3e:5b:5c:dd,bridge=dmz" ]
>
> If I boot the domU with this config file I get the following on boot:
>
> Red Hat nash version 5.1.19.6 starting
> Mounting proc filesystem
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Setting up hotplug.
> Creating block device nodes.
> Loading ehci-hcd.ko module
> Loading ohci-hcd.ko module
> Loading uhci-hcd.ko module
> USB Universal Host Controller Interface driver v3.0
> Loading jbd.ko module
> Loading ext3.ko module
> Loading raid1.ko module
> md: raid1 personality registered for level 1
> Loading xenblk.ko module
> Registering block device major 202
> xvda: xvda1 xvda2 xvda3 xvda4 < xvda5 >
> xvdb: xvdb1 xvdb2 xvdb3 xvdb4 < xvdb5 >
> xvdc: xvdc1
> kobject_add failed for xvda with -EEXIST, don't try to register things with
> the same name in the same directory.
>
> Call Trace:
> [<ffffffff803404ea>] kobject_add+0x170/0x19b
> [<ffffffff8025cfd5>] exact_lock+0x0/0x14
> [<ffffffff8029b92c>] keventd_create_kthread+0x0/0xc4
> [<ffffffff802fb4e2>] register_disk+0x43/0x190
> [<ffffffff8029b92c>] keventd_create_kthread+0x0/0xc4
> [<ffffffff80336c3a>] add_disk+0x34/0x3d
> [<ffffffff88084ec9>] :xenblk:backend_changed+0x110/0x193
> [<ffffffff803b32fa>] xenbus_read_driver_state+0x26/0x3b
> [<ffffffff803b4bdb>] xenwatch_thread+0x0/0x135
> [<ffffffff803b402d>] xenwatch_handle_callback+0x15/0x48
> [<ffffffff803b4cf7>] xenwatch_thread+0x11c/0x135
> [<ffffffff8029bb44>] autoremove_wake_function+0x0/0x2e
> [<ffffffff8029b92c>] keventd_create_kthread+0x0/0xc4
> [<ffffffff80233bcd>] kthread+0xfe/0x132
> [<ffffffff80260b2c>] child_rip+0xa/0x12
> [<ffffffff8029b92c>] keventd_create_kthread+0x0/0xc4
> [<ffffffff80233acf>] kthread+0x0/0x132
> [<ffffffff80260b22>] child_rip+0x0/0x12
>
> Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> [<ffffffff802fe512>] create_dir+0x11/0x1cf
> PGD 7f1c9067 PUD 7f1ca067 PMD 0
> Oops: 0000 [1] SMP
> last sysfs file: /block/ram0/dev
> CPU 1
> Modules linked in: xenblk raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
> Pid: 9, comm: xenwatch Not tainted 2.6.18-164.el5xen #1
> RIP: e030:[<ffffffff802fe512>] [<ffffffff802fe512>] create_dir+0x11/0x1cf
> RSP: e02b:ffff880000fbfda0 EFLAGS: 00010282
> RAX: ffff88007f31b870 RBX: ffff88007f3cd4f0 RCX: ffff880000fbfdd8
> RDX: ffff88007f3cd4f8 RSI: 0000000000000000 RDI: ffff88007f3cd4f0
> RBP: ffff88007f3cd4f0 R08: 0000000000000001 R09: ffff88000114c000
> R10: ffffffff8029b92c R11: ffff880000fbfbb0 R12: ffff88007f3cd4f0
> R13: ffff880000fbfdd8 R14: 0000000000000000 R15: ffff88007f31b870
> FS: 0000000000000000(0000) GS:ffffffff805ca080(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000
>
> The /dev/etherd/e4.1 backend to the xvdd device is present in the dom0
> and works perfectly. I can access it from within the dom0 with no
> problem.
>
> Something is confused. I would really like to avoid rebooting the
> dom0's if at all possible.
>
> I have found that if I remove the "phy:/dev/etherd/e4.1,xvdd,w" from
> the disk = line the domU boots fine. But if I try to block-attach the
> missing device the domU dies instantly.
>
> I have been looking for logs that might explain something about why it
> died but I cannot find anything relevant. I have googled the "don't
> try to register things with the same name in the same directory" error
> and found a few references to it but none in the context of xen.
>
> Any advice would be greatly appreciated.
>

Does it work if you attach some local LVM volume or file image
(non-AoE) as xvdd?

Do you get errors in dom0 "dmesg"? How about dom0 /var/log/messages?
Do you get errors in dom0 "xm log"? How about "xm dmesg"?

-- Pasi

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
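
For anyone who wants to try the non-AoE test suggested in the reply, here is a rough Python sketch (xm config files are Python syntax, so the disk list can be pasted in directly). It does two things: checks that no frontend name appears twice in the disk list, and builds a test list with xvdd pointed at a local file image. The helper names and the image path /var/lib/xen/images/test-xvdd.img are made up for illustration and are not from the thread; the image would have to exist in dom0 first. This only rewrites the config strings and does not call xm or touch any running domain.

```python
# Sketch only: helper names and the image path are hypothetical.

def parse_disk(spec):
    """Split an xm disk spec 'backend,frontend,mode' into its three fields."""
    backend, frontend, mode = spec.split(",")
    return backend, frontend, mode


def duplicate_frontends(disk):
    """Return frontend names that appear more than once in the disk list."""
    seen, dups = set(), []
    for spec in disk:
        _, frontend, _ = parse_disk(spec)
        if frontend in seen and frontend not in dups:
            dups.append(frontend)
        seen.add(frontend)
    return dups


def swap_backend(disk, frontend, new_backend):
    """Return a copy of the disk list with one frontend on a new backend."""
    out = []
    for spec in disk:
        backend, name, mode = parse_disk(spec)
        if name == frontend:
            backend = new_backend
        out.append(",".join((backend, name, mode)))
    return out


# Disk list as posted in the original message.
disk = ["phy:/dev/etherd/e1.12,xvda,w", "phy:/dev/etherd/e2.12,xvdb,w",
        "phy:/dev/etherd/e3.1,xvdc,w", "phy:/dev/etherd/e4.1,xvdd,w"]

# Assumed local image path; create the file in dom0 before using it.
test_disk = swap_backend(disk, "xvdd", "file:/var/lib/xen/images/test-xvdd.img")

print("duplicate frontends:", duplicate_frontends(disk))
print("test disk line:", test_disk)
```

If the domU boots with the test list, that would point toward the e4.1 backend or its state in dom0 rather than the xvdd name itself, but that is an inference to check against the dom0 logs, not a conclusion.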
