Re: [Xen-users] Error: Device 0 (vif) could not be connected. Hotplug scripts not working
Ok, now this is starting to get interesting. I previously had
xen-netback statically compiled into the kernel.
It's hard to debug static drivers, so I changed it to compile as a
module. And lo and behold, the kernel oops disappeared.
It is not a stable solution, though: sometimes I still get the same
oops, so it seems to be a race condition.
I'm running 2.6.32.14 from Jeremy's xen/stable-2.6.32.x branch.
On 01.06.2010 11:53, Helmut Wieser wrote:
Doesn't seem to make a difference.
I even downgraded to udevd 141, no change.
I found my problem here, and applied the patch from
http://lists.xensource.com/archives/html/xen-devel/2010-05/msg01462.html
But as it's incomplete it didn't help me with my configuration.
I even tried to compile 2.6.32.14 and still have the same issue.
This is the relevant part of my drivers/xen/netback/xenbus.c:
static int netback_uevent(struct xenbus_device *xdev,
			  struct kobj_uevent_env *env)
{
	struct backend_info *be;
	struct xen_netif *netif;
	char *val;

	DPRINTK("netback_uevent");

	be = dev_get_drvdata(&xdev->dev);
	if (!be)
		return 0;
	netif = be->netif;

	val = xenbus_read(XBT_NIL, xdev->nodename, "script", NULL);
	if (IS_ERR(val)) {
		int err = PTR_ERR(val);
		xenbus_dev_fatal(xdev, err, "reading script");
		return err;
	} else {
		if (add_uevent_var(env, "script=%s", val)) {
			kfree(val);
			return -ENOMEM;
		}
		kfree(val);
	}

	if (add_uevent_var(env, "vif=%s", netif->dev->name))
		return -ENOMEM;

	return 0;
}
This is the dmesg when I start a hvm domU for the first time:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
IP: [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
PGD 1e2bc067 PUD 1dd03067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/vif-1-0/uevent
CPU 7
Modules linked in: bridge stp llc ipv6 xen_netfront firewire_sbp2 snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd tpm_tis soundcore tpm serio_raw snd_page_alloc pcspkr tpm_bios wmi firewire_ohci usb_storage firewire_core crc_itu_t tg3 floppy [last unloaded: scsi_wait_scan]
Pid: 2141, comm: udevd Not tainted 2.6.32.14 #6 HP Z600 Workstation
RIP: e030:[<ffffffff8123610a>] [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
RSP: e02b:ffff88001d21fda8 EFLAGS: 00010246
RAX: 00200000000000c1 RBX: ffff88001cccde00 RCX: 0000000000800046
RDX: ffff88001d7e3b00 RSI: ffffea00006739a8 RDI: 00200000000002c0
RBP: ffff88001d21fdc8 R08: 0000000000000000 R09: ffffffff815c7cf0
R10: ffff88001e292904 R11: ffff88001e292154 R12: ffff88001e292000
R13: 0000000000000000 R14: ffff88001d7e3b80 R15: ffff88001e7be000
FS: 00007f7591154790(0000) GS:ffff880002ca2000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000110 CR3: 000000001d240000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 2141, threadinfo ffff88001d21e000, task ffff88001e6a16e0)
Stack:
ffff88001cccde40 ffff88001e292000 ffff88001cccde00 ffffffff815fe0e8
<0> ffff88001d21fdf8 ffffffff8122badc ffff88001cccde40 ffff88001e292000
<0> ffff88001fc49f60 ffff88001cccde50 ffff88001d21fe28 ffffffff81266772
Call Trace:
[<ffffffff8122badc>] xenbus_uevent_backend+0x90/0xab
[<ffffffff81266772>] dev_uevent+0x102/0x146
[<ffffffff81267459>] show_uevent+0x81/0xd8
[<ffffffff81266434>] dev_attr_show+0x22/0x49
[<ffffffff810a0e41>] ? __get_free_pages+0x9/0x46
[<ffffffff8112561c>] sysfs_read_file+0xac/0x12e
[<ffffffff810d459f>] vfs_read+0xa6/0x103
[<ffffffff810d46b2>] sys_read+0x45/0x69
[<ffffffff81012a82>] system_call_fastpath+0x16/0x1b
Code: c6 79 48 53 81 31 c0 4c 89 e7 e8 ea 03 f8 ff 85 c0 74 10 4c 89 f7 41 bc f4 ff ff ff e8 99 3e e9 ff eb 2d 4c 89 f7 e8 8f 3e e9 ff <49> 8b 95 10 01 00 00 4c 89 e7 31 c0 48 c7 c6 83 48 53 81 41 bc
RIP [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
RSP <ffff88001d21fda8>
CR2: 0000000000000110
---[ end trace 4f88c9bf70342ee1 ]---
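A note on reading this oops: the bracketed byte in the Code: line marks the start of the faulting instruction. Below is a minimal sketch of decoding it by hand, assuming the instruction has the form "mov disp32(%base), %dst" (REX.W + 8B /r with a 32-bit displacement and no index register). The struct mov_load and the function decode_mov_load are hypothetical names made up for this illustration; they are not part of any kernel or Xen tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: one decoded "mov disp32(%base), %dst". */
struct mov_load {
	int dst;       /* destination register number, 0..15 (2 = rdx) */
	int base;      /* base register number, 0..15 (13 = r13) */
	int32_t disp;  /* signed 32-bit displacement */
	size_t len;    /* instruction length in bytes */
};

/*
 * Decode REX.W + 8B /r with ModRM.mod == 10b (disp32 follows).
 * An optional SIB byte is accepted as long as it names no index register.
 * Returns 0 on success, -1 if the bytes are not of this form.
 */
int decode_mov_load(const uint8_t *code, size_t n, struct mov_load *out)
{
	size_t i = 0;
	int rex, modrm, rm;
	uint32_t d;

	assert(code != NULL && out != NULL);
	if (n < 3)
		return -1;

	rex = code[i++];
	if ((rex & 0xf8) != 0x48)       /* need a REX prefix with W = 1 */
		return -1;
	if (code[i++] != 0x8b)          /* opcode: mov r64, r/m64 */
		return -1;

	modrm = code[i++];
	if ((modrm >> 6) != 2)          /* mod must select disp32 */
		return -1;
	out->dst = ((modrm >> 3) & 7) | ((rex & 0x04) ? 8 : 0); /* REX.R */
	rm = modrm & 7;

	if (rm == 4) {                  /* rm == 100b: a SIB byte follows */
		int sib, index;

		if (i >= n)
			return -1;
		sib = code[i++];
		index = ((sib >> 3) & 7) | ((rex & 0x02) ? 8 : 0); /* REX.X */
		if (index != 4)         /* 4 means "no index register" */
			return -1;
		rm = sib & 7;           /* base register comes from SIB */
	}
	out->base = rm | ((rex & 0x01) ? 8 : 0);                /* REX.B */

	if (n - i < 4)
		return -1;
	/* The displacement is stored little-endian. */
	d = (uint32_t)code[i] | ((uint32_t)code[i + 1] << 8) |
	    ((uint32_t)code[i + 2] << 16) | ((uint32_t)code[i + 3] << 24);
	out->disp = (int32_t)d;
	out->len = i + 4;
	return 0;
}
```

Fed the bytes 49 8b 95 10 01 00 00 from the trace above, this yields base register 13 and displacement 0x110, i.e. mov 0x110(%r13),%rdx. R13 is 0 in the register dump and CR2 is 0x110, which would be consistent with netif (rather than xdev or be) being NULL when netif->dev is loaded. The 2.6.31.13 oops quoted further down has 49 8b 94 24 10 01 00 00 at the same spot, which decodes to base r12, and R12 is 0 there as well. The offset of dev within struct xen_netif is not shown in this thread, so treat the last step as an inference.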
I don't get it, because the patch is supposed to prevent NULL pointer
dereferences. Either xdev itself is corrupt, or it is returning corrupt data.
I'm stumped.
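One way to read the trace is that be is valid but be->netif has not been set yet when udevd reads the uevent file, which would fit the race-condition theory. The posted function only checks be before using netif->dev->name. Below is a userspace sketch of a guard for that case. The *_mock structs and emit_vif_var are hypothetical stand-ins with only the fields needed here, not the kernel's definitions, and this is not claimed to be the fix that went into any tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct net_device_mock {
	char name[16];
};

struct xen_netif_mock {
	struct net_device_mock *dev;
};

struct backend_info_mock {
	struct xen_netif_mock *netif;   /* may still be NULL early on */
};

/*
 * Write "vif=<name>" into buf, but only if the whole pointer chain
 * be -> netif -> dev is populated.
 * Returns 1 if the variable was written, 0 if it was skipped because
 * the interface is not attached yet, -1 if buf is too small.
 */
int emit_vif_var(const struct backend_info_mock *be, char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);

	/* Check every link, not just the first one. */
	if (!be || !be->netif || !be->netif->dev)
		return 0;

	n = snprintf(buf, len, "vif=%s", be->netif->dev->name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 1;
}
```

Returning 0 in the "not attached yet" case mirrors what the posted code already does for a NULL be: the uevent is still delivered, just without the vif variable.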
On 01.06.2010 09:03, Helmut Wieser wrote:
No joy. I couldn't find out what CONFIG_XEN_SYSFS does, but it doesn't
seem to be part of 2.6.31.13. I set all the other options apart from
wireless that you suggested.
I'll try to use Zhang Enming's kernel config next.
Oh, and of course I get the infamous oops from bug 1612, I just never
noticed it because my console doesn't work with gfx passthru.
Here's the output:
[ 167.571125] alloc irq_desc for 826 on node 0
[ 167.571131] alloc kstat_irqs on node 0
[ 167.724755] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
[ 167.724943] IP: [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[ 167.725066] PGD 1d5e6067 PUD 1ddd2067 PMD 0
[ 167.725296] Oops: 0000 [#1] SMP
[ 167.725472] last sysfs file: /sys/devices/vif-1-0/uevent
[ 167.725544] CPU 2
[ 167.725653] Modules linked in: bridge stp xenfs blktap pci_hotplug xen_blkfront xen_netfront xen_evtchn loop firewire_sbp2 snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep usbhid snd_pcm snd_timer snd hid wmi processor soundcore acpi_processor pcspkr psmouse snd_page_alloc serio_raw button evdev ext3 jbd mbcache usb_storage sr_mod sd_mod crc_t10dif cdrom tg3 firewire_ohci floppy thermal ahci firewire_core libphy libata thermal_sys crc_itu_t scsi_mod uhci_hcd ehci_hcd usbcore nls_base [last unloaded: scsi_wait_scan]
[ 167.728264] Pid: 1955, comm: udevd Not tainted 2.6.31.13 #3 HP Z600 Workstation
[ 167.728356] RIP: e030:[<ffffffff8124ad7c>] [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[ 167.728497] RSP: e02b:ffff880002095d98 EFLAGS: 00010246
[ 167.728569] RAX: 0000000000000000 RBX: ffff88000231ea00 RCX: 000000000080007c
[ 167.728644] RDX: ffff88001d3ab440 RSI: 00000000a3c9a148 RDI: 01000000000002c0
[ 167.728722] RBP: ffff88000244e000 R08: 0000000000000000 R09: 0000000000000000
[ 167.728797] R10: ffffffff8100eddf R11: 00000000a3c9a148 R12: 0000000000000000
[ 167.728873] R13: ffff88001d3ab680 R14: ffff88001d5b9000 R15: ffffffff81502b30
[ 167.728955] FS: 00007fdb8f1c5790(0000) GS:ffffc9000002e000(0000) knlGS:0000000000000000
[ 167.729051] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 167.729128] CR2: 0000000000000110 CR3: 000000001dcfc000 CR4: 0000000000002660
[ 167.729212] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 167.729303] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 167.729405] Process udevd (pid: 1955, threadinfo ffff880002094000, task ffff88001d5206c0)
[ 167.729525] Stack:
[ 167.729609] ffffffff814608a6 00000000a3c9a148 0000000000000002 ffff88000231ea40
[ 167.729835] <0> ffff88000244e000 ffff88000231ea50 ffff88000231ea50 ffffffff81283264
[ 167.730179] <0> 00007fdb00000010 00000000a3c9a148 ffff880002095f50 0000000000000000
[ 167.730566] Call Trace:
[ 167.730639] [<ffffffff81283264>] ? dev_uevent+0x1a2/0x207
[ 167.730714] [<ffffffff81284725>] ? show_uevent+0x92/0xfd
[ 167.730790] [<ffffffff81282e8b>] ? dev_attr_show+0x2e/0x6b
[ 167.730873] [<ffffffff810ce1b0>] ? get_zeroed_page+0x21/0x76
[ 167.730957] [<ffffffff811647fb>] ? sysfs_read_file+0xbb/0x156
[ 167.731051] [<ffffffff8100e301>] ? xen_force_evtchn_callback+0x1d/0x37
[ 167.731133] [<ffffffff8100edf2>] ? check_events+0x12/0x20
[ 167.731209] [<ffffffff81105f91>] ? vfs_read+0xb1/0x123
[ 167.731285] [<ffffffff8100eddf>] ? xen_restore_fl_direct_end+0x0/0x1
[ 167.731366] [<ffffffff81384013>] ? _spin_unlock_irqrestore+0x24/0x3e
[ 167.731445] [<ffffffff811060eb>] ? sys_read+0x55/0x90
[ 167.731527] [<ffffffff81013e42>] ? system_call_fastpath+0x16/0x1b
[ 167.731619] Code: c7 c6 33 1f 45 81 31 c0 48 89 ef e8 17 f0 f7 ff 85 c0 74 0f 4c 89 ef bd f4 ff ff ff e8 66 3e eb ff eb 2b 4c 89 ef e8 5c 3e eb ff <49> 8b 94 24 10 01 00 00 48 89 ef 31 c0 48 c7 c6 3d 1f 45 81 e8
[ 167.734689] RIP [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[ 167.734818] RSP <ffff880002095d98>
[ 167.734890] CR2: 0000000000000110
[ 167.734966] ---[ end trace d844f79248755c84 ]---
[ 167.927689] device vif1.0 entered promiscuous mode
[ 167.938021] eth0: port 2(vif1.0) entering forwarding state
[ 168.100416] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 168.316187] nf_conntrack version 0.5.0 (4044 buckets, 16176 max)
[ 168.316778] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
[ 168.316873] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
[ 168.316966] sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
[ 168.427110] physdev match: using --physdev-out in the OUTPUT, FORWARD and POSTROUTING chains for non-bridged traffic is not supported anymore.
[ 168.620471] tun: Universal TUN/TAP device driver, 1.6
[ 168.620550] tun: (C) 1999-2004 Max Krasnyansky <maxk@xxxxxxxxxxxx>
[ 168.668587] device tap1.0 entered promiscuous mode
[ 168.668694] eth0: port 3(tap1.0) entering forwarding state
[ 168.688096] alloc irq_desc for 825 on node 0
[ 168.688170] alloc kstat_irqs on node 0
But the machine comes up for the first time and everything seems to be
working fine.
I use udevd 151.
On 31.05.2010 16:27, Niels Dettenbach wrote:
On Monday, 31 May 2010, 16:13:14, Helmut Wieser wrote:
No, this doesn't help.
I'm currently trying to ditch the Debian kernel and compile one of
Jeremy's kernels with a config close to the one from Debian.
...you may try this:
1.) Make sure udev is <= 151 (I currently use 141).
2.) Set the following in your Xen kernel config (if not already set):
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_WIRELESS_EXT_SYSFS=y *
CONFIG_GPIO_SYSFS=y *
CONFIG_VIDEO_PVRUSB2_SYSFS=y
CONFIG_RTC_INTF_SYSFS=y
CONFIG_XEN_SYSFS=y
CONFIG_SYSFS=y
(* only if it applies to your hardware)
(Not sure if this is optimal, but it seems to work for me with 3.4.x and 4.x.)
=> reboot
3.) Run
mount -t sysfs sys /sys
=> If you still have a sysfs mounted, you might try to unmount it before this
step.
I have the line
sys /sys sysfs auto 0 0
in my fstab, which seems to help...
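To check whether step 3 took effect, the mount table can be inspected programmatically. This is a minimal sketch assuming a glibc system, where setmntent/getmntent/endmntent from <mntent.h> parse fstab-style files; the function name sysfs_mounted_at is hypothetical and made up for this example.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan an fstab/mtab-style file for a sysfs entry on mount_point.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
 */
int sysfs_mounted_at(const char *mtab_path, const char *mount_point)
{
	FILE *fp;
	struct mntent *ent;
	int found = 0;

	assert(mtab_path != NULL && mount_point != NULL);

	fp = setmntent(mtab_path, "r");
	if (!fp)
		return -1;

	while ((ent = getmntent(fp)) != NULL) {
		/* Match on filesystem type and mount directory only. */
		if (strcmp(ent->mnt_type, "sysfs") == 0 &&
		    strcmp(ent->mnt_dir, mount_point) == 0) {
			found = 1;
			break;
		}
	}

	endmntent(fp);
	return found;
}
```

Calling sysfs_mounted_at("/proc/mounts", "/sys") asks about the running system, while passing "/etc/fstab" checks whether a line like the one above is configured.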
Maybe this is largely unnecessary, but it seems to help me, so please don't
hit me... ;)
Another thing is that you might have leftover fragments of your (too new) udev
config from before downgrading.
I'm working with Gentoo, which compiles things the way I want, so I don't have
a full picture of what your distribution and package management handle well
and what they don't with your (udev) configs...
Maybe this helps,
Niels.
-
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users