Re: [Xen-devel] netback BUG_ON when using copy_skb=1

>>> On 26.10.13 at 10:32, jerry <jerry.lilijun@xxxxxxxxxx> wrote:
> With copy_skb disabled, I found the reason why the vif net-device isn't
> released after shutting down the VM.
> Suppose that VM1 (vif1.0) sends packets to VM2 (vif2.0) through the
> virtual switch.
>
> 1) VM2 runs Windows 2003 and had previously shut down for some
>    unexpected reason.
>    After being created, VM2 stopped its boot process at the prompt window
>    named "Shutdown Event Tracker".
>    It is waiting for the user to enter a reason why the computer shut
>    down unexpectedly.
>
> 2) VM2 already has vif2.0 created. Then I added a new vif net-device
>    using virsh commands.
>    The new vif2.1 was not completely created, with no interrupts, but its
>    state is running and its TX queues are started by default.
>    The function connect() in xenbus.c hasn't been called for vif2.1. The
>    related information in xenstore is as follows:
>    linux-szRoyS:/ # xenstore-ls -f | grep 2 | grep state
>    /local/domain/0/device-model/2/state = "running"
>    /local/domain/0/backend/vbd/2/51712/state = "4"
>    /local/domain/0/backend/vbd/2/51760/state = "4"
>    /local/domain/0/backend/vif/2/0/state = "4"
>    /local/domain/0/backend/vif/2/1/state = "2"
>    /local/domain/0/backend/console/2/0/state = "1"
>    /local/domain/2/control/uvp/vm_state = "running"
>    /local/domain/2/device/vbd/51712/state = "4"
>    /local/domain/2/device/vbd/51760/state = "4"
>    /local/domain/2/device/vif/0/state = "4"
>    /local/domain/2/device/vif/1/state = "1"
>
> 3) The KOBJ_ONLINE message was generated in the function
>    backend_create_netif(), called from netback_probe().
>    This event triggers the network script named "vif-bridge", which
>    adds vif2.1 to the virtual switch.
>    Packets from vif1.0 (VM1) are then forwarded or flooded to vif2.1 by
>    the virtual switch.
>    vif2.1 dropped these packets because it is not netif_schedulable() in
>    the function netif_be_start_xmit().
>
> 4) After setting vif2.1 down and then up again, the TX queue can't be
>    started in net_open() because the carrier is off.
>    So its qdisc became fifo_qdic and the TX queue state is stopped.
>    In this case, the packets are held in the qdisc queue and can't be
>    dequeued in the function dequeue_skb(), because vif2.1's TX queues
>    are stopped.
>
> 5) If VM1 is destroyed, the packets from vif1.0 can't be released and
>    vif1.0 can't be disconnected.
>    vif1.0 will remain unreleased until vif2.1 is set down.
>
> This problem occurs mainly because vif2.1 was not created successfully
> and got into a strange state: running, but with its TX queue stopped.
> The function backend_create_netif() is called in two places,
> netback_probe() and frontend_changed(). I think we can remove the
> backend_create_netif() call in netback_probe().
> That way we can make sure the vif net-device is created completely, after
> the front-end has changed to XenbusStateConnected.
>
> The patch is as follows:
> --- drivers/xen/netback/xenbus.c.old	2013-10-26 16:23:07.000000000 +0800
> +++ drivers/xen/netback/xenbus.c	2013-10-26 16:23:31.000000000 +0800
> @@ -156,9 +156,6 @@
>  	if (err)
>  		goto fail;
>  
> -	/* This kicks hotplug scripts, so do it immediately. */
> -	backend_create_netif(be);
> -
>  	return 0;
>  
>  abort_transaction:
>
> Do you have some ideas?

No, not really. It would be helpful if this could be matched up to the
behavior (and eventual changes thereto) of the upstream driver.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
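For reading the xenstore listing quoted above: the numeric "state" values are XenbusState codes. The following is a minimal sketch, not part of the proposed patch, that maps those codes to their names as defined by enum xenbus_state in Xen's public io/xenbus.h. The helper names xenbus_state_name() and vif_is_connected() are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: name of a XenbusState code.
 * Values follow enum xenbus_state in Xen's public io/xenbus.h. */
static const char *xenbus_state_name(int state)
{
    switch (state) {
    case 0: return "Unknown";
    case 1: return "Initialising";
    case 2: return "InitWait";
    case 3: return "Initialised";
    case 4: return "Connected";
    case 5: return "Closing";
    case 6: return "Closed";
    case 7: return "Reconfiguring";
    case 8: return "Reconfigured";
    default: return "invalid";
    }
}

/* Hypothetical helper: a device is fully set up only when both the
 * backend and the frontend have reached Connected (4). */
static int vif_is_connected(int backend_state, int frontend_state)
{
    return backend_state == 4 && frontend_state == 4;
}

/* Print one line of diagnosis for a backend/frontend state pair. */
static void report_vif(const char *name, int backend_state, int frontend_state)
{
    printf("%s: backend=%s frontend=%s -> %s\n", name,
           xenbus_state_name(backend_state),
           xenbus_state_name(frontend_state),
           vif_is_connected(backend_state, frontend_state)
               ? "connected" : "not connected");
}
```

Applied to the listing, vif2.0 (backend 4, frontend 4) is Connected on both ends, while vif2.1 (backend 2, frontend 1) is still in InitWait/Initialising, which matches the report that connect() was never called for it.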