[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] xen-netfront: delay gARP until backend switches to Connected
On Wed, 2011-07-13 at 17:11 +0100, Laszlo Ersek wrote: > On 07/13/11 15:29, Konrad Rzeszutek Wilk wrote: > > On Wed, Jul 13, 2011 at 01:44:47PM +0200, Laszlo Ersek wrote: > >> In addition to backporting 43223efd9bfd to the RHEL-5 host side, we needed > >> the following in the RHEL-6 guest, in order to fix the network outage after > >> live migration. I also tested a Fedora-15 guest (without the patch), and > >> the > > > > Laszlo, > > > > This description is .. well, pointless for upstream patches. Just succinctly > > describe the problem, how to reproduce it, and what this patch does. > > I avoided writing up the commit message myself because in my first posting [1] > I quoted two paragraphs from Ian Campbell [2]. I intended those two cited > paragraphs verbatim as the commit message (along with my short remarks), > because they describe the problem exactly. Please don't expect/require that maintainers trawl around and construct a commit message for you, always propose the full text you would like to see committed in each patch posting. > Anyway: > > After a guest is live migrated, the xen-netfront driver emits a gratuitous ARP > message, so that networking hardware on the target host's subnet can take > notice, and public routing to the guest is re-established. However, if the > packet appears on the backend interface before the backend is added to the > target host's bridge, the packet is lost, and the migrated guest's peers > become > unable to talk to the guest. > > A sufficient two-parts condition to prevent the above is: > > (1) ensure that the backend only moves to Connected xenbus state after its > hotplug scripts completed, ie. the netback interface got added to the bridge; > and > > (2) ensure the frontend only queues the gARP when it sees the backend move to > Connected. > > These two together provide complete ordering. Sub-condition (1) is satisfied > by pvops commit 43223efd9bfd. > > In general, the full condition is sufficient, not necessary, because, > according > to [2], live migration has been working for a long time without satisfying > sub-condition (2). However, after 43223efd9bfd was backported to the RHEL-5 > host to ensure (1), (2) still proved necessary in the RHEL-6 guest. This patch > intends to provide (2) for upstream. I expect that 43223efd9bfd just reduces the window (enough to prevent it occuring most of the time) but doesn't actually close it. This change seems good to me. > [1] http://lists.xensource.com/archives/html/xen-devel/2011-07/msg00327.html > [2] http://lists.xensource.com/archives/html/xen-devel/2011-06/msg01969.html > > Signed-off-by: Laszlo Ersek <lersek@xxxxxxxxxx> Reviewed-by: Ian Campbell <ian.campbell@xxxxxxxxxx> > --- > drivers/net/xen-netfront.c | 4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index d29365a..f033656 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -1646,7 +1646,6 @@ static void netback_changed(struct xenbus_device *dev, > case XenbusStateInitialised: > case XenbusStateReconfiguring: > case XenbusStateReconfigured: > - case XenbusStateConnected: > case XenbusStateUnknown: > case XenbusStateClosed: > break; > @@ -1657,6 +1656,9 @@ static void netback_changed(struct xenbus_device *dev, > if (xennet_connect(netdev) != 0) > break; > xenbus_switch_state(dev, XenbusStateConnected); > + break; > + > + case XenbusStateConnected: > netif_notify_peers(netdev); > break; > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |