[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1/2] xen-blkfront: don't call talk_to_blkback when already connected to blkback
On 06/07/2016 11:25 PM, Konrad Rzeszutek Wilk wrote: > On Wed, Jun 01, 2016 at 01:49:23PM +0800, Bob Liu wrote: >> >> On 06/01/2016 04:33 AM, Konrad Rzeszutek Wilk wrote: >>> On Tue, May 31, 2016 at 04:59:16PM +0800, Bob Liu wrote: >>>> Sometimes blkfont may receive twice blkback_changed() notification after >>>> migration, then talk_to_blkback() will be called twice too and confused >>>> xen-blkback. >>> >>> Could you enlighten the patch description by having some form of >>> state transition here? I am curious how you got the frontend >>> to get in XenbusStateConnected (via blkif_recover right) and then >>> the backend triggering the update once more? >>> >>> Or is just a simple race - the backend moves from XenbusStateConnected-> >>> XenbusStateConnected - which retriggers the frontend to hit in >>> blkback_changed the XenbusStateConnected state and go in there? >>> (That would be in conenct_ring changing the state). But I don't >>> see how the frontend_changed code get there as we have: >>> >>> 770 /* >>> 771 * Ensure we connect even when two watches fire in >>> 772 * close succession and we miss the intermediate value >>> 773 * of frontend_state. >>> 774 */ >>> 775 if (dev->state == XenbusStateConnected) >>> 776 break; >>> 777 >>> >>> ? >>> >>> Now what about 'blkfront_connect' being called on the second time? >>> >>> Ah, info->connected is probably by then in BLKIF_STATE_CONNECTED >>> (as blkif_recover changed) and we just reread the size of the disk. >>> >>> Is that how about the flow goes? >> >> blkfront blkback >> blkfront_resume() >> > talk_to_blkback() >> > Set blkfront to XenbusStateInitialised >> Front changed() >> > Connect() >> > Set blkback to >> XenbusStateConnected >> >> blkback_changed() >> > Skip talk_to_blkback() >> because frontstate == XenbusStateInitialised >> > blkfront_connect() >> > Set blkfront to XenbusStateConnected >> >> >> ------------------------------------------------------------------ >> But sometimes blkfront receives >> blkback_changed() event more than once! > > I think I know why. The udev scripts that get invoked when when > we attach a disk are a bit custom. As such I think they just > revalidate the size leading to this. > > And this 'poke-at-XenbusStateConnected' state multiple times > is allowed. It is used to signal disk changes (or just to revalidate). > Hence it does not matter why really - we need to deal with this. > > I modified your patch a bit and are testing it: > Looks much better, thank you very much! Bob > From e49dc9fc65eda4923b41d903ac51a7ddee182bcd Mon Sep 17 00:00:00 2001 > From: Bob Liu <bob.liu@xxxxxxxxxx> > Date: Tue, 7 Jun 2016 10:43:15 -0400 > Subject: [PATCH] xen-blkfront: don't call talk_to_blkback when already > connected to blkback > > Sometimes blkfront may twice receive blkback_changed() notification > (XenbusStateConnected) after migration, which will cause > talk_to_blkback() to be called twice too and confuse xen-blkback. > > The flow is as follow: > blkfront blkback > blkfront_resume() > > talk_to_blkback() > > Set blkfront to XenbusStateInitialised > front changed() > > Connect() > > Set blkback to > XenbusStateConnected > > blkback_changed() > > Skip talk_to_blkback() > because frontstate == XenbusStateInitialised > > blkfront_connect() > > Set blkfront to XenbusStateConnected > > ----- > And here we get another XenbusStateConnected notification leading > to: > ----- > blkback_changed() > > because now frontstate != XenbusStateInitialised > talk_to_blkback() is also called again > > blkfront state changed from > XenbusStateConnected to XenbusStateInitialised > (Which is not correct!) > > front_changed(): > > Do nothing because blkback > already in > XenbusStateConnected > > Now blkback is in XenbusStateConnected but blkfront is still > in XenbusStateInitialised - leading to no disks. > > Poking of the XenbusStateConnected state is allowed (to deal with > block disk change) and has to be dealt with. The most likely > cause of this bug are custom udev scripts hooking up the disks > and then validating the size. > > Signed-off-by: Bob Liu <bob.liu@xxxxxxxxxx> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > --- > drivers/block/xen-blkfront.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c > index b4b8fbd..7765ad5 100644 > --- a/drivers/block/xen-blkfront.c > +++ b/drivers/block/xen-blkfront.c > @@ -2484,10 +2484,23 @@ static void blkback_changed(struct xenbus_device *dev, > break; > > case XenbusStateConnected: > - if (dev->state != XenbusStateInitialised) { > + /* > + * talk_to_blkback sets state to XenbusStateInitialised > + * and blkfront_connect sets it to XenbusStateConnected > + * (if connection went OK). > + * > + * If the backend (or toolstack) decides to poke at backend > + * state (and re-trigger the watch by setting the state > repeatedly > + * to XenbusStateConnected (4)) we need to deal with this. > + * This is allowed as this is used to communicate to the guest > + * that the size of disk has changed! > + */ > + if ((dev->state != XenbusStateInitialised) && > + (dev->state != XenbusStateConnected)) { > if (talk_to_blkback(dev, info)) > break; > } > + > blkfront_connect(info); > break; > > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |