RE: [PATCH] xen-netback: Check for hotplug-status existence before watching
> -----Original Message-----
> From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: 10 May 2021 20:43
> To: Michael Brown <mbrown@xxxxxxxxxxxxxxxx>; paul@xxxxxxx
> Cc: paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> wei.liu@xxxxxxxxxx; Durrant, Paul <pdurrant@xxxxxxxxxxxx>
> Subject: RE: [EXTERNAL] [PATCH] xen-netback: Check for hotplug-status
> existence before watching
>
> On Mon, May 10, 2021 at 08:06:55PM +0100, Michael Brown wrote:
> > If you have a suggested patch, I'm happy to test that it doesn't reintroduce
> > the regression bug that was fixed by this commit.
>
> Actually, I've just tested by simply reloading the xen-netfront module. It
> seems in this case, the hotplug script is not re-executed. In fact, I
> think it should not be re-executed at all, since the vif interface
> remains in place (it just gets the NO-CARRIER flag).
>
> This brings a question: why remove hotplug-status in the first place?
> The interface remains correctly configured by the hotplug script after
> all. From the commit message:
>
>     xen-netback: remove 'hotplug-status' once it has served its purpose
>
>     Removing the 'hotplug-status' node in netback_remove() is wrong; the
>     script may not have completed. Only remove the node once the watch has
>     fired and has been unregistered.
>
> I think the intention was to remove the 'hotplug-status' node _later_ in
> case of quickly adding and removing the interface. Is that right, Paul?

The removal was done to allow unbind/bind to function correctly. IIRC, before
the original patch, doing a bind would stall forever waiting for the hotplug
status to change, which would never happen.

> In that case, letting hotplug_status_changed() remove the entry won't
> work, because the watch was unregistered a few lines earlier in
> netback_remove(). And keeping the watch is not an option, because the
> whole backend_info struct is going to be freed already.
>
> If my guess about the original reason for the change is right, I think
> it should be fixed at the hotplug script level - it should check if the
> device is still there before writing the 'hotplug-status' node.
> I'm not sure if doing it race-free is possible from a shell script (I think it
> requires doing xenstore read _and_ write in a single transaction). But
> in the worst case, the aftermath of losing the race is leaving a stray
> 'hotplug-status' xenstore node - not ideal, but also less harmful than
> failing to bring up an interface. At this point, the toolstack could clean
> it up later, perhaps while setting up that interface again (if it gets
> re-connected)?
>
> Anyway, perhaps the best thing to do now is to revert both commits, and
> think of an alternative solution for the original issue? That of course
> assumes I guessed correctly why it was done in the first place...
>

Simply reverting everything would likely break the ability to do unbind and
bind (which is useful e.g. to allow updating the netback module whilst guests
are still running), so I don't think that's an option.

  Paul
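
The point raised in the quoted text, that a race-free "check the device is still there, then write 'hotplug-status'" needs the read and the write in one transaction, can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyStore, Transaction, report_hotplug_status() and the before_commit hook are hypothetical names invented for illustration, the path "backend/vif/1/0" is a made-up example, and none of this is the xenstore API or the actual hotplug script. It models a store whose commit fails if anything changed since the transaction began, and a writer that retries on such a conflict.

```python
class ToyStore:
    """In-memory stand-in for a transactional store (NOT the xenstore API)."""

    def __init__(self):
        self.data = {}
        self.version = 0  # bumped on every committed change

    def begin(self):
        return Transaction(self)

    def write(self, path, value):
        # Non-transactional write, used here to set up or disturb the store.
        self.data[path] = value
        self.version += 1

    def remove(self, path):
        # Non-transactional removal of a node and everything beneath it.
        doomed = [k for k in self.data if k == path or k.startswith(path + "/")]
        for k in doomed:
            del self.data[k]
        if doomed:
            self.version += 1


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start_version = store.version
        self.pending = {}  # writes are buffered until commit

    def exists(self, path):
        return path in self.pending or path in self.store.data

    def write(self, path, value):
        self.pending[path] = value

    def commit(self):
        # Conflict: something was committed since this transaction began.
        if self.store.version != self.start_version:
            return False
        if self.pending:
            self.store.data.update(self.pending)
            self.store.version += 1
        return True


def report_hotplug_status(store, backend_path, status,
                          max_retries=5, before_commit=None):
    """Write hotplug-status only if the backend node still exists.

    before_commit is a hypothetical hook that lets a caller simulate a
    concurrent change between the check and the commit.
    """
    for _ in range(max_retries):
        txn = store.begin()
        if not txn.exists(backend_path):
            return False  # device is gone: leave no stray node behind
        txn.write(backend_path + "/hotplug-status", status)
        if before_commit is not None:
            before_commit()
        if txn.commit():
            return True
        # Commit conflicted; loop to re-check existence and try again.
    return False
```

In this model, if the backend node is removed between the existence check and the commit, the commit fails, the retry sees that the node is gone, and nothing is written. Separate read and write steps without a transaction would have no such protection, which is the stray-node case described in the quoted text.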