|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen-netback: Check for hotplug-status existence before watching
On 17/05/2021 22:43, Marek Marczykowski-Górecki wrote: On Tue, May 11, 2021 at 12:46:38PM +0000, Durrant, Paul wrote:-----Original Message----- From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> Sent: 11 May 2021 11:45 To: Durrant, Paul <pdurrant@xxxxxxxxxxxx> Cc: Michael Brown <mbrown@xxxxxxxxxxxxxxxx>; paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; wei.liu@xxxxxxxxxx Subject: RE: [EXTERNAL] [PATCH] xen-netback: Check for hotplug-status existence before watching On Tue, May 11, 2021 at 12:40:54PM +0200, Marek Marczykowski-Górecki wrote:On Tue, May 11, 2021 at 07:06:55AM +0000, Durrant, Paul wrote:-----Original Message----- From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> Sent: 10 May 2021 20:43 To: Michael Brown <mbrown@xxxxxxxxxxxxxxxx>; paul@xxxxxxx Cc: paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; wei.liu@xxxxxxxxxx;Durrant,Paul <pdurrant@xxxxxxxxxxxx> Subject: RE: [EXTERNAL] [PATCH] xen-netback: Check for hotplug-status existence before watching On Mon, May 10, 2021 at 08:06:55PM +0100, Michael Brown wrote:If you have a suggested patch, I'm happy to test that it doesn't reintroduce the regression bug that was fixed by this commit. Why is that missing? We're going behind the back of the toolstack to do the unbind and bind so why should we expect it to re-execute a hotplug script? In this case, I'm not sure what is the proper way. If I restart xendriverdomain service (I do run the backend in domU), it properly executes hotplug script and the backend interface gets properly configured. But it doesn't do it on its own. It seems to be related to device "state" in xenstore. The specific part of the libxl is backend_watch_callback(): https://github.com/xen-project/xen/blob/master/tools/libs/light/libxl_device.c#L1664 ddev = search_for_device(dguest, dev); if (ddev == NULL && state == XenbusStateClosed) { /* * Spurious state change, device has already been disconnected * or never attached. */ goto skip; } else if (ddev == NULL) { rc = add_device(egc, nested_ao, dguest, dev); if (rc > 0) free_ao = true; } else if (state == XenbusStateClosed && online == 0) { rc = remove_device(egc, nested_ao, dguest, ddev); if (rc > 0) free_ao = true; check_and_maybe_remove_guest(gc, ddomain, dguest); } In short: if device gets XenbusStateInitWait for the first time (ddev == NULL case), it goes to add_device() which executes the hotplug script and stores the device. Then, if device goes to XenbusStateClosed + online==0 state, then it executes hotplug script again (with "offline" parameter) and forgets the device. If you unbind the driver, the device stays in XenbusStateConnected state (in xenstore), and after you bind it again, it goes to XenbusStateInitWait. It don't think it goes through XenbusStateClosed, and online stays at 1 too, so libxl doesn't execute the hotplug script again. This is pretty key. The frontend should not notice an unbind/bind i.e. there should be no evidence of it happening by examining states in xenstore (from the guest side). Paul Some solution could be to add an extra case at the end, like "ddev != NULL && state == XenbusStateInitWait && hotplug-status != connected". And make sure xl devd won't call the same hotplug script multiple times for the same device _at the same time_ (I'm not sure about the async machinery here). But even if xl devd (aka xendriverdomain service) gets "fixes" to execute hotplug script in that case, I don't think it would work in backend in dom0 case - there, I think nothing watches already configured vif interfaces (there is no xl devd daemon in dom0, and xl background process watches only domain death and cdrom eject events). I'm adding toolstack maintainers, maybe they'll have some idea... In any case, the issue is not calling the hotplug script, responsible for configuring newly created vif interface. Not kernel waiting for it. So, I think both commits should still be reverted.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |