[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-users] Watchdog and live migration
>>> On 16.03.12 at 15:58, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote: > On Fri, 2012-03-16 at 14:34 +0000, Jan Beulich wrote: >> >>> On 16.03.12 at 13:32, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote: >> > I think this is a bug, moving to xen-devel@ and CCing the drivers author >> > (Hi Jan). >> > >> > On Fri, 2012-03-16 at 12:29 +0000, Wouter de Geus wrote: >> >> * Ian Campbell <Ian.Campbell@xxxxxxxxxx> [2012-03-16 10:55:34 +0000]: >> >> >> >> > > In the domU a /dev/watchdog exists, but I never touch it myself. >> >> > >> >> > Does your distro start a watchdogd for you? >> >> > >> >> > Even if it did then it should obviously continue to poke the watchdog >> >> > even after a migration, but it might provide some hints. >> >> >> >> Nope, it doesn't. >> >> As an experiemtn I tried sending a 'V' to /dev/watchdog just after >> >> migration, which seems to be the magic byte to stop the watchdog >> >> according to the linux kernel docs. >> >> >> >> (according to Documentation/watchdog/watchdog-api.txt) - >> >> "If a driver supports "Magic Close", the driver will not disable the >> >> watchdog unless a specific magic character 'V' has been sent to >> >> /dev/watchdog just before closing the file." >> >> >> >> Surprisingly, the domU seemed to stay alive after doing this. >> >> So it would appear that somehow the watchdog is triggered after migration. >> >> No idea why though. >> > >> > drivers/watchdog/xen_wdt.c:xen_wdt_resume() unconditionally calls >> > xen_wdt_start(). Shouldn't it only do this if the wdt is active? >> >> Could you give the patch below a try? >> >> Jan >> >> --- a/drivers/watchdog/xen_wdt.c >> +++ b/drivers/watchdog/xen_wdt.c >> @@ -297,11 +297,19 @@ static void xen_wdt_shutdown(struct plat >> >> static int xen_wdt_suspend(struct platform_device *dev, pm_message_t state) >> { >> - return xen_wdt_stop(); >> + typeof(wdt.id) id = wdt.id; > > typeof here is a bit odd. But I want to match the original field's type. >> + int rc = xen_wdt_stop(); >> + >> + wdt.id = id; >> + >> + return rc; >> } >> >> static int xen_wdt_resume(struct platform_device *dev) >> { >> + if (!wdt.id) >> + return 0; > > Can't you check is_active instead and avoid having to play tricks in > xen_wdt_suspend to preserve a non-0 wdt.id when the watchdog is active? I first thought of this too, but is_active doesn't represent whether a watchdog is actually engaged - it merely says whether the watchdog device is currently open (but watchdog setup itself may have failed). Looking at the code for this again, I see another problem though: is_active gets cleared in xen_wdt_release() even when a release was not expected (similar to how iTCO_wdt, pcwd_pci, or pcwd_usb behave, but imo buggy nevertheless), or when xen_wdt_stop() failed. Jan >> + wdt.id = 0; >> return xen_wdt_start(); >> } >> >> >> _______________________________________________ Xen-users mailing list Xen-users@xxxxxxxxxxxxx http://lists.xen.org/xen-users
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |