[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/watchdog: Identify which domain watchdog fired
On 13.02.2025 17:46, Andrew Cooper wrote: > When a watchdog fires, the domain is crashed and can't dump any state. > > Xen allows a domain to have two separate watchdogs. Therefore, for a > domain running multiple watchdogs (e.g. one based around network, one > for disk), it is important for diagnostics to know which watchdog > fired. > > As the printk() is in a timer callback, this is a bit awkward to > arrange, but there are 12 spare bits in the bottom of the domain > pointer owing to its alignment. > > Reuse these bits to encode the watchdog id too, so the one which fired > is identified when the domain is crashed. > > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > --- > CC: Anthony PERARD <anthony.perard@xxxxxxxxxx> > CC: Michal Orzel <michal.orzel@xxxxxxx> > CC: Jan Beulich <jbeulich@xxxxxxxx> > CC: Julien Grall <julien@xxxxxxx> > CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> > CC: Stefano Stabellini <sstabellini@xxxxxxxxxx> You'll eventually need a scheduler maintainer's ack, yet you didn't Cc any of them. > --- a/xen/common/sched/core.c > +++ b/xen/common/sched/core.c > @@ -1534,12 +1534,17 @@ long vcpu_yield(void) > > static void cf_check domain_watchdog_timeout(void *data) > { > - struct domain *d = data; > + /* > + * The data parameter encodes the watchdog id in the low bits of > + * the domain pointer. > + */ > + struct domain *d = _p((unsigned long)data & PAGE_MASK); > + unsigned int id = (unsigned long)data & ~PAGE_MASK; > > if ( d->is_shutting_down || d->is_dying ) > return; > > - printk("Watchdog timer fired for domain %u\n", d->domain_id); > + printk("Watchdog timer %u fired for %pd\n", id, d); And apriori knowledge will be required to associate the number with whichever watchdog it was (network or disk in your example)? (No question that logging the number is better than not doing so.) > @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d) > d->watchdog_inuse_map = 0; > > for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ ) > - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0); > + { > + void *data = d; > + > + BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE); > + > + /* > + * For the timer callback parameter, encode the watchdog id in > + * the low bits of the domain pointer. > + */ > + init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, > 0); > + } This way we'll be promising to ourselves that we're never going to alter the allocation mechanism of struct domain instances, always requiring them to have at least page alignment. If someone wanted to change that, they'll have a hard time spotting the logic here. Sadly I have no good suggestion towards improving the situation. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |