[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer()

On Tue, 2017-09-26 at 08:59 -0600, Jan Beulich wrote:
> > > > On 15.09.17 at 20:01, <dario.faggioli@xxxxxxxxxx> wrote:
> > --- a/xen/common/timer.c
> > +++ b/xen/common/timer.c
> > @@ -217,7 +217,7 @@ static inline void activate_timer(struct timer
> > *timer)
> >      timer->status = TIMER_STATUS_invalid;
> >      list_del(&timer->inactive);
> >  
> > -    if ( add_entry(timer) )
> > +    if ( add_entry(timer) || timer->expires <= NOW() )
> >          cpu_raise_softirq(timer->cpu, TIMER_SOFTIRQ);
> >  }
> I'm not convinced of this change - it's unrelated to what title and
> description say, 
You're right. This should either be mentioned, or dropped (or live in
another patch).

> > @@ -326,7 +326,17 @@ void stop_timer(struct timer *timer)
> >          return;
> >  
> >      if ( active_timer(timer) )
> > -        deactivate_timer(timer);
> > +    {
> > +        /*
> > +         * If the timer is expired already, 'call' the softirq
> > handler to
> > +         * execute it (it will leave it inactive after that). If
> > not, just
> > +         * deactivate it.
> > +         */
> > +        if ( timer->expires <= NOW() )
> > +            cpu_raise_softirq(timer->cpu, TIMER_SOFTIRQ);
> > +        else
> > +            deactivate_timer(timer);
> > +    }
> Isn't it reasonable to expect current behavior, i.e. the timer not
> raising any events other than those already in progress?
Well, if we managed to get to here, with the timer that is both:
- active,
- with its expiry time in the past,

it means there is an event that *is* in progress right now (i.e., we're
stopping the timer on the path that goes from the interrupt that raised
TIMER_SOFTIRQ, and the timer softirq handler).

So, basically, I'm trying to achieve exactly what you call reasonable:
servicing an event which is in progress. :-)

The alternative is that the event happened, but we somehow managed to
miss running the timer handler for it, and we realize this only now,
potentially a long time after the miss. This scenario, as far as I can
tell, can't happen, but if it could/does, I'd still say running the
handler late is better than not running it at all.

> Additionally I'm not convinced the changed actually does what you
> want: What if NOW() becomes equal to timer->expires immediately
> after you've managed to obtain its value, before execution makes
> it into deactivate_timer()? IOW you're shrinking a time window which
> (I think) you really mean to eliminate.
Well, I certainly don't like the window to be there, and closing it
would be ideal, IMO.

However, in this patch, I'm addressing the specific situation of when
we are stopping a timer for which an interrupt has already fired, the
interrupt's ISR has already raised TIMER_SOFTIRQ, and yet we don't run
the handler.

Yes, if an interrupt is about to be raised, and/or it arrives _while_
we are inside this function (after the check), or already in
deactivate_timer(), we also end up not running the handler.

I guess these can be seen as two aspects of the same problem, or as
conceptually different issues, but whatever you consider them, getting
rid of the former is something I consider an improvement.

I certainly can try to state the problem better, and describe the
situation more clearly in the changelog.

> Furthermore, taking both changes together: What if the same
> you try to address in stop_timer() happens in set_timer()? Wouldn't
> it then be only consistent to also require a timer even to fire for
> the
> old expiry value, before the new one is being set?
Yes, personally, I think that, whenever it is that we figure out that
we missed handling a timer interrupt, we should run it then.

Unfortunately, for achieving this, e.g., in the set_timer() case, I
don't see much alternatives to call execute_timer() right in there. But
that would violate the current invariant that execute_timer() only run
from the TIMER_SOFTIRQ handler... which is probably doable, but is at
the same time unrelated to the problem I'm tackling here, and I would
rather do it in a dedicated series.

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.