[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer()

>>> On 26.09.17 at 20:11, <dario.faggioli@xxxxxxxxxx> wrote:
> On Tue, 2017-09-26 at 08:59 -0600, Jan Beulich wrote:
>> > > > On 15.09.17 at 20:01, <dario.faggioli@xxxxxxxxxx> wrote:
>> > @@ -326,7 +326,17 @@ void stop_timer(struct timer *timer)
>> >          return;
>> >  
>> >      if ( active_timer(timer) )
>> > -        deactivate_timer(timer);
>> > +    {
>> > +        /*
>> > +         * If the timer is expired already, 'call' the softirq
>> > handler to
>> > +         * execute it (it will leave it inactive after that). If
>> > not, just
>> > +         * deactivate it.
>> > +         */
>> > +        if ( timer->expires <= NOW() )
>> > +            cpu_raise_softirq(timer->cpu, TIMER_SOFTIRQ);
>> > +        else
>> > +            deactivate_timer(timer);
>> > +    }
>> Isn't it reasonable to expect current behavior, i.e. the timer not
>> raising any events other than those already in progress?
> Well, if we managed to get to here, with the timer that is both:
> - active,
> - with its expiry time in the past,
> it means there is an event that *is* in progress right now (i.e., we're
> stopping the timer on the path that goes from the interrupt that raised
> TIMER_SOFTIRQ, and the timer softirq handler).
> So, basically, I'm trying to achieve exactly what you call reasonable:
> servicing an event which is in progress. :-)

I'm afraid I don't understand - if the processing of the timer is
already in progress, why would you need to raise another
softirq? Wouldn't it suffice to simply skip deactivate_timer() then?
This raising of the softirq is what made me imply that, perhaps
due to some minimal skew e.g. resulting from rounding, you
assume the interrupt did not fire yet, which I'd then call the
timer event not being in progress (yet).

> The alternative is that the event happened, but we somehow managed to
> miss running the timer handler for it, and we realize this only now,
> potentially a long time after the miss. This scenario, as far as I can
> tell, can't happen, but if it could/does, I'd still say running the
> handler late is better than not running it at all.

Well, as said - what's better may very well depend on the particular
use case. As long as it's not clearly written down what the intended
behavior is, both behaviors are acceptable imo.

In the end what I think I'm missing is a clear description of an actual
case where the current behavior causes breakage (plus of course
an explanation why the new behavior is unlikely to cause issues with
existing users).

>> Additionally I'm not convinced the changed actually does what you
>> want: What if NOW() becomes equal to timer->expires immediately
>> after you've managed to obtain its value, before execution makes
>> it into deactivate_timer()? IOW you're shrinking a time window which
>> (I think) you really mean to eliminate.
> Well, I certainly don't like the window to be there, and closing it
> would be ideal, IMO.
> However, in this patch, I'm addressing the specific situation of when
> we are stopping a timer for which an interrupt has already fired, the
> interrupt's ISR has already raised TIMER_SOFTIRQ, and yet we don't run
> the handler.
> Yes, if an interrupt is about to be raised, and/or it arrives _while_
> we are inside this function (after the check), or already in
> deactivate_timer(), we also end up not running the handler.
> I guess these can be seen as two aspects of the same problem, or as
> conceptually different issues, but whatever you consider them, getting
> rid of the former is something I consider an improvement.

It may be improvement, yes, but if there is
- no actual case breaking with the current code, I don't see
  the need for the change,
- an actual case breaking with the current code, it'll still break
  with the change as the window was merely shrunk.

>> Furthermore, taking both changes together: What if the same
>> you try to address in stop_timer() happens in set_timer()? Wouldn't
>> it then be only consistent to also require a timer even to fire for
>> the
>> old expiry value, before the new one is being set?
> Yes, personally, I think that, whenever it is that we figure out that
> we missed handling a timer interrupt, we should run it then.
> Unfortunately, for achieving this, e.g., in the set_timer() case, I
> don't see much alternatives to call execute_timer() right in there. But
> that would violate the current invariant that execute_timer() only run
> from the TIMER_SOFTIRQ handler... which is probably doable, but is at
> the same time unrelated to the problem I'm tackling here, and I would
> rather do it in a dedicated series.

Indeed forcing the handler to run in the set_timer() case is more
difficult, yet as said - if there's a window to be concerned about,
the concern should be regarding all such windows, or none of
them imo.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.