Xen project Mailing List

Re: [Xen-devel] [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer()

To: "Dario Faggioli" <dario.faggioli@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Wed, 27 Sep 2017 02:20:13 -0600

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, TimDeegan <tim@xxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Wed, 27 Sep 2017 08:20:28 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 26.09.17 at 20:11, <dario.faggioli@xxxxxxxxxx> wrote: > On Tue, 2017-09-26 at 08:59 -0600, Jan Beulich wrote: >> > > > On 15.09.17 at 20:01, <dario.faggioli@xxxxxxxxxx> wrote: >> > @@ -326,7 +326,17 @@ void stop_timer(struct timer *timer) >> > return; >> > >> > if ( active_timer(timer) ) >> > - deactivate_timer(timer); >> > + { >> > + /* >> > + * If the timer is expired already, 'call' the softirq >> > handler to >> > + * execute it (it will leave it inactive after that). If >> > not, just >> > + * deactivate it. >> > + */ >> > + if ( timer->expires <= NOW() ) >> > + cpu_raise_softirq(timer->cpu, TIMER_SOFTIRQ); >> > + else >> > + deactivate_timer(timer); >> > + } >> >> Isn't it reasonable to expect current behavior, i.e. the timer not >> raising any events other than those already in progress? >> > Well, if we managed to get to here, with the timer that is both: > - active, > - with its expiry time in the past, > > it means there is an event that *is* in progress right now (i.e., we're > stopping the timer on the path that goes from the interrupt that raised > TIMER_SOFTIRQ, and the timer softirq handler). > > So, basically, I'm trying to achieve exactly what you call reasonable: > servicing an event which is in progress. :-) I'm afraid I don't understand - if the processing of the timer is already in progress, why would you need to raise another softirq? Wouldn't it suffice to simply skip deactivate_timer() then? This raising of the softirq is what made me imply that, perhaps due to some minimal skew e.g. resulting from rounding, you assume the interrupt did not fire yet, which I'd then call the timer event not being in progress (yet). > The alternative is that the event happened, but we somehow managed to > miss running the timer handler for it, and we realize this only now, > potentially a long time after the miss. This scenario, as far as I can > tell, can't happen, but if it could/does, I'd still say running the > handler late is better than not running it at all. Well, as said - what's better may very well depend on the particular use case. As long as it's not clearly written down what the intended behavior is, both behaviors are acceptable imo. In the end what I think I'm missing is a clear description of an actual case where the current behavior causes breakage (plus of course an explanation why the new behavior is unlikely to cause issues with existing users). >> Additionally I'm not convinced the changed actually does what you >> want: What if NOW() becomes equal to timer->expires immediately >> after you've managed to obtain its value, before execution makes >> it into deactivate_timer()? IOW you're shrinking a time window which >> (I think) you really mean to eliminate. >> > Well, I certainly don't like the window to be there, and closing it > would be ideal, IMO. > > However, in this patch, I'm addressing the specific situation of when > we are stopping a timer for which an interrupt has already fired, the > interrupt's ISR has already raised TIMER_SOFTIRQ, and yet we don't run > the handler. > > Yes, if an interrupt is about to be raised, and/or it arrives _while_ > we are inside this function (after the check), or already in > deactivate_timer(), we also end up not running the handler. > > I guess these can be seen as two aspects of the same problem, or as > conceptually different issues, but whatever you consider them, getting > rid of the former is something I consider an improvement. It may be improvement, yes, but if there is - no actual case breaking with the current code, I don't see the need for the change, - an actual case breaking with the current code, it'll still break with the change as the window was merely shrunk. >> Furthermore, taking both changes together: What if the same >> you try to address in stop_timer() happens in set_timer()? Wouldn't >> it then be only consistent to also require a timer even to fire for >> the >> old expiry value, before the new one is being set? >> > Yes, personally, I think that, whenever it is that we figure out that > we missed handling a timer interrupt, we should run it then. > > Unfortunately, for achieving this, e.g., in the set_timer() case, I > don't see much alternatives to call execute_timer() right in there. But > that would violate the current invariant that execute_timer() only run > from the TIMER_SOFTIRQ handler... which is probably doable, but is at > the same time unrelated to the problem I'm tackling here, and I would > rather do it in a dedicated series. Indeed forcing the handler to run in the set_timer() case is more difficult, yet as said - if there's a window to be concerned about, the concern should be regarding all such windows, or none of them imo. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.